RE: SPARQL best practice for egov?

Richard,
Re: "This option is not feasible IMO because it requires a major revamp
of the organization's web publishing infrastructure."

I have issues with this position from 2 perspectives:

1 - Technical: We have implemented this in our open source EKB
(modeldriven.org) on top of existing triple stores and SPARQL engines.
The developer is quite talented, but I don't think it was a huge
challenge.  Redirecting queries is not very hard.

2 - Community: We are proposing a supposedly simple approach to a global
data grid.  Such a proposal will only be accepted if it works well, is
very simple to use and does not have major usability holes (such as the
poor relationship between a URI and a query point).  The user
perspective must take priority over legacy implementation patterns and
the sunk investment of vendors.  This technology is in its infancy and
must not be crippled by such a tactical perspective, lest it fail - and
currently failure is an option.  Complex and expensive algorithms to
follow a link are unacceptable.

-Cory

-----Original Message-----
From: Richard Cyganiak [mailto:richard@cyganiak.de] 
Sent: Monday, April 26, 2010 12:29 PM
To: Cory Casanave
Cc: public-egov-ig IG
Subject: Re: SPARQL best practice for egov?

Hi Cory,

On 23 Apr 2010, at 09:16, Cory Casanave wrote:
> We had some discussion about the relationship between a RDF URL and a
> SPARQL endpoint as well as other resources such as graph metadata.  The
> conclusion seemed to be that there are various vocabularies where a
> triple in the graph could point to the endpoint that points to metadata.
> The issue with this is that you would then have to get the entire graph
> to get this one triple - which kind of missed the point if you have a
> large dataset that you want to query instead of download.
>
> I can imagine two conventions that could help solve this:
>
> 1) That every resource should respond to a SPARQL endpoint.  This would
> then allow you to query that one resource directly to subset the data
> and/or to get the triple that points to metadata.

This option is not feasible IMO because it requires a major revamp of  
the organization's web publishing infrastructure. Typically, a SPARQL  
endpoint, if provided at all, is located on a different server and is  
architecturally separate from the rest of the web server. What you are  
asking for requires completely new server software, and I'm not aware  
of a single product that currently implements this.

> 2) That a standard manipulation is done on a URI to get metadata about
> resources, which would include the query point.  For example:
> http://www.example.com/rdf/people.rdf#cory could have metadata at
> http://www.example.com/rdf/metadata.rdf.  There are some existing
> solutions that use this approach.

The voiD [1] solution is to have a triple:

<http://www.example.com/rdf/people.rdf> void:inDataset
    <http://www.example.com/rdf/metadata.rdf#Dataset> .

Resolving <http://www.example.com/rdf/metadata.rdf> would yield a  
description of the dataset, perhaps including a triple:

<http://www.example.com/rdf/metadata.rdf#Dataset> void:sparqlEndpoint
    <http://www.example.com/sparql> .

This exact problem is one of the use cases we had in mind when  
creating voiD. Implementation is reasonably straightforward: it  
requires publication of the <metadata.rdf> (or <void.ttl>) file, and  
one extra link in each RDF file.
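
As a rough illustration of the two-hop lookup described above, here is a
minimal Python sketch. It is not a real client: the WEB dictionary stands
in for HTTP fetching and RDF parsing, and the names fetch_triples,
objects and find_sparql_endpoint are hypothetical helpers made up for
this example. Only the voiD namespace and the two properties come from
the voiD vocabulary.

```python
# Sketch only: a dict stands in for dereferencing URLs and parsing RDF.
VOID = "http://rdfs.org/ns/void#"

# Hypothetical "web": document URL -> triples found in that document.
WEB = {
    "http://www.example.com/rdf/people.rdf": [
        ("http://www.example.com/rdf/people.rdf",
         VOID + "inDataset",
         "http://www.example.com/rdf/metadata.rdf#Dataset"),
    ],
    "http://www.example.com/rdf/metadata.rdf": [
        ("http://www.example.com/rdf/metadata.rdf#Dataset",
         VOID + "sparqlEndpoint",
         "http://www.example.com/sparql"),
    ],
}


def fetch_triples(url):
    """Return the triples of the document behind url (fragment dropped)."""
    doc_url = url.split("#", 1)[0]
    return WEB.get(doc_url, [])


def objects(triples, subject, predicate):
    """Return all objects of triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def find_sparql_endpoint(resource_url):
    """Follow void:inDataset, then void:sparqlEndpoint; None if not found."""
    doc_url = resource_url.split("#", 1)[0]
    datasets = objects(fetch_triples(doc_url), doc_url, VOID + "inDataset")
    for dataset in datasets:
        endpoints = objects(fetch_triples(dataset), dataset,
                            VOID + "sparqlEndpoint")
        if endpoints:
            return endpoints[0]
    return None


print(find_sparql_endpoint("http://www.example.com/rdf/people.rdf#cory"))
```

A real client would replace fetch_triples with an HTTP GET plus an RDF
parser; the lookup logic itself would stay about this small.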

Best,
Richard

[1] http://rdfs.org/ns/void-guide


>
>
>
> Can we set a "best practice" for open government data?  My preference
> would be the first.  Thoughts?
>
>
>
> -Cory
>

Received on Tuesday, 27 April 2010 19:31:11 UTC