- From: Sandro Hawke <sandro@w3.org>
- Date: Tue, 14 Feb 2012 23:05:31 -0500
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- Cc: public-rdf-dawg@w3.org
On Tue, 2012-02-14 at 22:19 +0000, Andy Seaborne wrote:
>
> On 14/02/12 15:56, Sandro Hawke wrote:
> > Looking more closely, it's not 5.8 that I want back, it's this sentence:
> >
> >     Within a service description document for an implementation of
> >     this protocol, the object of an sd:defaultDataset statement is
> >     understood to be the identifier of the Graph Store
>
> Where do you expect to read the service description from?
>
> Could you write out a concrete example, with URIs and actions, so I can
> understand the process you are envisaging that is behind your comments?
> I'm quite confused as to the information flow you are looking for.

Imagine a national government wants data feeds about water quality from
each of its regional governments.  Each region is responsible for running
a SPARQL endpoint serving the data, broken up into a different graph for
each km^2 and month.  The default graph is to hold metadata about each of
those other graphs, saying when it's valid and what area it covers.

Now, the national government collects the endpoint addresses, one per
region, looking something like this:

   http://northeast-region.example.gov/wqdat/sparql
   http://northwest-region.example.gov/~smith/fed/sparql
   http://northcentral.example.gov/water/sparql

Following normal SPARQL practice, some of the regions pick graph names
which are not actually working URLs which can be used to fetch the
associated data.  Instead, one region uses tag URIs, one uses UUIDs, one
uses the URI of the most prominent geographic feature in the block, and
another uses a homegrown URI scheme which produces URIs like this:

   block:34.2234-34.2547,81.3331,80.9830:2010-01-01

This all works.  Given this list of SPARQL endpoints, the national
government can write various clients which query each region's data as
necessary.  They can also publish this list of endpoint addresses, and
let the general public query as they will.

But there are some things we'd like to be able to do that we can't:

 * Alice wants to download all the graphs concerning a certain area and
   time-range, crossing several regions, without knowing SPARQL.  She
   just wants a REST interface for GET'ing the default graphs and then
   the other data graphs.

 * Bob is doing analysis for which he needs to provide provenance.  He
   wants a single URI for each of the graphs he's using, so he can put
   it into the "source" field for that part of the analysis.

 * Charlie is on a data-quality crusade.  He's getting people to double
   check the data against other private data sources and their own
   experience.  He's built a system for flagging questionable blocks of
   data, and even submitting corrections (patches).  For this system, he
   needs some way to refer to each graph which has been flagged for
   correction.

I think the simplest solution would be to just let everyone know they
can always use:

   ${endpoint_addr}?graph=${graph_name}
or
   ${endpoint_addr}?default

as a URI for the indicated graph.  I'd hope most endpoints would
implement at least HTTP GET on those addresses, if not the whole GSP.
Even just having this convention -- with no code changes -- would
address Bob and Charlie's problems.  And Alice will know what to try, in
case GSP happens to be implemented.

Alternatively, if for some reason the SPARQL WG is not okay with using
the endpoint address this way, we could use Service Description, as was
in GSP until the most recent change [1].  With this, to get the URI of
the default graph for the first region, Alice would:

1. GET http://northeast-region.example.gov/wqdat/sparql and get back a
   SPARQL service description that includes triples like this:

      @prefix sd: <http://www.w3.org/ns/sparql-service-description#> .
      <> a sd:Service;
         sd:defaultDataset <http://northeast-region.example.gov/wqdat/dataset> .

2. Given this, and the text that used to be in GSP, plus what's still
   there, Alice knows the URL of the default graph for the northeast
   region is:

      http://northeast-region.example.gov/wqdat/dataset?default

   She can do a GET on this to get the contents of the default graph,
   which has something like this:

      <urn:uuid:eee02beb-eca7-4cb7-839c-9fc6206caae0>
         geo:lon0 34.2234;
         geo:lon1 34.2547;
         geo:lat1 81.3331;
         geo:lat0 80.9830;
         dc:temporal "2010-01-01"^^xs:date .

3. Now she can construct a URL from which she can fetch the data for
   that region and that time, like this:

      http://northeast-region.example.gov/wqdat/dataset?graph=urn:uuid:eee02beb-eca7-4cb7-839c-9fc6206caae0

And that's about it.  Repeat 3 for each block in the region; repeat 1-2
for each region.

     -- Sandro

[1] https://cvs.w3.org/Team/~checkout~/WWW/2009/sparql/docs/http-rdf-update/Overview.html?rev=1.81;content-type=text%2Fhtml#http-post
    and scroll down toward the end of that POST section.
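
For illustration only, the "?default" / "?graph=" convention described above could be sketched in Python roughly like this. The function names are made up for this sketch and are not part of GSP or any library. Percent-encoding the graph name is also an assumption: the examples in this message write the graph name unencoded, which works for simple URIs, but encoding is the safer choice for names containing characters such as "&" or "#".

```python
from urllib.parse import quote


def default_graph_url(base: str) -> str:
    """Hypothetical helper: URL for the default graph under `base`.

    `base` is either the endpoint address or the sd:defaultDataset
    value, depending on which of the two proposals is in use.
    """
    return f"{base}?default"


def named_graph_url(base: str, graph_name: str) -> str:
    """Hypothetical helper: URL for the named graph `graph_name`.

    The graph name is percent-encoded in full (assumption), so that
    tag URIs, UUID URNs, or the homegrown block: scheme cannot break
    the query string.
    """
    return f"{base}?graph={quote(graph_name, safe='')}"


if __name__ == "__main__":
    base = "http://northeast-region.example.gov/wqdat/dataset"
    print(default_graph_url(base))
    print(named_graph_url(base, "urn:uuid:eee02beb-eca7-4cb7-839c-9fc6206caae0"))
```

Nothing here contacts a server; it only builds the strings that Bob and Charlie could record as graph identifiers and that Alice could try a GET on.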
Received on Wednesday, 15 February 2012 04:05:39 UTC