Re: GET on a graph store URI / Graph Store HTTP Protocol from Sandro Hawke on 2011-12-19 (public-rdf-dawg@w3.org from October to December 2011)

From: Sandro Hawke <sandro@w3.org>
Date: Mon, 19 Dec 2011 08:43:12 -0500
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: public-rdf-dawg@w3.org
Message-ID: <1324302192.6252.1550.camel@waldron>
On Mon, 2011-12-19 at 11:24 +0000, Andy Seaborne wrote:
> 
> On 17/12/11 05:44, Gregory Williams wrote:
> > On Dec 17, 2011, at 12:05 AM, Sandro Hawke wrote:
> >
> >>> 2/ Could we allow GET on a graph store URI to return quads?
> >>
> >> I think so, yes.
> >>
> >>
> >> There are two very different things people might reasonably want to do
> >> here: get a dump of the dataset, and get a dump of the URLs used to
> >> access the graphs in the dataset.   I think at some point both need to
> >> be supported, but it looks like we have that with SD.
> >>
> >> As I understand it, given a SPARQL endpoint with address E:
> >>
> >>    GET E
> >>    ... returns the Service Description (SD)
> 
> This conflates service description and graph store.
> 
> Describes a service => E is a service (endpoint).
> 
> >>    Query SD for { ?S sd:endpoint<E>; sd:defaultDataset ?D }
> >>
> >>    GET ?D
> >>    ... should return a TriG/N-Quads serialization of the given dataset;
> >
> > We agree that this is at odds with the GSHP text as it currently stands, right?
> 
> It is certainly a different URI to <graphStore>.  A URI denotes a 
> resource.  <foo?queryString> is a resource that is the result of the 
> question in the query string.  There is nothing special about 
> ?queryString in RFC 3986 (see section 3.4) in terms of resource 
> identification.
> 
> So
> 
> GET http://server/graphStore
> 
> either
>    => quads
> or
>    => a description of the graph store (e.g. contents.)
> 
> GET http://server/graphStore?serviceDescription
>    => service description of the graph store.

Note that in the current draft we have, I think:  

1.   The endpoint address, like http://example.com/people/sparql.  This
is normal SPARQL.

  - GET on this gives you the SD
  - Query parameters are added to this for SPARQL queries

2.   The "graphstore" address, which is also described as the "default
dataset" URI in the SD, like http://example.com/people.   This seems
fairly reasonable to me, since graphstore : dataset :: g-box : graph.

   - POST on this is defined as creating a new resource at a different
address, a behavior I think everyone agrees is good.  Interestingly,
it's the one thing clients can do using the graphstore HTTP protocol
that they can't do with SPARQL Query + Update.
   - GET and PUT on this are undefined in the current drafts, but would
logically be used with a multigraph format to dump/replace the dataset.

3.   The graphstore service address. There is no way to determine this
from any of the other URLs.  ("the service URL will need to be known a
priori")   This might look like http://example.com/people/service.  It
is used just for indirect graph reference, to give working URLs to
graphs in the dataset which don't have URLs served by this server.  This
syntax is used in all the examples in the spec (somewhat confusingly);
it gives URLs like this:

   .../people/service?graph=http%3A//www.example.com/other/graph

  - As I suggested in another email ([1], in which I mistakenly confused
the service address with the dataset address), I think we should get rid
of this concept, and instead have something in the SD which tells you
the prefix to which the URL-quoted graph name is appended to make its
RESTful URL, something like:

    { ?srv sd:endpoint $ENDPOINTADDRESS; sd:graphHostingPrefix ?pfx }
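
As a rough illustration of the prefix idea, the sketch below appends a
percent-encoded graph name to a hosting prefix. The function name and the
prefix value are hypothetical, and sd:graphHostingPrefix is only the
property proposed above, not part of any published vocabulary.

```python
from urllib.parse import quote


def restful_graph_url(prefix: str, graph_name: str) -> str:
    """Append the percent-encoded graph name to the hosting prefix.

    `prefix` stands for the value a client would read from the proposed
    (hypothetical) sd:graphHostingPrefix property.
    """
    # safe="" encodes every reserved character, including "/" and ":".
    return prefix + quote(graph_name, safe="")


# Assumed prefix, for illustration only.
example = restful_graph_url(
    "http://example.com/people/graphs/",
    "http://www.example.com/other/graph",
)
```

Note that this encodes the slashes in the graph name as well, which is
stricter than the http%3A//... form shown in the example URL above.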

Alternatively, I think the best solution is to say the endpoint address
*is* the service address, in this sense.   I might say that like this: 

        The 'graph' URL parameter is used with the SPARQL endpoint
        address to construct a URL for direct (RESTful) access to the
        graphs in the endpoint's default dataset.   Endpoints MUST
        recognize this keyword and SHOULD implement the Graphstore HTTP
        Protocol using these URLs; if they do not, they MUST return 501
        Not Implemented on requests to these URLs.
        
That last bit allows clients to safely try it.    I think this text would
have to go in one of the other documents, since it applies to folks NOT
implementing the Graphstore HTTP Protocol.
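
A client following the proposed text might look roughly like the sketch
below: it builds the direct URL from the endpoint address and the 'graph'
parameter, then treats 501 Not Implemented as "direct access unavailable".
The function names and the Accept header are assumptions made for
illustration; nothing here is taken from the drafts.

```python
import urllib.error
import urllib.request
from urllib.parse import urlencode


def direct_graph_url(endpoint: str, graph_name: str) -> str:
    """Build the direct-access URL: endpoint address plus ?graph=<name>."""
    return endpoint + "?" + urlencode({"graph": graph_name})


def direct_access_available(status: int) -> bool:
    """Under the proposed wording, 501 means the endpoint recognises the
    'graph' keyword but does not implement the Graphstore HTTP Protocol."""
    return status != 501


def try_direct_get(url: str):
    """GET the graph; return the body, or None if the server answers 501."""
    # The Accept value is an assumed choice of RDF serialization.
    request = urllib.request.Request(url, headers={"Accept": "text/turtle"})
    try:
        with urllib.request.urlopen(request) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if not direct_access_available(err.code):
            return None  # caller can fall back to SPARQL Query/Update
        raise
```

For the endpoint address used earlier, direct_graph_url gives
http://example.com/people/sparql?graph=http%3A%2F%2Fwww.example.com%2Fother%2Fgraph.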

Telecon to talk about this stuff in 78 minutes.   Time for some
breakfast.

    -- Sandro

 
> >
> >>      (but doesn't with the current GSHP)
> >>    PUT ?D<somedata>
> >>    ... should replace the dataset
> >
> > I'm not sure we've ever really nailed down whether this should work. ?D here is the resource for the default dataset, which I wouldn't assume is necessarily the same thing as the graph store. For that matter, ?D might not even be a dereferenceable IRI.
> >
> >>         Query SD for { ?D sd:namedGraph/sd:name ?N } and ?N comes back
> >>         with each of the graph URLs.
> >>
> >> I don't suppose we can require SPARQL 1.1 endpoints to answer queries
> >> about their own SD, maybe using FROM their own endpoint address...?
> >> Otherwise, what, you need your own SPARQL server before you can start to
> >> poke at someone else's?
> >
> > We had early discussions about this while deciding how a service description was to be made available. I think it would be nice for systems to be able to do this (you get it for free on systems that dereference FROM IRIs), but I wouldn't want to mandate it. Also, I don't think you need a full SPARQL implementation to benefit from a service description. A simple triple store would allow you to access the data you've discussed above with simple triple pattern matching and a couple of loops.
> >
> > .greg
> 
>  Andy
> 
> 
>
Received on Monday, 19 December 2011 13:43:29 UTC