Re: Questions and comments on SPARQL 1.1 Graph Store HTTP Protocol draft from Sandro Hawke on 2012-01-06 (public-rdf-dawg@w3.org from January to March 2012)

From: Sandro Hawke <sandro@w3.org>
Date: Fri, 06 Jan 2012 13:57:25 -0500
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: public-rdf-dawg@w3.org
Message-ID: <1325876245.2589.393.camel@waldron>
Meta comment -- is this something we want clarified in Graph Store
Protocol before it goes to LC?   There's a potentially useful behavior
here, and if we don't specify it, then folks may locally implement it in
various ways, and we won't be able to specify it in the future without
causing problems.   So, yeah, I think we should say the base URI for
POST-to-Create is the URI where the content ends up being published.
(Note that this is the behavior you get if you just copy the POSTed bytes
to the new location.  No mangling is necessary.)

Responding to your specific points....

On Sat, 2011-12-24 at 16:58 +0000, Andy Seaborne wrote:
> 
> On 24/12/11 13:14, Sandro Hawke wrote:
> > On Thu, 2011-12-22 at 16:44 +0000, Andy Seaborne wrote:
> >> Re-directed to the working group list.
> >>
> >> On 22/12/11 16:01, Sandro Hawke wrote:
> >>>>> One convention we have adopted is to use the null relative URL (<>   in
> >>>>> Turtle, or "" in RDF/XML). I'd like to know whether there is a standard
> >>>>> way of doing this. Maybe the spec should indicate one.
> >>>>>
> >>>>
> >>>>
> >>>> This would require a base URI resolution mechanism that would (in the end) simply resolve this to the request-URI (per base URI resolution rules). The WG has decided to not support a base URI resolution mechanism for this specification. So, without a discovery mechanism the only way to perform this "append" behavior is to know the URI of the Graph Store before hand and to POST to it directly.
> >>>
> >>> So the suggested workaround is to POST to the graphstore to get a new
> >>> URL allocated, then PUT your graph to that new address.   And your
> >>> understanding is that the technique Arnaud is suggesting -- defining the
> >>> base URI as the URI that is allocated to hold the content -- could be
> >>> standardized in the future?
> >>
> >> Sandro,
> >>
> >> By your reading, does RFC 3986 "5.1.  Establishing a Base URI" and RFC
> >> 2616 defining a request URI apply here?
> >>
> >> If so, doesn't that say what the base URI is?
> >> If not, why don't these RFCs apply?
> >
> > Sorry, I know you asked this earlier, and in the crush of events I lost
> > track of it.
> >
> > As I read RFC 3986 section 5.1, it doesn't constrain the base URI for
> > the content of a POST.  So it falls back to the outer-ring, in that
> > diagram, of being application dependent.  That seems right to me -- in
> > this case, the application is whichever POST semantics the given
> > resource implements.  For POSTing to the graphstore itself (but not the
> > elements inside it), I believe we can specify the protocol for POST, and
> > saying how it handles relative URIs in RDF syntaxes seems to me like a
> > reasonable part of that.
> 
> [[
>   (5.1.2) Base URI of the encapsulating entity
>           (message, representation, or none)
> ]]
> the encapsulating entity is the HTTP request.

I don't think 5.1.2 applies here.  As far as I can tell, that text is
talking about a syntactic nesting, as in message headers.

> Even then:
> [[
>   (5.1.3) URI used to retrieve the entity
> ]]
> and "retrieve" can be read as a bias to GET. GET and PUT are a pair.

I think it's more than a bias.  Note the "if":

        if a URI was used to retrieve the representation, that URI shall
        be considered the base URI.

The spec is silent on the non-retrieval case.

I'd agree that for PUT, the same base-determination algorithm would
logically apply, but I don't see anything in the spec saying that.  
Nor that it applies to POST.

> There is nothing operation specific mentioned except a bias to 
> "retrieve" so if it applies to PUT, it applies to POST.

Actually, I could see it applying to the content that comes *back* from
a POST (so this isn't about HTTP verbs, exactly), but I don't see why it
would apply to the content sent.

The spec here matters most if it happens to have influenced code.  Do
you think there is a significant code base which assumes the base for
POST'd content is the URL the POST is sent to?   If so, do you think
that code base is relevant to the RDF case?   (I'd rather not have RDF
do something different from what XML or HTML folks do, but doing
something different might work as a compromise, if necessary.)    

The one place that occurs to me for this being widely deployed is in
ATOMPUB (RFC 5023).   Section 9.2 "Creating Resources with POST" doesn't
mention URI resolution.   This might be something we could discuss with
some ATOMPUB folks, if we really need compatibility.  If they all or
mostly implement the relative-to-where-it-ends-up semantics, would you
be comfortable with us specifying that?

> Test case:
> 
> ----------------------
> PUT http://example/foo/bar
> Content-type: text/turtle;charset=utf-8
> 
> @prefix : <#> .
> 
> <> :p 123 .
> ----------------------
> 
> What triple is that?

> GET http://example/foo/bar
> ==>
> ----------------------
> Content-type: text/turtle;charset=utf-8
> 
> @prefix : <#> .
> 
> <> :p 123 .
> ----------------------
> What triple is that?

For GET, I think it's clear the base URI has to be the last URI used
to do the retrieval (or the Content-Location, if there is one, I guess).
For PUT I think we should define it to be that.   So the triple, if I'm
doing the mechanics right, is:

<http://example/foo/bar> <http://example/foo/bar#p> 123.
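
To double-check those mechanics, here is a minimal sketch. It assumes
Python's standard urllib.parse.urljoin is a fair stand-in for RFC 3986
reference resolution, and that the base is the URI the PUT (or GET) was
addressed to. The variable names are only illustrative.

```python
from urllib.parse import urljoin

# Assumption: the base URI is the URI the PUT (or GET) was addressed to.
base = "http://example/foo/bar"

# <> in Turtle is the empty relative reference.
subject = urljoin(base, "")

# With "@prefix : <#> .", :p is the same IRI as the relative reference <#p>,
# so it is resolved directly here.
predicate = urljoin(base, "#p")

print(f"<{subject}> <{predicate}> 123 .")
```

Under those assumptions both references resolve against the full
request URI, fragment included for the predicate.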
      
This might be the most interesting test case:

POST http://example/foo
(where this is a POST-to-Create resource, aka a Collection, aka a Graph
Store)
-------------
Content-type: text/turtle;charset=utf-8
 
@prefix : <#> .
 
<> :p 123 .
------------

And the content ends up posted at http://example/foo/bar.  I would like
that to be the same triple, too, and I don't *think* the specs or common
practice contradict us defining it that way.
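
To make the difference concrete, here is a rough sketch comparing the
two candidate bases for the POSTed content: the URI the POST was sent
to, and the URI where the content ends up.  It assumes Python's
urllib.parse.urljoin approximates RFC 3986 resolution; resolve_triple
is a hypothetical helper, not anything from a spec or library.

```python
from urllib.parse import urljoin

def resolve_triple(base):
    """Resolve the subject <> and predicate <#p> against the given base URI."""
    return (urljoin(base, ""), urljoin(base, "#p"))

request_uri = "http://example/foo"       # where the POST was sent
location_uri = "http://example/foo/bar"  # where the content ends up

by_request = resolve_triple(request_uri)
by_location = resolve_triple(location_uri)

print("base = request URI: ", by_request)
print("base = new location:", by_location)
```

Only the second choice gives the same subject and predicate as the
PUT/GET case, which is the behavior being argued for here.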

 
> This is not what we might like for the answer but what the specs 
> say. I agree it would be nice if container-add could have a base of the 
> "Location" but I don't think that is what the RFCs say.
> 
> We can test this by looking at other (non-RDF) deployed usages.
> 
> minor - so how to refer to the container?
> "Location" does not have  <..../container/newthing_1>; it can be any URI.
> 
> > I don't see anything in 2616 that directly bears on this; perhaps I'm
> > missing something.
> 
> It defines the request URI.

I'm not really seeing that in RFC 3986 Section 5.1.     We have a
succession of 3 ways specified to find a base, none of which apply...

  1.  there's no base in the content
  2.  there's no encapsulating entity
  3.  there's no retrieval of the entity

... so we're left with the 4th, being application dependent.
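
As a sketch of that fall-through (my reading of the precedence in
Section 5.1, not normative text): each source is tried in order, and
the application-dependent default is used only when the first three
yield nothing.  The function name, parameter names, and the example
default are hypothetical.

```python
from typing import Optional

def establish_base(embedded: Optional[str],
                   encapsulating: Optional[str],
                   retrieval: Optional[str],
                   application_default: str) -> str:
    """Return the first base URI available, in precedence order."""
    for candidate in (embedded, encapsulating, retrieval):
        if candidate is not None:
            return candidate
    # None of the first three sources applied: application dependent.
    return application_default

# POST-to-create as read above: no base in the content, no encapsulating
# entity, no retrieval.  The default here is the proposed behavior (the
# URI where the content ends up), which is an assumption, not spec text.
base = establish_base(None, None, None, "http://example/foo/bar")
print(base)
```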

> > The biggest stretch, I think, is how to talk about what is being
> > serialized in a POST.   Strictly speaking, it's not an RDF graph, since
> > it has relative URIs.  Logically, it's some kind of "relative RDF
> > graph", but that's not in the current RDF specs.   My sense is that
> > while it's not perfect, it would be acceptable for whatever spec says
> > this relative-URIs-in-POSTing is okay to also define this variation on
> > RDF graphs.
> 
> This is the part that would not worry me - the representation uses 
> relative URIs, but they are just a shorthand.  There is a base.

Ah, yes, makes sense.  Good.

> > An alternative would be to define some well-known URI to use as the base
> > (eg "http://www.w3.org/2011/move-during-post/"), and the POST-handler
> > can rewrite those absolute URIs to new URIs.  But to me that seems less
> > attractive than just using relative URIs.
> 
> Agreed - in fact for some of the use cases beyond just store-retrieve 
> data, then the notion of a process applied to the data makes sense to 
> me.  It is not necessarily at odds with REST.  POST to a container could 
> be seen as a process action.

Yeah.  I notice that ATOMPUB bit is clear about how the server is free
to change the POSTed data.   I'd like the default case, where the
server doesn't even change any of the bytes in the POSTed entity, to be
correct, for simple POST-to-create servers.

   -- Sandro

>  Andy
> 
>
Received on Friday, 6 January 2012 18:57:36 UTC