Re: [ISSUE-30] Suggestions for HTTP protocol updates from Steve Harris on 2009-06-03 (public-rdf-dawg@w3.org from April to June 2009)

From: Steve Harris <steve.harris@garlik.com>
Date: Wed, 3 Jun 2009 10:37:25 +0100
To: Chimezie Ogbuji <ogbujic@ccf.org>
Cc: "Kjetil Kjernsmo" <Kjetil.Kjernsmo@computas.com>, public-rdf-dawg@w3.org
Message-Id: <CF3D0938-2646-4C30-8260-ED3355670A8E@garlik.com>

On 2 Jun 2009, at 22:29, Chimezie Ogbuji wrote:

> On 6/2/09 4:34 PM, "Kjetil Kjernsmo" <Kjetil.Kjernsmo@computas.com>  
> wrote:
>> All,
>> For those who didn't make it to today's telecon, we opened a new
>> issue today:
>> http://www.w3.org/2009/sparql/track/issues/30
>> about using HTTP for graph updates.
>>
>> I'll try to clarify some ideas I've had around using the HTTP  
>> protocol and
>> REST principles, and hopefully also summarize the options that we  
>> have, but
>> not authoritatively.
>>
>> I think having a protocol is important because it allows clients to  
>> know
>> nothing more than HTTP. It is also important because it is really  
>> simple to
>> configure a server to support just GET, PUT, POST and DELETE  
>> operations, and I
>> think a server that does just that is as important as a server that  
>> runs a
>> SPARQL endpoint in the background.
>
> +1 In addition this begs the question about 'compliance' levels.   
> I.e., is
> such a server considered a 'SPARQL protocol service'?
>
>> So, the simplest protocol is one where the HTTP Request-URI is the  
>> graph URI.
>
> Or perhaps a portion of the request URI is the graph URI?  Consider  
> Dave
> Beckett's triplr service, which takes GET requests against the  
> following
> URI:
>
> http://triplr.org/turtle/www.kanzaki.com/works/

It also accepts the somewhat more explicit http://triplr.org/turtle/http://www.kanzaki.com/works/

That is how we currently address graphs at Garlik.
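
A minimal sketch of how a server might recover the graph URI from either
path form. The function name, the "/turtle/" prefix default, and the
choice to assume http:// when no scheme is given are illustrative
assumptions, not a description of triplr's or Garlik's implementation.

```python
def graph_uri_from_path(path: str, prefix: str = "/turtle/") -> str:
    """Map a request path to the graph URI embedded after `prefix`.

    Hypothetical helper: accepts both the short form
    (/turtle/www.kanzaki.com/works/) and the explicit form
    (/turtle/http://www.kanzaki.com/works/).
    """
    if not path.startswith(prefix):
        raise ValueError("path does not start with " + prefix)
    rest = path[len(prefix):]
    # Explicit form: the second protocol specifier is already present.
    if rest.startswith(("http://", "https://")):
        return rest
    # Short form: assume plain http, as in the triplr example.
    return "http://" + rest


if __name__ == "__main__":
    print(graph_uri_from_path("/turtle/www.kanzaki.com/works/"))
    print(graph_uri_from_path("/turtle/http://www.kanzaki.com/works/"))
```

Both calls yield the same graph URI. Note that some front-end servers
collapse "//" in paths, which the explicit form would need to survive.
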

> And interprets it as a request to extract RDF from
> http://www.kanzaki.com/works/
>
> At least, this gets around the challenge of not having the necessary
> 'control' over web space to ensure that all the graph names in the  
> dataset
> resolve to endpoint locations (the *same* endpoint location?) that  
> know how
> to interpret PUT application/rdf+xml, etc. requests in this way.
>
> I can imagine a perma-thread regarding how 'RESTful' such an  
> approach is.  I
> think, generally, we should be careful about what we mean when we  
> say REST
> principles because there are some REST 'principles' that have more  
> to do
> with style than anything else (PUT vs. POST comes to mind as the  
> favorite
> topic for perpetual redux threads).
>
> For me, I think of following RESTful principles as simply meaning we  
> should
> design web interfaces that behave in harmony with the HTTP  
> specification
> such that any client (or developer of a client) can reasonably  
> determine how
> to use the interface (mostly) with guidance from HTTP in addition to  
> having
> other characteristics such as statelessness and cacheability.
>
>> If a client PUTs triples to a graph URI, a subsequent GET will  
>> return the same
>> triples. A POST will add the triples in the message to the graph at  
>> the graph
>> URI and DELETE will remove the triples from the graph at the graph  
>> URI.
>>
>> Now, this has the obvious shortcoming that it cannot be used in a  
>> case where
>> you don't control the graph URI, so you can't make it identical to  
>> the
>> Request-URI. That's where the proposal to use URIs like
>> http://endpoint/rest/?graph=http://foo.com comes in, as this can  
>> serve this
>> important case.
>
> Yes, this is another alternative.  For me, the difference between this
> approach (using request parameters appended to the request URI) and  
> the one
> above is a wash and comes down to style preference.

Indeed. After some superficial research, it appears that the ? (query
string) form of PUT is not widely used, so on that basis I prefer
something that looks like a non-CGI URI, with or without the second
protocol specifier.
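
To make the two styles concrete, here is a small sketch of how a client
might build each request URI for the same graph. The function names and
the keep_scheme flag are hypothetical. Percent-encoding the graph URI in
the query form is a choice made here; the example quoted above shows it
unencoded.

```python
from urllib.parse import urlencode


def query_style_url(endpoint: str, graph_uri: str) -> str:
    """CGI-like form: the graph URI travels as a ?graph= parameter."""
    return endpoint + "?" + urlencode({"graph": graph_uri})


def path_style_url(base: str, graph_uri: str, keep_scheme: bool = True) -> str:
    """Non-CGI form: the graph URI is appended to the base path.

    keep_scheme=False drops the second protocol specifier.
    """
    if not keep_scheme:
        graph_uri = graph_uri.split("://", 1)[-1]
    return base + graph_uri


if __name__ == "__main__":
    print(query_style_url("http://endpoint/rest/", "http://foo.com"))
    print(path_style_url("http://endpoint.example/data/", "http://foo.com/g"))
    print(path_style_url("http://endpoint.example/data/", "http://foo.com/g",
                         keep_scheme=False))
```
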

>> However, I also note that this could be achieved by using the  
>> language, and is
>> as simple as INSERT DATA INTO <http://foo.com> { <triples> <go>  
>> <here> . } So,
>> whether it is worth it must be discussed. Also, it isn't quite  
>> clear to me if
>> this is entirely RESTful.
>
> You *might* find Mark Baker's opinions on this matter insightful:
>
> http://www.markbaker.ca/blog/2006/11/the-trouble-with-binding/

Also, it doesn't meet the same need.

Being able to do
   $ curl -T data.rdf http://endpoint.example/data/data.rdf
or equivalent is important, to us at least.

I can imagine command lines that will mutate some RDF file into an
INSERT DATA INTO type expression, but it's going to be ugly and verbose.
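
For comparison, a rough sketch of what such a conversion might look like.
It assumes the input is already N-Triples (whose lines can be pasted into
the update body as-is); an RDF/XML file like data.rdf would first need a
real parser. The function name is hypothetical, and the INSERT DATA INTO
syntax is the draft form quoted above.

```python
def insert_data_update(ntriples: str, graph_uri: str) -> str:
    """Wrap N-Triples text in a draft-style INSERT DATA INTO request.

    Assumption: each non-blank, non-comment input line is one complete
    N-Triples statement ending in " ."; no validation is attempted.
    """
    lines = [ln.strip() for ln in ntriples.splitlines()]
    triples = [ln for ln in lines if ln and not ln.startswith("#")]
    body = "\n".join("  " + t for t in triples)
    return f"INSERT DATA INTO <{graph_uri}> {{\n{body}\n}}"


if __name__ == "__main__":
    data = (
        "<http://a.example/s> <http://a.example/p> <http://a.example/o> .\n"
        "# a comment line\n"
        '<http://a.example/s> <http://a.example/p> "literal" .\n'
    )
    print(insert_data_update(data, "http://foo.com"))
```

Even in this easy case the client has to read, filter, and re-wrap the
file before sending it, whereas the PUT form sends the file untouched.
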

> [[[
> It does respect Web architecture, but only because it’s read-only.  
> As soon
> as you need to add mutation support, or indeed any other operation  
> on the
> same resource, the process fails and what results is not
> Web-friendly. This
> is because “operation on the same resource” doesn’t work if the  
> operation is
> part of the resource name; if the operation changes, the name  
> changes, and
> therefore the resource-identified changes.
> ]]]
>
> It might be useful to tease out this criticism a bit.  However, I  
> have a
> hard time understanding the explicit *cost* (besides the blanket  
> statement
> of saying it is not 'RESTful') with doing things this way.  He  
> mentions that
> this mostly has to do with resource identification, which (in our  
> case) is
> inherently problematic since there is already some built-in  
> disconnect in
> the fact that RDF dataset Graph URIs are really just names (and thus  
> don't
> seem to have de-referenceability in mind).  Perhaps this is what  
> Andy (or
> was it SteveH?) was hinting at when he suggested we may need a more  
> robust
> definition of what an RDF dataset is at least with respect to web
> architecture.  How can named RDF graphs in a dataset be referenced  
> outside
> of a SPARQL query?

Quite, this is the issue we would address by defining some solution to  
ISSUE-30.

- Steve

-- 
Steve Harris
Garlik Limited, 2 Sheen Road, Richmond, TW9 1AE, UK
+44(0)20 8973 2465  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
Received on Wednesday, 3 June 2009 09:37:59 UTC