Re: [ISSUE-32] Implications of updates on protocol, regarding HTTP methods from Paul Gearon on 2009-07-29 (public-rdf-dawg@w3.org from July to September 2009)

From: Paul Gearon <gearon@ieee.org>
Date: Wed, 29 Jul 2009 14:48:56 -0500
To: Steve Harris <steve.harris@garlik.com>
Cc: public-rdf-dawg@w3.org
Message-ID: <a25ac1f0907291248q7f4a9da4j9a79ab6ec78f2063@mail.gmail.com>
On Wed, Jul 29, 2009 at 2:31 PM, Steve Harris <steve.harris@garlik.com> wrote:
> There is a subtlety here that's not been accounted for, PUT/DELETE and
> POST/GET are not operations at the same level. PUT and DELETE have fairly
> specific semantics whereas GET (with CGI arguments) and POST are transports
> for operations with their own semantics.
>
> Specifically, it is not normal to attach arguments to PUT or DELETE requests
> - the URI given after the verb, plus the headers are supposed to contain all
> the information. Eg. a DELETE to http://example.com/foo has a fairly
> specific meaning, whereas a POST to http://example.com/foo has no specific
> meaning on its own.

I did mean to allude to this when I mentioned the URIs on a PUT being
associated with a specific resource, but I should have been more
explicit and also mentioned the corresponding behavior with DELETE.
This is particularly apparent with REST, but is less clear in
"traditional" HTTP where web developers would choose somewhat
arbitrarily between PUT and POST.

> I personally feel that it would be a serious mistake to encourage
> SPARQL/Query requests to be sent as POST requests, it confuses caches (a
> selling point of SPARQL in enterprise environments) and gives a false
> impression of the scope of a SPARQL/Query operation.

This refers back to the original SPARQL/Protocol, which has already
defined this behavior. It may be poor practice, but it's in the spec,
and I doubt there is any desire to change this. Going by the checklist
on when to use GET vs POST [1], I'd like to see SPARQL/Query all done
as GET, and SPARQL/Update done as POST. But I don't get to change
SPARQL/Query, so I'm trying to work around it.
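
As a rough sketch of that split (queries as GET, updates as POST), the
following builds the two kinds of request without sending them. The
endpoint URL, the build_request helper and the "update" form parameter
are assumptions made up for illustration; only the "query" parameter
comes from SPARQL/Protocol 1.0.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint, for illustration only.
ENDPOINT = "http://example.com/sparql"


def build_request(text: str, is_update: bool) -> Request:
    """Build (but do not send) an HTTP request for a SPARQL operation."""
    if is_update:
        # Unsafe operation: carry the command in a POST body.
        # The "update" parameter name is an assumption, not taken from a spec.
        body = urlencode({"update": text}).encode("utf-8")
        return Request(
            ENDPOINT,
            data=body,
            headers={"Content-Type": "application/x-www-form-urlencoded"},
            method="POST",
        )
    # Safe operation: put the query in the URI so caches can key on it.
    return Request(ENDPOINT + "?" + urlencode({"query": text}), method="GET")
```

Keeping the query in the URI is what lets an intermediary cache treat
two identical queries as the same resource; the POST body is opaque to
it.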

As for GET requests that are too long, I'd prefer to see GET with a
message body. HTTP 1.1 does not prohibit this, though it is not common
practice. There aren't many conversations around it, but [2] seemed to
cover some of the issues.
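
For illustration, a GET carrying the query in its message body could be
constructed as below. This is only a sketch: the endpoint URL and the
build_get_with_body helper are hypothetical, using the
application/sparql-query media type here is an assumption, and servers
or intermediaries may well ignore or reject a body on GET.

```python
from urllib.request import Request

# Hypothetical endpoint, for illustration only.
ENDPOINT = "http://example.com/sparql"


def build_get_with_body(query: str) -> Request:
    """Build (but do not send) a GET request whose body holds the query."""
    return Request(
        ENDPOINT,
        data=query.encode("utf-8"),
        # Assumed media type for a raw query in the body.
        headers={"Content-Type": "application/sparql-query"},
        # Without an explicit method, urllib would switch to POST
        # as soon as data is supplied.
        method="GET",
    )
```

The URI stays short regardless of query length, but a cache keyed on
the URI alone can no longer tell two different queries apart.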

[1]  http://www.w3.org/2001/tag/doc/whenToUseGet.html#checklist
[2]  http://dret.typepad.com/dretblog/2007/10/http-get-with-m.html

Regards,
Paul

> - Steve
>
> On 29 Jul 2009, at 19:05, Paul Gearon wrote:
>
>> This email discharges my action
>> http://www.w3.org/2009/sparql/track/actions/55
>>
>>
>> The initial SPARQL language and protocol (SPARQL/Query 1.0,
>> SPARQL/Protocol 1.0) both describe read-only operations, which left no
>> change of state on the server. SPARQL/Update is expected to use the
>> SPARQL/Protocol as well, however it is designed to modify state on the
>> server, which in turn has implications for the protocol.
>>
>> The SPARQL/Protocol 1.0 interface describes bindings for SOAP[1]
>> and HTTP[2]. SOAP has no requirements on server state in response to
>> an operation, but HTTP does. Given that HTTP is such a commonly
>> implemented and used binding, this description will focus on
>> SPARQL/Protocol bound to HTTP.
>>
>> SPARQL/Protocol 1.0 defines the use of the GET and POST methods,
>> referred to as queryHttpGet and queryHttpPost respectively. No other
>> HTTP operations are described. queryHttpGet should be used in all
>> cases, except where the query exceeds practical limits, in which case
>> queryHttpPost is used, with the query provided in the body of the
>> request. In this way, queryHttpPost is being used as a fallback
>> operation for queryHttpGet, duplicating its functionality.
>>
>> RFC 2616 describes the HTTP GET method as "Safe", shown here from section
>> 9.1.1 [3]:
>>  "In particular, the convention has been established that the GET and
>>  HEAD methods SHOULD NOT have the significance of taking an action
>>  other than retrieval. These methods ought to be considered "safe".
>>  This allows user agents to represent other methods, such as POST, PUT
>>  and DELETE, in a special way, so that the user is made aware of the
>>  fact that a possibly unsafe action is being requested.
>>
>>  Naturally, it is not possible to ensure that the server does not
>>  generate side-effects as a result of performing a GET request; in
>>  fact, some dynamic resources consider that a feature. The important
>>  distinction here is that the user did not request the side-effects,
>>  so therefore cannot be held accountable for them."
>>
>>
>> SPARQL/Update operations are specifically designed to modify data on
>> the server, specifically:
>> * Create a graph
>> * Delete a graph
>> * Clear statements from a graph
>> * Create statements
>> * Delete statements
>>
>> The HTTP GET method is inappropriate for use with these operations as
>> they are all modifying operations, and are therefore "Unsafe". Some
>> other method is required to execute these operations via HTTP.
>>
>> It should be noted that the POST method has no restrictions on its
>> "Safety", so modifying operations are permitted with this method.
>>
>> Also of note is that many implementations extend SPARQL/Protocol 1.0
>> to provide read-write services. Some implement a REST interface to
>> provide the above actions through HTTP methods such as PUT and DELETE
>> (for instance, Sesame and Mulgara). Others accept commands in an
>> "Update language" such as HP's SPARQL/Update on the PUT and POST
>> methods (again, Sesame and Mulgara, among others). Note that the use
>> of POST in this context is not the same as described in
>> SPARQL/Protocol 1.0, as this protocol describes queryHttpPost while an
>> update operation is not a "query".
>>
>>
>> Options for modifying the existing protocol for SPARQL/Update:
>>
>> Option 1: Write an unrelated new protocol for the HTTP binding
>> SPARQL/Update to operate on.
>> This would appear to be duplicating some work, and still needs to
>> address how modifying operations need to be called.
>>
>> Option 2: All modifying operations go through POST.
>> While not mandated, the standard use of POST is to provide all data in
>> the body. This is how the query operation works. This may be inconvenient
>> for applications that may want to execute a simple operation that can
>> be encapsulated in the URI.
>>
>> Option 3: All modifying operations go through PUT with a fallback to
>> POST for large commands.
>> This is similar to the definition of query which uses GET and POST.
>> However, this is awkward if doing a PUT or a POST for a command that
>> is trying to delete resources, such as triples or graphs, as the
>> expected semantics of these methods is to add data to a server. Also,
>> PUT is more tightly defined than POST, expecting a resource in the
>> URI, while a different resource MUST be referred to with a different
>> URI.
>>
>> Option 4: Use appropriate methods for each action.
>> This means using PUT to create resources, DELETE to remove them, GET
>> and HEAD to query them. However, this is what the REST protocol will
>> be doing, and makes the notion of a SPARQL/Update language
>> superfluous.
>>
>>
>> Option 2 appears to offer the least difficulty. Are other options
>> available?
>>
>> Regards,
>> Paul Gearon
>>
>> [1] http://www.w3.org/TR/rdf-sparql-protocol/#query-bindings-soap
>> [2] http://www.w3.org/TR/rdf-sparql-protocol/#query-bindings-http
>> [3] http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.1.1
>>
>
> --
> Steve Harris
> Garlik Limited, 2 Sheen Road, Richmond, TW9 1AE, UK
> +44(0)20 8973 2465  http://www.garlik.com/
> Registered in England and Wales 535 7233 VAT # 849 0517 11
> Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
>
>
Received on Wednesday, 29 July 2009 19:49:38 UTC