Re: QUERY Verb Proposal

On Tue, 20 Jan 2015, henry.story@bblfish.net wrote:

>> One of the reasons the HTTP WG is very unlikely to standardize this is 
>> that there's so little technical advantage to doing this with a new 
>> verb (at least as far as I can see).  The main reasons would be queries 
>> longer than 2k, but your saved queries solve that, and allowing intermediate 
>> nodes to understand and cache based on query semantics, ... and MAYBE 
>> the Get option would allow that.
>
> Some of the disadvantages of your approach I can think of at present:
>
> • Queries are limited to < 2k
Source?

http://tools.ietf.org/html/rfc7230#section-3.1.1
<<
    HTTP does not place a predefined limit on the length of a
    request-line, as described in Section 2.5.  A server that receives a
    method longer than any that it implements SHOULD respond with a 501
    (Not Implemented) status code.  A server that receives a request-target
    longer than any URI it wishes to parse MUST respond with a 414 (URI Too
    Long) status code (see Section 6.5.12 of [RFC7231]).

    Various ad hoc limitations on request-line length are found in
    practice.  It is RECOMMENDED that all HTTP senders and recipients
    support, at a minimum, request-line lengths of 8000 octets.
>>
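To make that limit concrete, a client could check the encoded request-line against the 8000-octet minimum RECOMMENDED by RFC 7230 before choosing between a plain GET and a saved query. A rough sketch (the endpoint and parameter name are made up, not from this thread):

```python
from urllib.parse import quote

def fits_in_request_line(base, query, limit=8000):
    # request-line = method SP request-target SP HTTP-version CRLF
    target = base + "?query=" + quote(query, safe="")
    request_line = "GET " + target + " HTTP/1.1\r\n"
    return len(request_line.encode("utf-8")) <= limit

short = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"
long_query = "SELECT * WHERE { ?s ?p ?o }" * 500

print(fits_in_request_line("http://example.org/data", short))      # True
print(fits_in_request_line("http://example.org/data", long_query)) # False
```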

> • URLs are no longer opaque. You can see this by considering the following:
>   - if a cache wants to use the query URL to build up a partial representation of
>   the original document, it would need to parse the query URL. So we end up with mime
>   type information in the URL.
URL templates anyone? <http://tools.ietf.org/html/rfc6570>
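A server can advertise the query URL's shape with an RFC 6570 template instead of expecting caches to reverse-engineer opaque URLs. Here is a minimal sketch of form-style query expansion ({?name}) only, nowhere near a full RFC 6570 implementation:

```python
from urllib.parse import quote

def expand(template, **values):
    # Handle only the {?name} form-style query operator from RFC 6570.
    out = template
    for name, value in values.items():
        out = out.replace("{?%s}" % name,
                          "?%s=%s" % (name, quote(value, safe="")))
    return out

url = expand("http://example.org/doc{?query}",
             query="SELECT ?name WHERE { ?s foaf:name ?name }")
print(url)
```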

>   - If the cache sees the query URL but does not know that the original resource
>   is pointing to it, then it cannot build up the cache (and it cannot know this
>   without itself doing a GET on the original URL, because otherwise how would it deal
>   with lying resources that claim to be partial representations of other URLs?)
> • URL explosion: one ends up with a lot more URLs - and hence resources - than needed,
>  with most resources being just partial representations of other resources, instead of
>  slowly building up complete representations of resources.
Querying something on the web using URIs is hardly new.

> • caching
>  - etags don't work the same way on two resources with two URLs as with one
>    and the same URL
>  - the same is true with time-to-live etc.
>  - A PUT, PATCH, DELETE on the main resource won't tell the cache that it should
>    update all the thousands of other resources that are just views on the
>    original one
Why? This is an implementation detail server-side.
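For instance, the origin server can keep an index from each resource to the query URLs derived from it, and invalidate those views on any write to the resource. A sketch of that bookkeeping (class and method names are illustrative):

```python
class DerivedViewIndex:
    def __init__(self):
        self.views = {}   # resource URL -> set of derived query URLs

    def register(self, resource, query_url):
        # Record that query_url is a view derived from resource.
        self.views.setdefault(resource, set()).add(query_url)

    def invalidate(self, resource):
        # On PUT/PATCH/DELETE of resource, return every derived query URL
        # whose cached representation is now stale.
        return sorted(self.views.pop(resource, set()))

index = DerivedViewIndex()
index.register("http://example.org/doc", "http://example.org/doc?query=q1")
index.register("http://example.org/doc", "http://example.org/doc?query=q2")
stale = index.invalidate("http://example.org/doc")
print(stale)
```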

>  - The cache cannot itself respond to queries
>     A cache that is SPARQL aware should be able to respond
>     to a SPARQL query if it has received the whole representation of the
>     resource already (or indeed even a relevant partial representation).
>     This means that a client can send a QUERY to the resource via the cache
>     and the cache should be able to respond as well as the remote resource
> • Access Control
>    Now you have a huge number of URLs referring to resources with exactly the same
>    access control rules as the non-query resource, with all that can go wrong when
>    those resources are not clearly linked to the original
> • The notion of a partial representation of an original resource is much more opaque,
>  if not lost, without the QUERY verb. The system is no longer thinking: "x is a partial
>  representation of something bigger, of which it would be interesting to have a more
>  complete representation"
>
> Btw, do we have a trace of the arguments made in favor of PATCH? If so, it would be
> a case of seeing if we can invert some of those arguments to check whether we are
> missing any here.
>
>>
>> BTW, all my query work these days is on standing queries, not one-time queries.  As such, I think you don't actually want the query results to come back like this.   You want to POST to create a Query, and in that query you specify the result stream that the query results should come back on.  And then you GET that stream, which could include results from many different queries.   That's my research hypothesis, at least.
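An in-memory sketch of that standing-query pattern: POST creates a query resource bound to a result stream, matching results are tagged with their query, and a GET on the stream returns results from all queries multiplexed onto it. All names here are hypothetical, not from Sandro's design:

```python
import itertools

class StandingQueryService:
    def __init__(self):
        self.ids = itertools.count(1)
        self.queries = {}   # query id -> (query text, stream name)
        self.streams = {}   # stream name -> list of result records

    def create_query(self, query, stream):
        # Models POSTing a new standing query; returns the created resource.
        qid = next(self.ids)
        self.queries[qid] = (query, stream)
        self.streams.setdefault(stream, [])
        return "/queries/%d" % qid

    def publish(self, qid, result):
        # Called as matching data arrives; tag each result with its query.
        _, stream = self.queries[qid]
        self.streams[stream].append({"query": qid, "result": result})

    def get_stream(self, stream):
        # Models GETting the result stream.
        return list(self.streams.get(stream, []))

svc = StandingQueryService()
svc.create_query("SELECT ?s WHERE { ?s a :Event }", stream="updates")
svc.create_query("SELECT ?s WHERE { ?s a :Person }", stream="updates")
svc.publish(1, ":concert42")
svc.publish(2, ":alice")
print(svc.get_stream("updates"))
```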
>>
>>       -- Sandro
>>
>>>>
>>>> Assume the HTTP WG will say no for the first several years, after which maybe you can start to transition from GET to QUERY.
>>>>
>>>> Alternatively, resources can signal exactly which versions of the QUERY spec they implement, and the QUERY operation can include a parameter saying which version of the query spec is to be used. But this won't give you caching like GET.   So better to just use that signaling for constructing a GET URL.
>>> Gimme a little more to help me understand how this would work.
>>>>
>>>>       -- Sandro
>>>>
>>>>
>>>
>>>
>>
>>
>
> Social Web Architect
> http://bblfish.net/
>
>
>

-- 
Baroula que barouleras, au tiéu toujou t'entourneras.

         ~~Yves

Received on Tuesday, 20 January 2015 13:22:50 UTC