Re: QUERY Verb Proposal from Kingsley Idehen on 2015-01-20 (public-ldp-wg@w3.org from January 2015)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Tue, 20 Jan 2015 11:15:13 -0500
To: public-ldp-wg@w3.org
Message-ID: <54BE7F11.3010604@openlinksw.com>
On 1/20/15 8:29 AM, ashok malhotra wrote:
> We could define a link header that pointed to query specification.
> Then we could use this header on a GET.

Please forget about Link Headers; that's basically hard-coding, and it 
won't make its way through standardization.

If you want to get going without the inertia of a standards-body 
process, simply look to Link: relations.

You can use Link: relations to express entity relations. For all 
intents and purposes, this is just another RDF language notation.

Examples:

My Personal Data Space:

curl -I http://kingsley.idehen.net/~kidehen/Public/
HTTP/1.1 200 OK
Server: Virtuoso/07.10.3211 (Linux) x86_64-redhat-linux-gnu  VDB
Connection: Keep-Alive
Date: Tue, 20 Jan 2015 16:09:40 GMT
Accept-Ranges: bytes
Content-Type: text/turtle
MS-Author-Via: DAV, SPARQL
Allow: GET,HEAD,POST,PUT,DELETE,OPTIONS,PROPFIND,PROPPATCH,COPY,MOVE,LOCK,UNLOCK,TRACE,PATCH
Accept-Patch: application/sparql-update
Accept-Post: text/turtle,text/n3,text/nt
Vary: Accept,Origin,If-Modified-Since,If-None-Match
Link: <http://www.w3.org/ns/ldp#Resource>; rel="type"
Link: <http://www.w3.org/ns/ldp#BasicContainer>; rel="type"
Link: <?p=1>; rel="first"
Link: <?p=1>; rel="last"
Link: <http://kingsley.idehen.net/DAV/home/kidehen/Public,meta>; rel="meta"
Link: <http://kingsley.idehen.net/DAV/home/kidehen/Public,acl>; rel="acl"
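
To make the "just another RDF notation" reading concrete, here is a 
minimal Python sketch that turns Link: header values like the ones 
above into (subject, relation, object) triples. The function name 
link_triples and the simplified regular expressions are illustrative 
assumptions, not part of any spec or library. The relation name is kept 
as the raw rel/rev token; mapping registered names such as "type" onto 
IRIs is left open. Targets that themselves contain ">" are not handled.

```python
import re
from urllib.parse import urljoin

# One link-value: <target> followed by zero or more ;name=value params.
# Simplified on purpose (assumption): no escaped quotes, no ">" in targets.
LINK = re.compile(r'<([^>]*)>\s*((?:;\s*\w+\s*=\s*(?:"[^"]*"|[^\s;,]+)\s*)*)')
PARAM = re.compile(r';\s*(\w+)\s*=\s*(?:"([^"]*)"|([^\s;,]+))')


def link_triples(context, header_values):
    """Read Link: header values as triples about the `context` resource.

    rel="x" yields (context, x, target); rev="x" yields (target, x, context).
    Other parameters (type, title, ...) are ignored in this sketch.
    """
    triples = []
    for value in header_values:
        for target, params in LINK.findall(value):
            # Relative targets such as <?p=1> resolve against the context URL.
            target = urljoin(context, target)
            for name, quoted, bare in PARAM.findall(params):
                relations = (quoted or bare).split()
                if name == "rel":
                    triples += [(context, r, target) for r in relations]
                elif name == "rev":
                    triples += [(target, r, context) for r in relations]
    return triples


if __name__ == "__main__":
    ctx = "http://kingsley.idehen.net/~kidehen/Public/"
    headers = [
        '<http://www.w3.org/ns/ldp#BasicContainer>; rel="type"',
        '<?p=1>; rel="first"',
    ]
    for triple in link_triples(ctx, headers):
        print(triple)
```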

DBpedia:

curl -IL http://dbpedia.org/resource/Linked_Data

*After redirects:*

HTTP/1.1 200 OK
Date: Tue, 20 Jan 2015 16:12:17 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 46853
Connection: keep-alive
Vary: Accept-Encoding
Server: Virtuoso/07.10.3211 (Linux) x86_64-redhat-linux-gnu  VDB
Expires: Tue, 27 Jan 2015 16:12:17 GMT
Link: <http://dbpedia.org/data/Linked_data.rdf>; rel="alternate"; type="application/rdf+xml"; title="Structured Descriptor Document (RDF/XML format)",
  <http://dbpedia.org/data/Linked_data.n3>; rel="alternate"; type="text/n3"; title="Structured Descriptor Document (N3/Turtle format)",
  <http://dbpedia.org/data/Linked_data.json>; rel="alternate"; type="application/json"; title="Structured Descriptor Document (RDF/JSON format)",
  <http://dbpedia.org/data/Linked_data.atom>; rel="alternate"; type="application/atom+xml"; title="OData (Atom+Feed format)",
  <http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+<http://dbpedia.org/resource/Linked_data>&format=text%2Fcsv>; rel="alternate"; type="text/csv"; title="Structured Descriptor Document (CSV format)",
  <http://dbpedia.org/data/Linked_data.ntriples>; rel="alternate"; type="text/plain"; title="Structured Descriptor Document (N-Triples format)",
  <http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+<http://dbpedia.org/resource/Linked_data>&output=application%2Fmicrodata%2Bjson>; rel="alternate"; type="application/microdata+json"; title="Structured Descriptor Document (Microdata/JSON format)",
  <http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+<http://dbpedia.org/resource/Linked_data>&output=text%2Fhtml>; rel="alternate"; type="text/html"; title="Structured Descriptor Document (Microdata/HTML format)",
  <http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+<http://dbpedia.org/resource/Linked_data>&output=application%2Fld%2Bjson>; rel="alternate"; type="application/ld+json"; title="Structured Descriptor Document (JSON-LD format)",
  <http://dbpedia.org/resource/Linked_data>; rel="http://xmlns.com/foaf/0.1/primaryTopic",
  <http://dbpedia.org/resource/Linked_data>; rev="describedby",
  <http://mementoarchive.lanl.gov/dbpedia/timegate/http://dbpedia.org/page/Linked_data>; rel="timegate"
Cache-Control: max-age=604800
Accept-Ranges: bytes
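
The GET-based design Sandro outlines in the quoted thread below relies 
on the same mechanism: the resource advertises a Link: relation and the 
client builds a query URL from its target. A rough Python sketch of 
that last step follows. The r, t and q parameter names and the 
query-via relation come from his message; the function name query_url, 
the example resource URL, and the "sparql" query-type value are 
assumptions made for illustration only.

```python
from urllib.parse import urlencode


def query_url(query_via, resource, query_type, query):
    """Build the GET URL for a query against `resource`.

    `query_via` is the target of the Link rel=query-via relation returned
    by a HEAD on the original resource. The r/t/q parameter names follow
    the sketch in the quoted thread; nothing here is standardized.
    """
    params = urlencode({"r": resource, "t": query_type, "q": query})
    # Append to an existing query string if the endpoint already has one.
    separator = "&" if "?" in query_via else "?"
    return f"{query_via}{separator}{params}"


if __name__ == "__main__":
    # Hypothetical values: the resource URL and the "sparql" type are made up.
    print(query_url("http://q.example", "http://example.org/doc",
                    "sparql", "ASK {}"))
```

Because the whole query is percent-encoded into the URL, the length 
limit raised in the thread applies to the encoded form, not to the raw 
query text.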


Kingsley

>
> All the best, Ashok
>
> On 1/20/2015 3:27 AM, henry.story@bblfish.net wrote:
>>> On 20 Jan 2015, at 01:36, Sandro Hawke <sandro@w3.org> wrote:
>>>
>>> On 01/19/2015 05:30 PM, ashok malhotra wrote:
>>>> Sandro, some questions inline ...
>>>>
>>>> All the best, Ashok
>>>>
>>>> On 1/19/2015 5:19 PM, Sandro Hawke wrote:
>>>>> I suggest the spec say how to do it with GET as well as QUERY, and 
>>>>> what exactly the differences are.
>>>> Sorry, how would you do it with GET?  Are you talking about stored 
>>>> queries?
>>> Even with not-stored queries, you can still just turn them into a 
>>> GET, right?   (assuming the query is under 2k, at least, although 
>>> that IE limit might not apply to LDP situations)
>>>
>>> One simple design:
>>>
>>> 1.  Do a HEAD
>>> 2.  Response includes Link: <http://q.example>; rel=query-via
>>> 3.  GET http://q.example?r=url-of-original-resource&t=query-type&q=your-query
>>>
>>> It's approximately the interchange you'll need with QUERY:
>>>
>>> 1.  Do a HEAD
>>> 2.  Get back some response saying QUERY is supported
>>> 3.  QUERY url-of-original-resource, body=your query
>>>
>>> One of the reasons the HTTP WG is very unlikely to standardize this 
>>> is that there's so little technical advantage to doing this with a 
>>> new verb (at least as far as I can see).   The main reasons would be 
>>> queries > 2k, but your saved queries solve that, and allowing 
>>> intermediate nodes to understand and cache based on query semantics, 
>>> ... and MAYBE the GET option would allow that.
>> Some of the disadvantages of your approach I can think of at present:
>>
>> • Queries are limited to < 2k
>> • URLs are no longer opaque. You can see this by considering the following:
>>     - If a cache wants to use the query URL to build up a partial
>>       representation of the original document, it would need to parse the
>>       query URL. So we end up with mime type information in the URL.
>>     - If the cache sees the query URL but does not know that the original
>>       resource is pointing to it, then it cannot build up the cache (and it
>>       cannot know this without itself doing a GET on the original URL,
>>       because otherwise how would it deal with lying resources that claim
>>       to be partial representations of other URLs?)
>> • URL explosion: one ends up with a lot more URLs, and hence resources,
>>    than needed, with most resources being just partial representations of
>>    resources, instead of slowly building up complete representations of
>>    resources.
>> • Caching
>>    - ETags don't work the same way on two resources with two URLs as with
>>      one and the same URL
>>    - The same is true of time-to-live etc.
>>    - A PUT, PATCH, or DELETE on the main resource won't tell the cache that
>>      it should update all the thousands of other resources that are just
>>      views on the original one
>>    - The cache cannot itself respond to queries.
>>      A cache that is SPARQL aware should be able to respond to a SPARQL
>>      query if it has already received the whole representation of the
>>      resource (or indeed even a relevant partial representation).
>>      This means that a client can send a QUERY to the resource via the
>>      cache, and the cache should be able to respond as well as the remote
>>      resource
>> • Access Control
>>      Now you have a huge number of URLs referring to resources with exactly
>>      the same access control rules as the non-query resource, with all that
>>      can go wrong when those resources are not clearly linked to the
>>      original
>> • The notion of a partial representation of an original resource is much
>>    more opaque, if not lost, without the QUERY verb. The system is no longer
>>    thinking: "x is a partial representation of something bigger, that it
>>    would be interesting to have a more complete representation of"
>>
>> Btw. Do we have a trace of the arguments made in favor of PATCH? Then it
>> would be a case of seeing if we can invert some of those arguments to see
>> if we are missing any here.
>>
>>> BTW, all my query work these days is on standing queries, not 
>>> one-time queries.  As such, I think you don't actually want the query 
>>> results to come back like this.   You want to POST to create a 
>>> Query, and in that query you specify the result stream that the 
>>> query results should come back on.  And then you GET that stream, 
>>> which could include results from many different queries.   That's my 
>>> research hypothesis, at least.
>>>
>>>        -- Sandro
>>>
>>>>> Assume the HTTP WG will say no for the first several years, after 
>>>>> which maybe you can start to transition from GET to QUERY.
>>>>>
>>>>> Alternatively, resources can signal exactly which versions of the 
>>>>> QUERY spec they implement, and the QUERY operation can include a 
>>>>> parameter saying which version of the query spec is to be used. 
>>>>> But this won't give you caching like GET.   So better to just use 
>>>>> that signaling for constructing a GET URL.
>>>> Gimme a little more to help me understand how this would work.
>>>>>        -- Sandro
>>>>>
>>>>>
>>>>
>>>
>> Social Web Architect
>> http://bblfish.net/
>>
>
>
>


-- 
Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this

Received on Tuesday, 20 January 2015 16:15:42 UTC