Re: Proposed HTTP SEARCH method update - SEARCH is to GET what PATCH is to PUT from henry.story@bblfish.net on 2015-04-27 (ietf-http-wg@w3.org from April to June 2015)

From: <henry.story@bblfish.net>
Date: Mon, 27 Apr 2015 10:52:01 +0200
To: Julian Reschke <julian.reschke@gmx.de>
Cc: ashok malhotra <ashok.malhotra@oracle.com>, Mark Nottingham <mnot@mnot.net>, James M Snell <jasnell@gmail.com>, ietf-http-wg@w3.org
Message-Id: <D944EB33-F5C1-4137-8DD1-9B29E79DE008@bblfish.net>
> On 27 Apr 2015, at 08:59, Julian Reschke <julian.reschke@gmx.de> wrote:
> 
> On 2015-04-27 07:36, henry.story@bblfish.net wrote:
>> ...
>> It would help if you explained how you disagree with the arguments put forward.
>> Let's try the Socratic method then:
>> 
>> (1) Do you agree that SEARCH is (should be) a method that is applied to the resource on which the request is made?
> 
> Depends on the definition of "applied". You might want to read <http://greenbytes.de/tech/webdav/rfc5323.html#rfc.section.2.2.1>.

I suppose I don't find that notion of "search arbiter" very satisfying in WebDAV. It is defined just as "A resource that supports the SEARCH method". My feeling is that by specifying SEARCH more generally in its own RFC we could do a lot better than this, and have something that ties in more coherently with the other methods. This is why I'd like to strengthen the following parallel: SEARCH is to GET what PATCH is to PUT. If we can get this to work then we build on the very well understood intuitions of GET and PUT, which are at the core of the Web, in a clearly RESTful manner.

Let us assume we do find a way to place SEARCH on more solid foundations like this. We'd still need to see if we can satisfy the intuitions that WebDAV was trying to get at, but now in a RESTful manner. So I'll try to do both of these below.

WebDAV seems to be saying something like this: an arbiter is an agent that can make requests on a number of other resources and respond to requests about them. This seems to define a resource that is dangerously close to a remote procedure call; it has a SOAPy feeling to it. Can we do this in a more declarative fashion?

I think so. We just need to notice that a resource - be it as simple as a 1993 web page - could make statements (true or false) about other resources. Most web pages do this. A trivial but exceedingly widely used example is the set of web pages that point to other web pages with statements that they contain a funny cat picture. In more formal logical terms we can have belief statements that state what others believe, e.g. statements of the form S believes that P, S knows that P, S wishes that P, etc. These are known as belief contexts, and the more general notion has been described for example by Ramanathan V. Guha [1] - now VP of engineering at Google - in his thesis "Contexts, a Formalisation and some Applications" [2]. Why do I mention this? Because once we see that a resource's state may contain statements about other resources, as shown by the representation returned on doing a GET on that resource, then we can think declaratively about the SEARCH method as giving a partial representation of the state of that resource, which may contain partial representations of other resources. And so we should not need this notion of a "search arbiter" anymore.

It may help if I build up from simple to more complex examples.

1) Note that the CSV examples used in draft-snell-search-method-00 [3] are pure factual statements that contain no contexts. I would suggest the draft first show a GET representation of that resource, perhaps a partial representation, to emphasize that the resource may be very large. The SEARCH query on that resource should then return information that is a partial representation of that resource, and will not contain information about another resource, since the GET on that resource does not.

2) A slightly more complex example may be an Atom Feed, for example the one from RFC 4287, which contains information about other resources. The simplest example there has just one entry. A GET on such a feed returns:

<?xml version="1.0" encoding="utf-8"?>
   <feed xmlns="http://www.w3.org/2005/Atom">

     <title>Example Feed</title>
     <link href="http://example.org/"/>
     <updated>2003-12-13T18:30:02Z</updated>
     <author>
       <name>John Doe</name>
     </author>
     <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>

     <entry>
       <title>Atom-Powered Robots Run Amok</title>
       <link href="http://example.org/2003/12/13/atom03"/>
       <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
       <updated>2003-12-13T18:30:02Z</updated>
       <summary>Some text.</summary>
     </entry>
   </feed>

If I SEARCH this with a query language such as XQuery I could end up with information about another resource, as seen by the feed resource. Note that the feed resource may be out of date, and in that case the search on that state will also be out of date. For example I could send a SEARCH that asks for the titles of all the entries, which would return something like

<result>
  <title>Atom-Powered Robots Run Amok</title>
</result>

(A more complete example would be useful.)
It seems that the client should be able to get the same information by:
1) doing a GET
2) applying the Query directly on the returned result
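
To make the two steps above concrete, here is a minimal sketch in Python. It assumes the client already holds the bytes returned by the GET, and it uses the standard library's limited XPath support as a stand-in for XQuery. The function name entry_titles is hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace prefix mapping for the Atom elements used in the path below.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Step 1: the representation a GET on the feed would have returned
# (the single-entry feed from RFC 4287, held locally as bytes).
FEED = b"""<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <link href="http://example.org/"/>
  <updated>2003-12-13T18:30:02Z</updated>
  <author>
    <name>John Doe</name>
  </author>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link href="http://example.org/2003/12/13/atom03"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <summary>Some text.</summary>
  </entry>
</feed>
"""


def entry_titles(feed_xml):
    """Step 2: apply the query ("titles of all entries") to the local copy."""
    root = ET.fromstring(feed_xml)
    # Only titles nested inside <entry> match; the feed's own title does not.
    return [title.text for title in root.findall("atom:entry/atom:title", ATOM_NS)]


titles = entry_titles(FEED)
```

If SEARCH behaves as argued here, a server answering the same query against the same state should return the same titles as this local evaluation.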

3) A search engine like Google has a resource whose state is a practically infinitely long statement about what its crawler has seen on the web. Note that for some resources this may differ from what others see when they do a GET on those resources, because the resources may have changed in the meantime, because the resource may be individualised for each user, or because the web site may lie to Google to get a better search ranking. That is, when Google makes a statement about what other pages on the web say, it is quoting what those sites told it (contexts are a formal quoting mechanism). When one does a SEARCH on that resource one gets back a view of that resource. Clearly doing a full GET on Google is practically impossible and not desirable. But logically the same thing is happening as above. The resource we would be SEARCHing on Google has as its state information about what it believes the states of other resources on the web are. By SEARCHing that resource we get the equivalent of what we would get were we to first download the state of their database and run the query on it ourselves.

I hope this helps give weight to my proposed tag line "SEARCH is to GET what PATCH is to PUT", and that this shows how we can get a more RESTful description of SEARCH and thereby put it on a more solid foundation.



[1] http://www.guha.com/cv.html - VP of engineering at Google
[2] "Contexts, a Formalisation and some Applications" http://www-formal.stanford.edu/buvac/guha-thesis.ps
[3] http://datatracker.ietf.org/doc/draft-snell-search-method/
> 
>>   (a) if yes, then is the result usually the full representation returned as with GET? Or is it a partial representation?
>>   (b) if no, how does the result of a SEARCH relate to the resource on which the method was applied?
> 
> How and whether it relates depends on the request payload format and the actual request.

And on the state of the resource. My point is: SEARCH results should depend on only two things:

1) the state of the resource
2) the query

such that if the client ( or the cache ) had the full state of the resource, it could apply the query to that state and get the same result back.
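
A rough sketch of that property in Python, under stated assumptions: the CSV data, the Resource class and the apply_query function are all invented for illustration and are not taken from the draft. The "query" here is just a match on one column.

```python
import csv
import io

# Hypothetical resource state, invented for this illustration.
STATE = (
    "surname,givenname,email\n"
    "Smith,John,john.smith@example.org\n"
    "Jones,Sally,sally.jones@example.com\n"
    "Dubois,Camille,camille.dubois@example.net\n"
)


def apply_query(state, surname):
    """Evaluate the query against a state: rows whose surname matches."""
    rows = csv.DictReader(io.StringIO(state))
    return [row for row in rows if row["surname"] == surname]


class Resource:
    """Stand-in for a server-side resource that answers GET and SEARCH."""

    def __init__(self, state):
        self._state = state

    def get(self):
        # GET: the full representation of the current state.
        return self._state

    def search(self, surname):
        # SEARCH: depends only on the current state and the query.
        return apply_query(self._state, surname)


resource = Resource(STATE)
server_side = resource.search("Jones")              # SEARCH on the resource
client_side = apply_query(resource.get(), "Jones")  # GET, then query locally
```

Both paths evaluate the same function over the same state, so a client or cache holding the full representation could answer the query itself.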

Henry

> 
> Best regards, Julian
> 

PS. Thinking about 206, I think it would make sense for SEARCH when the response is only a partial answer to the SEARCH query, so that RFC 7233 would allow one to page through huge SEARCH results.
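
A sketch of what such paging could look like, with loud caveats: RFC 7233 defines Range handling for GET only, so applying it to SEARCH is an assumption of this proposal, not existing behaviour. The function slice_result is hypothetical, and it handles only the "bytes=first-last" and "bytes=first-" forms of a single range.

```python
def slice_result(full_result, range_header):
    """Return (status, content_range, body) for one page of a SEARCH result.

    Assumption: the serialized SEARCH result is treated as a byte sequence,
    exactly as a GET representation would be under RFC 7233.
    Suffix ranges ("bytes=-500") and multiple ranges are not handled here.
    """
    total = len(full_result)
    unit, _, spec = range_header.partition("=")
    if unit.strip() != "bytes":
        # Unknown range unit: ignore the Range header and send everything.
        return 200, None, full_result

    first_text, _, last_text = spec.partition("-")
    first = int(first_text)
    last = int(last_text) if last_text else total - 1

    if first >= total:
        # Nothing to serve: 416 (Range Not Satisfiable) with the total length.
        return 416, "bytes */%d" % total, b""
    if last < first:
        # Invalid range specification: ignore it and send everything.
        return 200, None, full_result

    last = min(last, total - 1)  # clamp to the end of the result
    content_range = "bytes %d-%d/%d" % (first, last, total)
    return 206, content_range, full_result[first:last + 1]


# Stand-in for a large serialized SEARCH result.
result = b"abcdefghijklmnopqrstuvwxyz"
first_page = slice_result(result, "bytes=0-9")
last_page = slice_result(result, "bytes=20-")
```

A client would keep requesting successive ranges until the last byte position in Content-Range reaches the total length minus one.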


Social Web Architect
http://bblfish.net/
Received on Monday, 27 April 2015 08:52:31 UTC