Re: Proposed HTTP SEARCH method update

On 26/04/2015 5:40 p.m., henry.story@bblfish.net wrote:
> 
>> On 25 Apr 2015, at 20:56, Mark Nottingham <mnot@mnot.net> wrote:
>> 
>> 
>>> On 26 Apr 2015, at 6:49 am, henry.story@bblfish.net wrote:
>>> 
>>> If search is cacheable when the Content-Location of the response
>>> matches the effective request uri, how does that show that the
>>> SEARCH response is not cacheable?
>> 
>> It is cacheable — in the sense that it can be stored. However, that
>> stored response can only be used to satisfy future GET requests to
>> the same URI — which is probably not what you want.
>> 
>> HTTP caching operates upon representations of resource state, which
>> means accessing the contents of the cache is always GET (or HEAD).
> 
> Great!  We are getting closer to the core of the problem, which I had
> tried to address earlier :-)
> 
> I agree that HTTP caching should operate upon representations of
> resource state. My point is that SEARCH always returns a partial
> representation of the resource state, so that it should be cacheable
> too. This means improving the caching stack so that it knows how to
> update partial representations.
> 
> To illustrate the different cases, let us take the example from
> http://datatracker.ietf.org/doc/draft-snell-search-method/ . The
> resource <http://example.org/contacts> is a small table of contacts. 
> Let us imagine that
> 
> A. GET followed by SEARCH
> -------------------------
> 
> 1. James makes a GET request on <http://example.org/contacts> via a
>    cache C, and C caches the returned representation with etag1 (and
>    the same Location header).
> 2. Ashok then makes a conditional SEARCH request on
>    <http://example.org/contacts> via the same cache C, with etag1 too.

What you describe here is a GET being cached and then used to answer a
SEARCH.

This has no bearing on whether SEARCH is cacheable. The response to the
SEARCH remains itself regardless of whether it was generated from the
origin resource or a *complete* copy of the origin resource.


> 
> B. SEARCH followed by SEARCH
> ----------------------------
> 
> 1. a JS Agent in a browser does not know how large </contacts> is,

Therefore the cache cannot answer SEARCH C requesting 5 records, if the
SEARCH A only caused 3 records to be cached.

This means SEARCH responses vary dynamically. The method alone
cannot declare the response cacheable, since that depends on other
case-specific criteria.



> and only needs to render a couple of fields from that table, so it
> sends a SEARCH for those fields to the server example.org via its
> local in-browser cache BC.
> 2. The same JS Agent later needs the same two fields again for some
>    different purpose, and sends the same query again using a
>    conditional GET with the same etag.

Wrong. It would need to send SEARCH again with the same variant
negotiation details (i.e. query and language). *If* the cache has for any
reason discarded its earlier copy, the server will receive the
request. You really don't want that to be a GET on a large database.


SEARCH as a method has caching needs and semantics similar to a
GET+Range request, except that no byte offsets are defined in the
SEARCH request to make it easy for caches to identify byte
overlaps with stored responses.

As such, Range is the best way to describe the caching problem with
SEARCH. Caching Range responses is possible, but a 206 response that is
10 bytes long cannot be used to satisfy a single Range of 20 bytes.

With SEARCH this problem is slightly worse because we don't know whether
or where the missing records might be in the resource. We are forced
to send the whole SEARCH to the server to find out, which means the new
reply effectively replaces any cached one and there is no gain.
At least with Range the cache could have optimized the backend query to
ask for bytes 11-20.
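
To make the Range comparison concrete, here is a rough sketch of how a
cache might work out which bytes it still needs. The function name and
the tuple representation are made up for illustration. Note that HTTP
byte positions are zero-based, so "the first 10 bytes" is 0-9 and the gap
loosely described above as bytes 11-20 comes out as 10-19.

```python
def missing_ranges(cached, wanted):
    """Return the inclusive byte ranges in `wanted` not covered by `cached`.

    Both arguments are (first, last) tuples of zero-based byte positions.
    Hypothetical helper, not taken from any real cache implementation.
    """
    c_first, c_last = cached
    w_first, w_last = wanted
    # No overlap at all: the whole wanted range must be fetched.
    if c_last < w_first or c_first > w_last:
        return [wanted]
    gaps = []
    # Bytes wanted before the start of the cached range.
    if w_first < c_first:
        gaps.append((w_first, c_first - 1))
    # Bytes wanted after the end of the cached range.
    if w_last > c_last:
        gaps.append((c_last + 1, w_last))
    return gaps


# Cache holds the first 10 bytes; the client asks for the first 20.
print(missing_ranges((0, 9), (0, 19)))  # [(10, 19)]
```

Nothing equivalent can be computed for SEARCH, because a stored SEARCH
response does not say where its records sit within the resource.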


*If* we make the assumption that SEARCH is cacheable *somehow*, the
cache will need to compare the method, URI and all other negotiation
criteria defined by SEARCH as being relevant before it serves up the
cached response. Any non-match is a different SEARCH query, requiring the
backend server to supply a new response.
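
As a sketch only, that comparison could be expressed as a cache key. This
assumes the query travels in the request body and that the relevant
negotiation criteria are passed in as a dict of request headers; both are
assumptions on my part, and the function name is hypothetical.

```python
import hashlib


def search_cache_key(method, uri, body, negotiation):
    """Build a lookup key from everything assumed to select the response.

    `body` is the raw query bytes; `negotiation` is a dict of the request
    headers the cache treats as relevant (an assumption, not from a spec).
    """
    # Sort the headers so their order does not change the key.
    headers = tuple(sorted((k.lower(), v) for k, v in negotiation.items()))
    body_digest = hashlib.sha256(body).hexdigest()
    return (method.upper(), uri, body_digest, headers)


uri = "http://example.org/contacts"
lang = {"Accept-Language": "en"}
first = search_cache_key("SEARCH", uri, b"fields=a,b", lang)
again = search_cache_key("SEARCH", uri, b"fields=a,b", lang)
other = search_cache_key("SEARCH", uri, b"fields=a,b,c", lang)
print(first == again)  # True: identical query, the stored response may be used
print(first == other)  # False: different query, must go to the backend
```

A GET to the same URI produces a different key as well, which is the
point made earlier about the conditional GET.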


I like your proposal that the response to SEARCH be treated as a partial
response, to the point that I also believe the draft should indicate that
the preferred response status code is 206, with a Content-Range header
indicating the positions or numbers (in bytes, or maybe a new range
type) of the fetched records within the base resource.

The 206 status is already defined in RFC 7233 with suitable criteria to
cover the SEARCH cases. Its wording also allows a SEARCH "not
cacheable by default" definition to be ignored; that could be improved
by having SEARCH mention that 206 responses might be cacheable.
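
For illustration, the Content-Range value on such a 206 might be built
along these lines. The "records" unit is purely hypothetical (only
"bytes" is defined by RFC 7233), and the helper name is invented.

```python
def content_range(unit, first, last, total=None):
    """Format a Content-Range value such as 'bytes 0-9/20'."""
    if first > last:
        raise ValueError("first position must not exceed last position")
    # '*' is how RFC 7233 marks an unknown complete length for byte
    # ranges; reusing it for another unit is an assumption.
    length = "*" if total is None else str(total)
    return f"{unit} {first}-{last}/{length}"


print(content_range("bytes", 0, 9, 20))  # bytes 0-9/20
print(content_range("records", 0, 2))    # records 0-2/*  (hypothetical unit)
```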

Amos

Received on Sunday, 26 April 2015 07:48:58 UTC