
Re: draft-ietf-httpbis-p6-cache-06

From: Adrien de Croy <adrien@qbik.com>
Date: Wed, 27 May 2009 15:55:06 +1200
Message-ID: <4A1CB99A.7090106@qbik.com>
To: Brian Smith <brian@briansmith.org>
CC: 'HTTP Working Group' <ietf-http-wg@w3.org>


Brian Smith wrote:
> Adrien de Croy wrote:
>   
>> This leaves problems if there's no Vary header but content negotiation
>> was used.  It's not possible to reliably heuristically determine if
>> content-negotiation was used without the Vary header.  Vary is only a
>> SHOULD level.  Maybe it should be a MUST level?
>>     
>
> As you noted below, this rule applies:
>
>    Caches MUST use the most recent response (as determined by the Date
>    header) when more than one suitable response is stored.  They can
>    also forward a request with "Cache-Control: max-age=0" or "Cache-
>    Control: no-cache" to disambiguate which response to use.
>
> Using this rule, along with other rules for caches, results in predictable
> behavior, right?
>   
Forcing revalidation doesn't, to my mind, solve the problem of which
ETags (say) to put in an If-None-Match header.  I guess you then put
them all in, but that just gets the origin server to re-run the
negotiation algorithm.  At least the result is correct then, though.
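
As a rough illustration of the two options discussed above (picking the
stored response with the most recent Date, or listing every known ETag
in If-None-Match), here is a minimal Python sketch.  The StoredResponse
type and both helper names are hypothetical and not taken from the draft.

```python
from dataclasses import dataclass
from email.utils import parsedate_to_datetime
from typing import Optional


@dataclass
class StoredResponse:
    # Hypothetical cache entry holding only the fields this sketch needs.
    date: str             # value of the Date header
    etag: Optional[str]   # value of the ETag header, quotes included


def most_recent(candidates):
    """Pick the stored response with the latest Date header."""
    return max(candidates, key=lambda r: parsedate_to_datetime(r.date))


def if_none_match(candidates):
    """List every known ETag so the origin server can select among them."""
    return ", ".join(r.etag for r in candidates if r.etag)


stored = [
    StoredResponse("Tue, 26 May 2009 10:00:00 GMT", '"en-v1"'),
    StoredResponse("Wed, 27 May 2009 03:00:00 GMT", '"fr-v1"'),
]
print(most_recent(stored).etag)   # "fr-v1"
print(if_none_match(stored))      # "en-v1", "fr-v1"
```

Note that most_recent() returns the newest entry whatever its language,
which is exactly the concern when no Vary header is available.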

>> It says if there are several stored representations serve the one with
>> the most recent Date header (MUST level), but this may not be the
>> appropriate one if Vary headers aren't available, and you are say
>> selecting based on language.
>>     
>
> The server needs to provide a Vary header in that case.
>   
Right, hence my comment that maybe Vary should be a MUST.  But I'm sure
there would be a multitude of problems if that happened.

So I guess if there's no Vary, what do we fall back to?

>   
>> S 2.5 Request methods that invalidate
>> -------------------------------------
>> I don't understand how a URI can be compared to a Content-Location or
>> Location header and match yet the host part be different.  Surely to
>> match the host part must be the same?  It's not clear to me what's
>> being matched with what.
>>     
>
> Example 1:
> The Request-URI is  http://example.ORG/foo.
> Content-Location in the response is http://example.COM/bar.
> In this case, we shouldn't invalidate cached representations of
> http://example.COM/bar because the request-URI's host doesn't match the host
> of the Content-Location header.
>
> Example 2:
> The Request-URI is http://example.ORG/foo.
> Content-Location is http://example.ORG/bar.
> In this case, we should invalidate both http://example.ORG/foo and
> http://example.ORG/bar.
>
> Basically, this is trying to implement a "same origin" policy for cache
> invalidation.
>
>   
Thanks - that makes more sense now, especially after reading the section
in Part 3 on Content-Location as well.
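
The two examples above can be sketched as a host comparison.  This is
only an illustration: the function name is hypothetical, and it compares
host names alone (ignoring port and scheme), which is a simplification of
whatever "same origin" ends up meaning in the draft.

```python
from urllib.parse import urljoin, urlsplit


def uris_to_invalidate(request_uri, content_location=None):
    """Return the URIs a cache would invalidate after an unsafe request."""
    targets = [request_uri]
    if content_location:
        # Content-Location may be relative; resolve it against the request URI.
        absolute = urljoin(request_uri, content_location)
        # Host names are case-insensitive; urlsplit().hostname is lowercased.
        if urlsplit(absolute).hostname == urlsplit(request_uri).hostname:
            targets.append(absolute)
    return targets


# Example 1: hosts differ, so only the Request-URI is invalidated.
print(uris_to_invalidate("http://example.ORG/foo", "http://example.COM/bar"))
# Example 2: hosts match, so both URIs are invalidated.
print(uris_to_invalidate("http://example.ORG/foo", "http://example.ORG/bar"))
```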

>> Wrt POST (or any method).  If the response to a POST is marked
>> explicitly by the origin server as cacheable, why should a subsequent
>> POST invalidate that contrary to other Cache-control directives?
>> Surely this should only apply if the original method was not POST?
>>     
>
> See the discussion about whether the method is part of the cache key. Caches
> really need to be very conservative here (that is, MUST invalidate) as there
> seems to be a lot of disagreement amongst implementers and standardistas
> regarding this issue.
>   
The basis I've been working on is that the method should be part of the
key unless it's GET or HEAD; otherwise you couldn't reply to a HEAD with
the cached results of a previous GET, as you'd get an index miss.

In that case there's no ambiguity about the other methods, so we'd be 
back to arguably needlessly invalidating valid stored representations.
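
A minimal sketch of that keying scheme, assuming a cache indexed by a
tuple (the function name and key shape are hypothetical, not from the
draft):

```python
def cache_key(method, uri):
    """Build a lookup key: GET and HEAD share one key, other methods do not."""
    if method in ("GET", "HEAD"):
        # A HEAD can then be answered from a stored GET response.
        return (uri,)
    # Any other method gets its own entry, so a stored POST response
    # never collides with a GET for the same URI.
    return (uri, method)


uri = "http://example.org/form"
print(cache_key("HEAD", uri) == cache_key("GET", uri))   # True
print(cache_key("POST", uri) == cache_key("GET", uri))   # False
```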

>   
>> S 2.6 Caching Negotiated responses
>> ----------------------------------
>> Should I then be referring to Section 4.1 of [part3] to resolve the
>> issues around content negotiation?  If so, maybe a mention in S 2.2
>> would be useful.
>>     
>
> No, see above.
>
>   
>> I also don't understand in para 5 the sentence "If the server responds
>> with 304 (Not Modified) and includes an entity tag or Content-Location
>> that indicates the entity to be used"  How can Content-Location be used
>> to select an entity?  Do you match on previously returned
>> Content-Location headers for requests for the same URI?  Is that what
>> the final para is getting at?  Maybe the wording could be a bit clearer.
>>     
>
> Content-Location is only used by caches for invalidation, and never for any
> other reason (by caches). Basically, when choosing which cache entries to
> invalidate, you must invalidate all the ones with the same Content-Location,
> subject to the "same-origin" restriction explained above. 
>
>   

However, if a shared cache appends a bunch of ETags to an If-None-Match
header and the server then selects from those using Content-Location,
does the shared cache / proxy need to return the resulting selected
entity?  That would be the same as if the origin server had returned
an ETag.

>> S 3.2 Cache-Control
>> -------------------
>> first sentence states that directives MUST be obeyed.   This doesn't
>> fit with a strategy of ignoring unhandled directives if you get a
>> mixture of request and response directives in a message (which is
>> still allowed in the ABNF). I think it should therefore be
>> explicit that it's not valid to mix the directives, else you get
>> a MUST requirement to obey nonsensical directives.
>>     
>
> A directive that looks like a cache-response-directive in a request is
> actually a cache-extension, not a cache-response-directive. Similarly, a
> directive that looks like a cache-request-directive in a response is
> actually a cache-extension, not a cache-request-directive. The grammar is
> just wrong.
>
>   
OK, that makes sense.

>> Otherwise relax the MUST, or relax it to the extent of nonsensical
>> directives.  Also, you can't have a MUST requirement on an extensible
>> mechanism.  Extensions need to be optional.
>>     
>
> "Unrecognized cache directives MUST be ignored." But, caches must recognize
> all directives defined in the HTTP spec.
>
>   
Agreed.
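
To illustrate the reading above (anything that is not a recognized
response directive, including a request directive that turns up in a
response, is treated as an extension and ignored), here is a minimal
sketch.  The function name is hypothetical, the directive list is the
RFC 2616 set of response directives, and the comma split is a
simplification that does not handle commas inside quoted arguments.

```python
KNOWN_RESPONSE_DIRECTIVES = {
    "public", "private", "no-cache", "no-store", "no-transform",
    "must-revalidate", "proxy-revalidate", "max-age", "s-maxage",
}


def parse_response_directives(header_value):
    """Keep only recognized response directives; ignore everything else."""
    recognized = {}
    for item in header_value.split(","):
        name, _, arg = item.strip().partition("=")
        name = name.lower()  # directive names are case-insensitive
        if name in KNOWN_RESPONSE_DIRECTIVES:
            recognized[name] = arg.strip('"') or None
    return recognized


# min-fresh is a request directive, so in a response it is just an
# unrecognized extension, like x-foo.
print(parse_response_directives("max-age=60, min-fresh=5, x-foo"))
# {'max-age': '60'}
```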

>> Some cache control directives are confusingly similar, especially for
>> response directives.
>>     
>
> I agree.
>
>   
>> 1. private directive with headers.
>> 2. no-cache.
>>     
>
> I will reply in a separate message.
>
>   
>> S 3.4 Pragma.
>> -------------
>> The BNF for this mentions extension-pragma.  Were there ever any of
>> these?  Does it make sense to continue to support an extension
>> mechanism on a deprecated header that no-one extended?
>>     
>
> There are undoubtedly extension-pragmas being used which are not defined in
> any standard.
>
>   
OK

>> I've also seen some responses lately that have multiple Cache-Control
>> headers - is this valid?
>>     
>
> Yes. See the rules for repeated header field values in Part 1.
>
>   
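
A minimal sketch of the repeated-field rule referred to above, assuming
headers arrive as an ordered list of (name, value) pairs (the function
name and that representation are hypothetical): repeated fields are
combined, in order, into one comma-separated value.

```python
def combine_field_values(headers, name):
    """Join repeated header fields into one comma-separated value, in order."""
    wanted = name.lower()  # field names are case-insensitive
    values = [v.strip() for n, v in headers if n.lower() == wanted]
    return ", ".join(values)


headers = [
    ("Cache-Control", "no-cache"),
    ("Content-Type", "text/html"),
    ("cache-control", "max-age=0"),
]
print(combine_field_values(headers, "Cache-Control"))   # no-cache, max-age=0
```
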
thanks - your comments are most helpful.

Regards

Adrien


> - Brian
>
>
>
>
>   

-- 
Adrien de Croy - WinGate Proxy Server - http://www.wingate.com
Received on Wednesday, 27 May 2009 03:52:22 GMT
