- From: Koen Holtman <koen@win.tue.nl>
- Date: Sat, 13 Apr 1996 00:23:10 +0200 (MET DST)
- To: fielding@avron.ICS.UCI.EDU (Roy T. Fielding)
- Cc: http-caching@pa.dec.com, mogul@pa.dec.com, jg@w3.org, koen@win.tue.nl
Roy T. Fielding:
>[Koen Holtman:]
>> On Last-Modified: It seems we agreed that the use of last-modified for
>> cache validation should be phased out.
[...]
>No, that isn't what we agreed to

I guess I misinterpreted some part of the discussion then.  As Jim
Gettys also remembers that we did not agree on this, I guess you are
right and I was wrong.

[...]
>If there is no other available information, Last-Modified is sufficient
>and can be assumed to BE sufficient for all caching purposes.  If it
>isn't sufficient, the provider of that information MUST supply something
>more reliable than Last-Modified (which itself is reliable 99.9999% of
>the time).

My point against allowing Last-Modified values (generated by 1.0
servers) to combine ranges is not so much that it is unreliable in
0.0001% of the cases, but that there is nothing in the 1.0 spec that
_disallows_ a 1.0 server from providing resources for which combining
ranges using Last-Modified is 99.9% unreliable.  You can require the
above MUST for 1.1 servers, but not for 1.0 servers.

A 1.0 server can quite legitimately serve, for each subsequent request
on a resource, an HTML document randomly picked from a pool of 1000
HTML documents which all have the same semantic content without being
byte equal, and such a server can legitimately tag all these 1000 HTML
documents with the same Last-Modified date.

This means that we cannot logically claim 1.1 to be downwardly
compatible with 1.0 servers if we allow 1.1 clients to combine ranges
with the same Last-Modified header.  The incompatibility would maybe
not be a practical problem, but it would be there, and its mere
theoretical existence will contradict any claims made in the 1.1
document about compatibility between HTTP versions with the same major
version number.

[...]
>Here is a more straightforward
>syntax that avoids some of the pitfalls of basing a protocol element
>on a conceptual understanding of validity.

I agree to this syntax, though there are a few things I strongly
disagree with below.

>
>    EID        = "EID" ":" entity-id
>    If-EID     = "If-EID" ":" ( "*" | 1#entity-id )
>    Unless-EID = "Unless-EID" ":" ( "*" | 1#entity-id )
>
>    entity-id  = change-indicator [ ";" variant-id ]
>
>    change-indicator = [ "W/" ] token
>    variant-id       = token
>
>The entire entity-id is case-sensitive.  I put the weakness indicator up
>front because caches should not be required to look for it AND have to
>extract it from the middle of the field.
>
>I am not using double-quotes around the change-token (what was being
>called the validator) because it is generally better to avoid quoted
>values when they can be avoided (due to problems of charsets and the
>possibility of embedded quotes and the problems of whitespace lossage
>by gateways to non-HTTP environments).  In any case, I see no advantage
>in giving people extra rope to hang themselves when the thing must be
>a computed function anyway in order to be reliable.

Side remark: I think the chance of CGI authors hanging themselves
writing functions that make tokens is greater than the chance of CGI
authors hanging themselves writing functions that make quoted strings.
When making tokens, you have to know all about which characters cannot
be in tokens because they are in `tspecials'.  I certainly don't know
all tspecials by heart.  How many CGI authors will not bother to look
it up?

>
>Examples:
>
>    EID: W/lkjsdrhfjh;5
>    EID: afgef5647fed;iso-8859-7
>
>    Unless-ID: W/lkjsdrhfjh;5, afgef5647fed;iso-8859-7
>
>This syntax is combined with the following semantics:
>
>   An entity-id SHOULD be supplied by the origin server for any entity
>   which is cachable; however, its absence does not imply that the
>   entity cannot be cached -- it only implies that the origin server is
>   incapable or unwilling to provide this enhanced functionality.
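
To make the grammar and the tspecials concern concrete, here is a
minimal sketch of a parser for the proposed entity-id syntax.  It
assumes the usual HTTP definition of token (any US-ASCII character
except control characters and tspecials); the names is_token, EntityId,
parse_entity_id and parse_eid_list are hypothetical and appear in no
draft.

```python
from typing import List, NamedTuple, Optional, Union

# tspecials as defined for HTTP tokens, including SP and HT.
TSPECIALS = set('()<>@,;:\\"/[]?={} \t')


def is_token(s: str) -> bool:
    """True if s is a non-empty run of CHARs that are neither CTLs nor tspecials."""
    return bool(s) and all(32 < ord(c) < 127 and c not in TSPECIALS for c in s)


class EntityId(NamedTuple):
    weak: bool                  # True when the "W/" prefix is present
    change_indicator: str       # the token after the optional "W/"
    variant_id: Optional[str]   # None when no ";" variant-id is given


def parse_entity_id(text: str) -> EntityId:
    """Parse: entity-id = [ "W/" ] token [ ";" token ]."""
    indicator, sep, variant = text.strip().partition(";")
    # Whitespace between the components is not significant.
    indicator, variant = indicator.strip(), variant.strip()
    weak = indicator.startswith("W/")
    if weak:
        indicator = indicator[2:]
    if not is_token(indicator):
        raise ValueError("change-indicator is not a token: %r" % indicator)
    if sep and not is_token(variant):
        raise ValueError("variant-id is not a token: %r" % variant)
    return EntityId(weak, indicator, variant if sep else None)


def parse_eid_list(value: str) -> Union[str, List[EntityId]]:
    """Parse an If-EID / Unless-EID field value: "*" or 1#entity-id."""
    if value.strip() == "*":
        return "*"
    parts = [p for p in value.split(",") if p.strip()]
    if not parts:
        raise ValueError("at least one entity-id is required")
    return [parse_entity_id(p) for p in parts]
```

Because "/" is itself a tspecial, a change-indicator token can never
contain "W/", so the weakness prefix cannot be confused with the token
that follows it.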
>
>   If the Request-URI corresponds to two or more variant representations,
>   then a variant-id MUST be included in the entity-id to distinguish
>   between those representations.

I disagree with this MUST: the alternates in transparent content
negotiation do not have variant-ids in draft-holtman, they have
alternate URIs.  The above MUST does not allow transparent content
negotiation according to draft-holtman on top of 1.1, and we have
consensus that this must be allowed.

The above text should read:

   If the Request-URI identifies a varying resource *which uses the
   Vary header* to indicate variance, then a variant-id MUST be
   included in the entity-id to distinguish between different variant
   entities.

The 1.1 spec can, but does not need to, add

   If the Request-URI identifies a varying resource *which uses the
   Alternates header* to indicate variance, then a variant-id should
   not be included, but the change-indicators by themselves SHOULD be
   different for all different variant entities.

I strongly object to a requirement that varying resources which use
the Alternates header to indicate variance MUST include variant-ids.
Such a requirement would be easy to satisfy for preemptive negotiation,
but very painful for reactive negotiation, where a second request is
done on the actual URI of the alternate resource (which may even live
on a different server).  If the second response has to include a
variant-id, then the alternate resource must `know' that it is in fact
being used as an alternate resource by some transparently negotiated
resource.  This requirement to know would cause immense, and completely
unnecessary, logistics problems for the authors of transparently
negotiated resources.

>   The combination of
>         Request-URI + variant-id
>   must uniquely identify each variant representation of that resource.

No, that is

   The combination of
         Request-URI + variant-id (if present)
                     + Content-Location header (if present)
   must uniquely identify each variant representation of that resource.

>   A cache may use the variant-id to distinguish between cached variant
>   representations of the Request-URI if the EID header field is present
>   in the cached entities.  A cache may also use Content-Location for that
>   purpose. [assuming we get it defined in time.]

We need to define Content-Location in time, because my text about
cache replacement for varying resources uses it.

>   The change-indicator is used to indicate changes to the content of
>   the resource uniquely identified by the Request-URI and variant-id.
>   The change-indicator value SHOULD change when the content of an entity
>   changes and SHOULD NOT change when the content remains the same.
>   When the value changes, it MUST change to a value not already used for
>   that entity within a timeframe for which there may still exist
>   legitimately cached entities with the same change-indicator value.

As Jeff has pointed out in his more theoretical caching models, one
can legitimately store in cache memory a stale response forever.  So
your requirement above is better expressed as:

   When the value changes, it MUST change to a value not already used.

>   A change-indicator is called "strong" if the origin server guarantees
>   that the value MUST change when the entity's content changes.  The
>   origin server MUST prefix the change-indicator with "W/" if it is
>   not generated by a strong function (i.e., is known to be "weak").
>
>   Origin servers SHOULD use a strong generator function if any is
>   available for that entity.
>
>   Note: The "entity's content" refers to both the Entity-Body and all
>   Entity-Header fields except Expires and Transfer-Encoding [the latter
>   may be better described as a General-Header field anyway].
>
>   Two entity-id's can be compared for equality by byte-comparison,
>   excluding whitespace between the components.
>
>   An entity-id may be used as a precondition for the partial GET method
>   using the If-EID or Unless-EID header fields.
>   If the change-indicator
>   is strong, the partial GET request may be completed by any cache with
>   a cached entity having the same entity-id, unless a cache-control
>   directive indicates otherwise.  If the change-indicator is weak, the
>   partial GET request MUST NOT be completed by a public cache.

We can actually require something weaker here.  We need only require
that:

   If two or more partial responses are merged by a client (proxy or
   user agent or user agent helper application) into a complete
   response, or bigger partial response, these partial responses MUST
   all have the same strong change-indicator.

This would make it OK for partial GETs to be completed by caches if
the change-indicator is weak, which is, I believe, something you
wanted.

>As far as I can tell, there is no reason that a private cache should
>be prevented from using a weak validator comparison, even for byte range
>insertion.

As far as I can tell, the privacy requirements of a response are
orthogonal to the server's ability to supply a strong validator.  So a
private cache should also be prevented from using a weak validator
comparison for combining ranges.

>There is one addition to If-EID and Unless-EID to specifically handle
>(without an interpretation hack) the cases of "any" and "none", which
>allows the prevention of overwriting existing resources on a PUT.
>
>    Unless-EID: *
>
>means "unless any entity-id already exists for this Request-URI"
>      (a.k.a., if no entity already exists), and
>
>    If-EID: *
>
>means "if any entity-id already exists for this Request-URI"
>      (a.k.a., unless no entity already exists).
>

This addition would be OK with me.

>I think that represents enough compromise from me for one week and I am
>not in the mood for playing any more name games.

Note that I carefully avoided playing name games above :)

>   If we can't settle
>this within the next 24 hours then I think 1.1 should go forward without
>any validators at all.

I do not agree to removing validators if we can't settle all issues
connected to them in 24 hours.  Removing validators would make
accessing varying resources way too expensive.  We have consensus that
the 1.1 Vary header should be good enough to support multi-lingual
servers.  Removing If-Invalid would make use of Vary so expensive that
it is hardly usable.

> ...Roy T. Fielding

Koen.
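
The merge rule for partial responses proposed above can be sketched as
a small check.  This is only an illustration under stated assumptions:
a missing EID header is treated as "do not merge", the whole entity-id
(change-indicator plus any variant-id) is compared rather than the
change-indicator alone, and the names normalize_eid and
may_merge_ranges are hypothetical.

```python
from typing import Iterable, Optional


def normalize_eid(eid: str) -> str:
    """Drop whitespace between the components so byte comparison is fair."""
    return ";".join(part.strip() for part in eid.split(";"))


def may_merge_ranges(eids: Iterable[Optional[str]]) -> bool:
    """Decide whether partial responses with these EID values may be combined.

    Each item is the EID field value of one partial response, or None if
    the response carried no EID header (assumption: then do not merge).
    """
    values = list(eids)
    if any(v is None for v in values):
        return False                      # no validator at all
    normalized = [normalize_eid(v) for v in values]
    if any(v.startswith("W/") for v in normalized):
        return False                      # weak indicators never allow merging
    # All strong: merging is allowed only if every entity-id is byte-equal.
    return len(set(normalized)) == 1
```

With this check, a cache may still answer a partial GET using a weak
change-indicator; only the act of combining ranges demands strong,
identical values.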
Received on Friday, 12 April 1996 22:54:23 UTC