- From: Roy T. Fielding <fielding@avron.ICS.UCI.EDU>
- Date: Fri, 12 Apr 1996 06:58:26 -0700
- To: http-caching@pa.dec.com
- Cc: Jeffrey Mogul <mogul@pa.dec.com>, jg@w3.org, Koen Holtman <koen@win.tue.nl>
> On Last-Modified: It seems we agreed that the use of last-modified for
> cache validation should be phased out. Most people would not like
> proxies to rely on the Last-Modified value when combining ranges,
> because of the 1.0 servers around. There did not seem to be much
> support for Roy's idea to require that 1.1 servers make the
> last-modified date a `strong' validator that is guaranteed to be
> different for different entities, even if the entity bound to the
> resource is updated twice in one second.

No, that isn't what we agreed to [it sure as hell isn't what I agreed
to, and there is no way I'd let that be said in the spec without a
fight]. And the latter sentence is a mutation -- origin servers are
already capable of defining what it means to be "modified" and when
Last-Modified is changed. If there is no other available information,
Last-Modified is sufficient and can be assumed to BE sufficient for all
caching purposes. If it isn't sufficient, the provider of that
information MUST supply something more reliable than Last-Modified
(which itself is reliable 99.9999% of the time). You can make it more
reliable by not caching entities that have a Date within one second of
the Last-Modified, but that is an optimization which should be placed
in the caching heuristics section.

The only thing I agreed to was that weak validators may exist WITHIN
the new syntax of VID (I still hate CVal). This must have no impact on
the interpretation of Last-Modified given the lack of any additional
information. I also don't agree about including the syntax and then
saying that servers must not use it -- that is a waste of time. If the
syntax is there, we must also provide sufficient description of how and
why it is there -- otherwise, implementors will use it for the wrong
reasons.

The problem with the name is not with "weak" (they are indeed weak);
it would make a great deal more sense to people if we'd just stop
referring to them as "validators" (they aren't).
As I mentioned, I refer to them as entity identifiers because that is
what they do -- identify entities of the Request-URI. Here is a more
straightforward syntax that avoids some of the pitfalls of basing a
protocol element on a conceptual understanding of validity.

     EID              = "EID" ":" entity-id
     If-EID           = "If-EID" ":" ( "*" | 1#entity-id )
     Unless-EID       = "Unless-EID" ":" ( "*" | 1#entity-id )

     entity-id        = change-indicator [ ";" variant-id ]
     change-indicator = [ "W/" ] token
     variant-id       = token

The entire entity-id is case-sensitive. I put the weakness indicator up
front because caches should not be required to look for it AND have to
extract it from the middle of the field. I am not using double-quotes
around the change-indicator (what was being called the validator)
because it is generally better to avoid quoted values when they can be
avoided (due to problems of charsets and the possibility of embedded
quotes and the problems of whitespace lossage by gateways to non-HTTP
environments). In any case, I see no advantage in giving people extra
rope to hang themselves when the thing must be a computed function
anyway in order to be reliable.

Examples:

     EID: W/lkjsdrhfjh;5
     EID: afgef5647fed;iso-8859-7
     Unless-EID: W/lkjsdrhfjh;5, afgef5647fed;iso-8859-7

This syntax is combined with the following semantics:

An entity-id SHOULD be supplied by the origin server for any entity
which is cachable; however, its absence does not imply that the entity
cannot be cached -- it only implies that the origin server is incapable
or unwilling to provide this enhanced functionality.

If the Request-URI corresponds to two or more variant representations,
then a variant-id MUST be included in the entity-id to distinguish
between those representations. The combination of Request-URI +
variant-id must uniquely identify each variant representation of that
resource.
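
To make the grammar above concrete, here is a minimal parsing sketch in
Python. It is an illustration only, not part of the proposal: the names
EntityId, parse_entity_id and parse_condition are hypothetical, the
token pattern assumes the usual HTTP/1.x definition of "token" (no
control characters, spaces, or tspecials), and tolerating spaces or
tabs around the ";" is an assumption.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Union

# Assumed HTTP/1.x "token": US-ASCII characters other than controls,
# space, and the tspecials ( ) < > @ , ; : \ " / [ ] ? = { }
_TOKEN = r"[!#$%&'*+\-.0-9A-Z^_`a-z|~]+"

# entity-id = [ "W/" ] token [ ";" token ]
# Optional spaces or tabs around ";" are an assumption of this sketch.
_ENTITY_ID = re.compile(rf"(W/)?({_TOKEN})(?:[ \t]*;[ \t]*({_TOKEN}))?")


@dataclass(frozen=True)
class EntityId:
    weak: bool                      # True when the "W/" prefix is present
    indicator: str                  # change-indicator without the prefix
    variant: Optional[str] = None   # variant-id, if one was supplied


def parse_entity_id(text: str) -> EntityId:
    """Parse a single entity-id such as 'W/lkjsdrhfjh;5'."""
    match = _ENTITY_ID.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a valid entity-id: {text!r}")
    weak, indicator, variant = match.groups()
    return EntityId(weak is not None, indicator, variant)


def parse_condition(value: str) -> Union[str, List[EntityId]]:
    """Parse an If-EID / Unless-EID field value: '*' or a comma list."""
    value = value.strip()
    if value == "*":
        return "*"
    # Tokens cannot contain "," so a plain split is safe; empty list
    # elements are skipped.
    items = [part for part in value.split(",") if part.strip()]
    if not items:
        raise ValueError("at least one entity-id is required")
    return [parse_entity_id(part) for part in items]
```

Because "/" is not a token character, a leading "W/" can only be the
weakness indicator, which is what lets a cache test for it by looking
at the first two bytes.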
A cache may use the variant-id to distinguish between cached variant
representations of the Request-URI if the EID header field is present
in the cached entities. A cache may also use Content-Location for that
purpose. [assuming we get it defined in time.]

The change-indicator is used to indicate changes to the content of the
resource uniquely identified by the Request-URI and variant-id. The
change-indicator value SHOULD change when the content of an entity
changes and SHOULD NOT change when the content remains the same. When
the value changes, it MUST change to a value not already used for that
entity within a timeframe for which there may still exist legitimately
cached entities with the same change-indicator value.

A change-indicator is called "strong" if the origin server guarantees
that the value MUST change when the entity's content changes. The
origin server MUST prefix the change-indicator with "W/" if it is not
generated by a strong function (i.e., is known to be "weak"). Origin
servers SHOULD use a strong generator function if any is available for
that entity.

   Note: The "entity's content" refers to both the Entity-Body and
   all Entity-Header fields except Expires and Transfer-Encoding
   [the latter may be better described as a General-Header field
   anyway].

Two entity-id's can be compared for equality by byte-comparison,
excluding whitespace between the components.

An entity-id may be used as a precondition for the partial GET method
using the If-EID or Unless-EID header fields. If the change-indicator
is strong, the partial GET request may be completed by any cache with a
cached entity having the same entity-id, unless a cache-control
directive indicates otherwise. If the change-indicator is weak, the
partial GET request MUST NOT be completed by a public cache.

As far as I can tell, there is no reason that a private cache should be
prevented from using a weak validator comparison, even for byte range
insertion.
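
The comparison and partial GET rules above can be sketched as follows.
This is a hedged illustration under stated assumptions, not a
definitive implementation: the function names are hypothetical, the
"unless a cache-control directive indicates otherwise" clause is not
modeled, and allowing a private cache to use a weak match reflects the
opinion in the preceding paragraph rather than a stated requirement.

```python
def normalize(entity_id: str) -> bytes:
    """Drop whitespace between components, then compare as raw bytes.

    Tokens cannot contain whitespace, so for a valid entity-id removing
    all whitespace is the same as removing it between components.
    """
    return "".join(entity_id.split()).encode("ascii")


def same_entity(a: str, b: str) -> bool:
    """Case-sensitive byte comparison of two entity-ids."""
    return normalize(a) == normalize(b)


def is_weak(entity_id: str) -> bool:
    """A change-indicator is weak when it carries the 'W/' prefix."""
    return normalize(entity_id).startswith(b"W/")


def cache_may_complete_partial_get(
    cached: str, requested: str, *, public_cache: bool
) -> bool:
    """Decide whether a cache may answer a partial GET on its own.

    Cache-control directives are deliberately ignored in this sketch.
    """
    if not same_entity(cached, requested):
        return False
    if is_weak(cached) and public_cache:
        # Weak change-indicator: a public cache MUST NOT complete it.
        return False
    return True
```

Note that "W/abc;1" and "abc;1" compare as different entity-ids here,
since the weakness prefix is part of the bytes being compared.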
There is one addition to If-EID and Unless-EID to specifically handle
(without an interpretation hack) the cases of "any" and "none", which
allows the prevention of overwriting existing resources on a PUT.

     Unless-EID: *

means "unless any entity-id already exists for this Request-URI"
(a.k.a., if no entity already exists), and

     If-EID: *

means "if any entity-id already exists for this Request-URI"
(a.k.a., unless no entity already exists).

I think that represents enough compromise from me for one week and I am
not in the mood for playing any more name games. If we can't settle
this within the next 24 hours then I think 1.1 should go forward
without any validators at all.

 ...Roy T. Fielding
    Department of Information & Computer Science    (fielding@ics.uci.edu)
    University of California, Irvine, CA 92717-3425    fax:+1(714)824-4056
    http://www.ics.uci.edu/~fielding/
Received on Friday, 12 April 1996 14:38:52 UTC