- From: Jeffrey Mogul <mogul@pa.dec.com>
- Date: Thu, 18 Apr 96 18:18:54 MDT
- To: http-caching@pa.dec.com
This was one of the few remaining sections of the draft that I had
not been able to write. After the discussions of the past few days
(online and on the telephone), and a lot of puzzling, I think I've
been able to write a fairly concise set of rules for constructing
and using cache keys.

Comments are welcome, but due to various schedule constraints,
something very close to this is going to be in Monday's draft
from Jim.

-Jeff

13.12 SLUSHY: Cache keys

A ``cache key'' is a value used to identify a cache entry. HTTP
caches construct keys in three different ways:

   - Some subset of the fields stored with a cache entry constitute
     the ``entry key'' for that entry. These may include the
     Request-URI, some request-header fields, and some
     response-header fields.

   - Some subset of the fields of a response, together with perhaps
     the Request-URI, constitute the ``replacement key'' of a
     response.

   - Some subset of the fields of a request, together with the
     Request-URI, constitute the ``lookup key'' of a request.

When a cache receives a request, it builds a lookup key from that
request, then tries to find (look up) a cache entry with a matching
entry key. If such a match exists, then the cache can decide (based
on the other caching rules) whether to return that entry in reply
to the request.

When a cache receives a response, it builds a replacement key from
that response, and from the request that elicited it. It uses this
key to find any previously stored entry with a matching entry key.
If such an entry exists, the cache replaces the old entry with the
new one. The term ``replacement'' means to remove the old entry
from the cache, and then to insert the new entry. It does not imply
a modification of an existing entry.

This section describes specifically how the three kinds of keys are
constructed, and how a cache determines if keys match.

13.12.1 SLUSHY: Key-matching procedure

We express replacement keys as a tuple (URI, variant-ID), in which
the variant-ID may be null.

We express lookup keys as a tuple (URI, variant-ID,
all-request-headers), in which the variant-ID may be null. The
all-request-headers element of the tuple is not always used, but is
included here as a notational convenience.

We express entry keys as the tuple (URI, variant-ID,
sel-hdr-values), in which the variant-ID may be null, and the
sel-hdr-values may either be null, or may be a set of request
headers.

A replacement key matches an entry key if both their URI elements
match and their variant-ID elements match. (A null variant-ID does
not match a non-null variant-ID.)

A lookup key matches an entry key if both their URI elements match
and their variant-ID elements match, and either

   - the sel-hdr-values element of the entry key is null

or

   - the sel-hdr-values element of the entry key matches the
     appropriate headers in the all-request-headers element of the
     lookup key, according to the matching rules in section 10.v.

This description of the matching algorithm is clearly not the most
efficient implementation. A cache may use any algorithm that yields
equivalent results. For example, it may use a hierarchical approach
where cache entries are grouped into sets by the URI and
variant-ID, and only if a set includes non-null sel-hdr-values
elements does the cache need to consider the other request headers.

If on a cache lookup there are two or more entries that appear to
match the request, then the one with the most recent Date value
MUST be used.

13.12.2 FROZEN: Non-varying resources

When a response is received for a non-varying resource (that is,
the response includes no Vary, Alternates, or Content-Location
headers), the replacement key for the response is simply the
Request-URI of the request that elicited it: (Request-URI, null).
The entry key for the response is (Request-URI, null, null).
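
For illustration only, the matching rules of 13.12.1 can be
sketched in Python as below. This is not draft text. The class and
function names are hypothetical, URIs are compared with plain string
equality (the refinements of 13.12.4 are left out), and exact
equality of header values stands in for the section 10.v matching
rules, which this message does not spell out. The opaque-validator
exception described in 13.12.3 is also not modelled.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class ReplacementKey:
    uri: str
    variant_id: Optional[str] = None  # None plays the role of "null"


@dataclass
class LookupKey:
    uri: str
    variant_id: Optional[str]
    all_request_headers: Dict[str, str]  # field-name -> field-value


@dataclass
class EntryKey:
    uri: str
    variant_id: Optional[str] = None
    sel_hdr_values: Optional[Dict[str, str]] = None


@dataclass
class CacheEntry:
    key: EntryKey
    date: datetime  # Date value of the stored response


def replacement_matches(rk: ReplacementKey, ek: EntryKey) -> bool:
    # Plain equality: a null variant-ID matches only a null variant-ID.
    return rk.uri == ek.uri and rk.variant_id == ek.variant_id


def lookup_matches(lk: LookupKey, ek: EntryKey) -> bool:
    if lk.uri != ek.uri or lk.variant_id != ek.variant_id:
        return False
    if ek.sel_hdr_values is None:
        return True
    # ASSUMPTION: exact value equality stands in for section 10.v.
    return all(
        lk.all_request_headers.get(name) == value
        for name, value in ek.sel_hdr_values.items()
    )


def select_entry(lk: LookupKey,
                 entries: List[CacheEntry]) -> Optional[CacheEntry]:
    # Of all matching entries, the most recent Date value wins.
    matches = [e for e in entries if lookup_matches(lk, e.key)]
    return max(matches, key=lambda e: e.date, default=None)
```

A linear scan over all entries is used here only for clarity; the
hierarchical grouping by (URI, variant-ID) mentioned above would
give the same results.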

13.12.3 SLUSHY: Varying responses

If a response includes a Vary header, then we use the notation
``sel-hdr-values'' to denote the canonical form of the headers in
the corresponding request whose field-names are given in the Vary
header. If the response does not include a Vary header, then
sel-hdr-values is assigned the null value. Section 10.v defines the
canonical form for selecting headers.

    |Actually, 10.v doesn't explicitly define this canonical form.|
    |I propose that we define it as                               |
    |                                                             |
    |    A set whose elements are sequences of request headers    |
    |    with identical field-names. For a given field-name, the  |
    |    corresponding element is the concatenation of the        |
    |    request headers with that field-name, in exactly the     |
    |    order that these fields appear in the request.           |
    |                                                             |

When a response is received that includes a variant-ID in a CVal
header (see section 10.102), but no Content-Location header, then
the replacement key is (Request-URI, variant-ID), and the entry key
for the response is (Request-URI, variant-ID, sel-hdr-values).

When a response is received that includes a Vary header and an
opaque validator, but no variant-ID or Content-Location header,
then the replacement key is (Request-URI, opaque-validator), and
the entry key for the response is (Request-URI, opaque-validator,
sel-hdr-values). This rule supports the ``selecting opaque
validators'' mechanism described in section 13.8.4. The cache
should distinguish between actual variant-IDs and opaque-validators
in the variant-ID element of the entry key; a non-null
opaque-validator in an entry key DOES match a null variant-ID in a
lookup key.

When a response is received that includes both a variant-ID in a
CVal header, and a Content-Location header, then the replacement
key is (content-location-URI, variant-ID), and the entry key for
the response is (content-location-URI, variant-ID, sel-hdr-values).

When a response is received that includes a Content-Location header
but no variant-ID, then the replacement key is
(content-location-URI, null), and the entry key for the response is
(content-location-URI, null, sel-hdr-values).

    |Question: should we insist that Vary: must be present if the |
    |variant-ID scheme is used?                                   |
    |Question: this description follows Koen's model, especially  |
    |that it does NOT allow the cache to be clever about          |
    |replacements and lookups if a response includes a Vary header|
    |but no variant-ID. For example, it seems conceptually        |
    |feasible for a cache in this case to match all of the        |
    |request-header fields specified by the Vary header between   |
    |the ones stored with an existing cache entry and the ones in |
    |the current request, but as far as I can tell, Koen wants to |
    |prohibit this.                                               |

13.12.4 SLUSHY: Canonicalization of URIs

When comparing two URIs to decide whether they match, a cache MUST
use a case-sensitive octet-by-octet comparison of the entire URIs,
with these exceptions:

   - Following the rules from section 3.2.2:

        * A port that is empty or not given is equivalent to
          port 80.

        * Comparisons of host names MUST be case-insensitive.

        * Comparisons of scheme names MUST be case-insensitive.

        * An empty abs_path is equivalent to an abs_path of "/".

   - Characters except those in the reserved set and the unsafe
     set (see section 3.2) are equivalent to their
     ``"%" HEX HEX'' encodings.

For example, the following three URIs are equivalent:

    http://abc.com:80/~smith/home.html
    http://ABC.com/%7Esmith/home.html
    http://ABC.com:/%7esmith/home.html
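
As a rough sketch (not draft text), the comparison rules of 13.12.4
could be implemented by reducing each URI to a canonical tuple and
comparing the tuples. The function names are hypothetical. The
RESERVED and UNSAFE sets below are an assumption about what section
3.2 defines and should be checked against the draft itself. The
sketch handles only URIs of the form scheme://host[:port][abs_path];
anything else falls back to exact comparison.

```python
import re

# ASSUMPTION: contents of the reserved and unsafe sets of section 3.2.
RESERVED = set(";/?:@&=+")
UNSAFE = {chr(c) for c in range(0x20)} | {chr(0x7F), " ", '"', "#", "%", "<", ">"}

_URI = re.compile(r"([A-Za-z][A-Za-z0-9+.-]*)://([^/:?#]*)(?::(\d*))?(.*)")
_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def _decode_equivalent(text):
    """Replace "%" HEX HEX by the character it encodes, unless that
    character is reserved or unsafe (those encodings stay as written)."""
    def repl(match):
        ch = chr(int(match.group(1), 16))
        return match.group(0) if ch in RESERVED or ch in UNSAFE else ch
    return _PCT.sub(repl, text)


def canonical_uri(uri):
    m = _URI.fullmatch(uri)
    if m is None:
        return (uri,)  # not in the expected form: compare octet-by-octet
    scheme, host, port, rest = m.groups()
    return (
        scheme.lower(),                    # scheme is case-insensitive
        _decode_equivalent(host).lower(),  # host is case-insensitive
        int(port) if port else 80,         # empty or missing port == 80
        _decode_equivalent(rest) or "/",   # empty abs_path == "/"
    )


def uris_match(a, b):
    return canonical_uri(a) == canonical_uri(b)
```

With these definitions, uris_match() treats the three example URIs
above as equivalent, while a path such as /a%2Fb is kept distinct
from /a/b because "/" is in the reserved set.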
Received on Friday, 19 April 1996 01:35:07 UTC