- From: Jeffrey Mogul <mogul@pa.dec.com>
- Date: Thu, 18 Apr 96 18:18:54 MDT
- To: http-caching@pa.dec.com
This was one of the few remaining sections of the draft that I had
not been able to write. After the discussions of the past few days
(online and on the telephone), and a lot of puzzling, I think I've
been able to write a fairly concise set of rules for constructing
and using cache keys. Comments are welcome, but due to various
schedule constraints, something very close to this is going to be
in Monday's draft from Jim.
-Jeff
13.12 SLUSHY: Cache keys
A ``cache key'' is a value used to identify a cache entry. HTTP
caches construct keys in three different ways:
- Some subset of the fields stored with a cache entry
constitute the ``entry key'' for that entry. These may
include the Request-URI, some request-header fields, and
some response-header fields.
- Some subset of the fields of a response, together with
perhaps the Request-URI, constitute the ``replacement key''
of a response.
- Some subset of the fields of a request, together with the
Request-URI, constitute the ``lookup key'' of a request.
When a cache receives a request, it builds a lookup key from that
request, then tries to find (lookup) a cache entry with a matching
entry key. If such a match exists, then the cache can decide (based
on the other caching rules) whether to return that entry in reply to
the request.
When a cache receives a response, it builds a replacement key from
that response, and from the request that elicited it. It uses this
key to find any previously stored entry with a matching entry key.
If such an entry exists, the cache replaces the old entry with the
new one.
The term ``replacement'' means to remove the old entry from the
cache, and then to insert the new entry. It does not imply a
modification of an existing entry.
This section describes specifically how the three kinds of keys are
constructed, and how a cache determines if keys match.
13.12.1 SLUSHY: Key-matching procedure
We express replacement keys as a tuple (URI, variant-ID), in which
the variant-ID may be null. We express lookup keys as a tuple (URI,
variant-ID, all-request-headers), in which the variant-ID may be
null. The all-request-headers element of the tuple is not always
used, but is included here as a notational convenience. We express
entry keys as the tuple (URI, variant-ID, sel-hdr-values), in which
the variant-ID may be null, and the sel-hdr-values may either be
null, or may be a set of request headers.
A replacement key matches an entry key if both their URI elements
match and their variant-ID elements match. (A null variant-ID does
not match a non-null variant-ID.)
A lookup key matches an entry key if both their URI elements match
and their variant-ID elements match, and either
- the sel-hdr-values element of the entry key is null
or
- the sel-hdr-values element of the entry key matches the
appropriate headers in the all-request-headers element of
the request key, according to the matching rules in section
10.v.
This description matching algorithm is clearly not the most efficient
implementation of an equivalent algorithm. A cache may use any
algorithm that yields equivalent results. For example, it may use a
hierarchical approach where cache entries are grouped into sets by
the URI and variant-ID, and only if a set includes non-null
sel-hdr-values elements does the cache need to consider the other
request headers.
If on a cache lookup there are two or more entries that appear to
match the request, then the one with the most recent Date value MUST
be used.
13.12.2 FROZEN: Non-varying resources
When a response is received for a non-varying resource (that is, the
response includes no Vary, Alternates, or Content-Location headers),
the replacement key for the response is simply the Request-URI of the
request that elicited it: (Request-URI, null). The entry key for the
response is (Request-URI, null, null).
13.12.3 SLUSHY: Varying responses
If a response includes a Vary header, then we use the notation
``sel-hdr-values'' to denote the canonical form of the headers in the
corresponding request whose field-names are given in the Vary header.
If the response does not include a Vary header, then sel-hdr-values
is assigned the null value. Section 10.v defines the canonical form
for selecting headers.
|Actually, 10.v doesn't explicitly define this canonical form.|
|I propose that we define it as |
| |
| A set whose elements are sequences of request headers |
| with identical field-names. For a given field-name, the |
| corresponding element is the concatenation of the |
| request headers with that field-name, in exactly the |
| order that these fields appear in the request. |
| |
When a response is received that includes a variant-ID in a CVal
header (see section 10.102), but no Content-Location header, then the
replacement key is (Request-URI, variant-ID), and the entry key for
the response is (Request-URI, variant-ID, sel-hdr-values).
When a response is received that includes a Vary header and an opaque
validator, but no variant-ID or Content-Location header, then the
replacement key is (Request-URI, opaque-validator), and the entry key
for the response is (Request-URI, opaque-validator, sel-hdr-values).
This rule supports the ``selecting opaque validators''
mechanism described in section 13.8.4. The cache should
distinguish between actual variant-IDs and opaque-validators in
the variant-ID element of the entry key; a non-null
opaque-validator in an entry key DOES match a null variant-ID
in a lookup key.
When a response is received that includes both a variant-ID in a CVal
header, and a Content-Location header, then the replacement key is
(content-location-URI, variant-ID), and the entry key for the
response is (content-location-URI, variant-ID, sel-hdr-values).
When a response is received that includes a Content-Location header
but no variant-ID, then the replacement key is (content-location-URI,
null), and the entry key for the response is (content-location-URI,
null, sel-hdr-values).
|Question: should we insist that Vary: must be present if the |
|variant-ID scheme is used? |
|Question: this description follows Koen's model, especially |
|that it does NOT allow the cache to be clever about |
|replacements and lookups if a response includes a Vary header|
|but no variant-ID. For example, it seems conceptually |
|feasible for a cache in this case to match all of the |
|request-header fields specified by the Vary header between |
|the ones stored with an existing cache entry and the ones in |
|the current request, but as far as I can tell, Koen wants to |
|prohibit this. |
13.12.4 SLUSHY: Canonicalization of URIs
A cache, when comparing two URIs to decide if they match or not, a
cache MUST use a case-sensitive octet-by-octet comparison of the
entire URIs, with these exceptions:
- Following the rules from section 3.2.2:
* A port that is empty or not given is equivalent to
port 80.
* Comparisons of host names MUST be case-insensitive.
* Comparisons of scheme names MUST be case-insensitive.
* An empty abs_path is equivalent to an abs_path of "/"
- Characters except those in the reserved set and the unsafe
set (see section 3.2) are equivalent to their ``"%" HEX
HEX'' encodings.
For example, the following three URIs are equivalent:
http://abc.com:80/~smith/home.html
http://ABC.com/%7Esmith/home.html
http://ABC.com:/%7esmith/home.html
Received on Friday, 19 April 1996 01:35:07 UTC