Cache keys

This was one of the few remaining sections of the draft that I had
not been able to write.  After the discussions of the past few days
(online and on the telephone), and a lot of puzzling, I think I've
been able to write a fairly concise set of rules for constructing
and using cache keys.  Comments are welcome, but due to various
schedule constraints, something very close to this is going to be
in Monday's draft from Jim.

-Jeff

13.12 SLUSHY: Cache keys
   A ``cache key'' is a value used to identify a cache entry.  HTTP
   caches construct keys in three different ways:

      - Some subset of the fields stored with a cache entry
        constitute the ``entry key'' for that entry.  These may
        include the Request-URI, some request-header fields, and
        some response-header fields.

      - Some subset of the fields of a response, together with
        perhaps the Request-URI, constitute the ``replacement key''
        of a response.

      - Some subset of the fields of a request, together with the
        Request-URI, constitute the ``lookup key'' of a request.

   When a cache receives a request, it builds a lookup key from that
   request, then tries to find (look up) a cache entry with a matching
   entry key.  If such a match exists, then the cache can decide (based
   on the other caching rules) whether to return that entry in reply to
   the request.

   When a cache receives a response, it builds a replacement key from
   that response, and from the request that elicited it.  It uses this
   key to find any previously stored entry with a matching entry key.
   If such an entry exists, the cache replaces the old entry with the
   new one.

      The term ``replacement'' means to remove the old entry from the
      cache, and then to insert the new entry.  It does not imply a
      modification of an existing entry.

   This section describes specifically how the three kinds of keys are
   constructed, and how a cache determines if keys match.

13.12.1 SLUSHY: Key-matching procedure
   We express replacement keys as a tuple (URI, variant-ID), in which
   the variant-ID may be null.  We express lookup keys as a tuple (URI,
   variant-ID, all-request-headers), in which the variant-ID may be
   null.  The all-request-headers element of the tuple is not always
   used, but is included here as a notational convenience.  We express
   entry keys as the tuple (URI, variant-ID, sel-hdr-values), in which
   the variant-ID may be null, and the sel-hdr-values may either be
   null, or may be a set of request headers.

   A replacement key matches an entry key if both their URI elements
   match and their variant-ID elements match.  (A null variant-ID does
   not match a non-null variant-ID.)

   A lookup key matches an entry key if both their URI elements match
   and their variant-ID elements match, and either

      - the sel-hdr-values element of the entry key is null

   or

      - the sel-hdr-values element of the entry key matches the
        appropriate headers in the all-request-headers element of
        the lookup key, according to the matching rules in section
        10.v.
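
   The two matching rules above can be sketched in Python as follows.
   This is only an illustration, not part of the proposed text.  The
   class and function names are hypothetical, plain string equality
   stands in for the URI comparison of section 13.12.4, and exact
   equality of header values stands in for the matching rules of
   section 10.v, which are not reproduced here.  The opaque-validator
   exception of section 13.12.3 is not modeled.

```python
from typing import Mapping, NamedTuple, Optional


class ReplacementKey(NamedTuple):
    uri: str
    variant_id: Optional[str]          # None models a null variant-ID


class LookupKey(NamedTuple):
    uri: str
    variant_id: Optional[str]
    all_request_headers: Mapping[str, str]


class EntryKey(NamedTuple):
    uri: str
    variant_id: Optional[str]
    sel_hdr_values: Optional[Mapping[str, str]]   # None models null


def replacement_matches(rk: ReplacementKey, ek: EntryKey) -> bool:
    # URI elements and variant-ID elements must both match; None only
    # equals None, so a null variant-ID never matches a non-null one.
    return rk.uri == ek.uri and rk.variant_id == ek.variant_id


def lookup_matches(lk: LookupKey, ek: EntryKey) -> bool:
    if lk.uri != ek.uri or lk.variant_id != ek.variant_id:
        return False
    if ek.sel_hdr_values is None:
        return True
    # Assumption: exact value equality as a placeholder for section 10.v.
    return all(
        lk.all_request_headers.get(name) == value
        for name, value in ek.sel_hdr_values.items()
    )
```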

   This description of the matching algorithm is clearly not the most
   efficient possible implementation.  A cache may use any
   algorithm that yields equivalent results.  For example, it may use a
   hierarchical approach where cache entries are grouped into sets by
   the URI and variant-ID, and only if a set includes non-null
   sel-hdr-values elements does the cache need to consider the other
   request headers.

   If on a cache lookup there are two or more entries that appear to
   match the request, then the one with the most recent Date value MUST
   be used.
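
   One possible shape for the hierarchical approach, including the
   most-recent-Date rule, is sketched below.  The class GroupedCache
   and its methods are hypothetical; header matching again uses exact
   equality as a stand-in for section 10.v, and insert() simply
   appends, so the replacement step is deliberately left out.

```python
from collections import defaultdict
from datetime import datetime


class GroupedCache:
    """Entries grouped into sets by (URI, variant-ID)."""

    def __init__(self):
        # (uri, variant_id) -> list of (sel_hdr_values, date, body)
        self._groups = defaultdict(list)

    def insert(self, uri, variant_id, sel_hdr_values, date, body):
        self._groups[(uri, variant_id)].append((sel_hdr_values, date, body))

    def lookup(self, uri, variant_id, request_headers):
        candidates = []
        for sel, date, body in self._groups.get((uri, variant_id), []):
            # The request headers are consulted only when the entry
            # carries a non-null sel-hdr-values element.
            if sel is None or all(
                request_headers.get(name) == value
                for name, value in sel.items()
            ):
                candidates.append((date, body))
        if not candidates:
            return None
        # Two or more matches: use the one with the most recent Date.
        return max(candidates, key=lambda c: c[0])[1]
```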

13.12.2 FROZEN: Non-varying resources
   When a response is received for a non-varying resource (that is, the
   response includes no Vary, Alternates, or Content-Location headers),
   the replacement key for the response is simply the Request-URI of the
   request that elicited it: (Request-URI, null).  The entry key for the
   response is (Request-URI, null, null).

13.12.3 SLUSHY: Varying responses
   If a response includes a Vary header, then we use the notation
   ``sel-hdr-values'' to denote the canonical form of the headers in the
   corresponding request whose field-names are given in the Vary header.
   If the response does not include a Vary header, then sel-hdr-values
   is assigned the null value.  Section 10.v defines the canonical form
   for selecting headers.

      |Actually, 10.v doesn't explicitly define this canonical form.|
      |I propose that we define it as                               |
      |                                                             |
      |  A set whose elements are sequences of request headers      |
      |  with identical field-names.  For a given field-name, the   |
      |  corresponding element is the concatenation of the          |
      |  request headers with that field-name, in exactly the       |
      |  order that these fields appear in the request.             |
      |                                                             |
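
   To make the proposed canonical form concrete, here is a rough
   Python sketch.  The function name is hypothetical, the request is
   assumed to arrive as an ordered list of (field-name, value) pairs,
   and field-names are compared case-insensitively.  The
   "concatenation" is represented as a tuple of values in request
   order rather than as a single string.

```python
def canonical_selecting_headers(request_headers, vary_names):
    """Group the request headers named by Vary, preserving request order.

    request_headers: ordered list of (field-name, value) pairs.
    vary_names: field-names listed in the response's Vary header.
    """
    wanted = {name.lower() for name in vary_names}
    grouped = {}
    for name, value in request_headers:
        key = name.lower()
        if key in wanted:
            grouped.setdefault(key, []).append(value)
    # One element per field-name: its values in the order received.
    return {key: tuple(values) for key, values in grouped.items()}
```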

   When a response is received that includes a variant-ID in a CVal
   header (see section 10.102), but no Content-Location header, then the
   replacement key is (Request-URI, variant-ID), and the entry key for
   the response is (Request-URI, variant-ID, sel-hdr-values).

   When a response is received that includes a Vary header and an opaque
   validator, but no variant-ID or Content-Location header, then the
   replacement key is (Request-URI, opaque-validator), and the entry key
   for the response is (Request-URI, opaque-validator, sel-hdr-values).

      This rule supports the ``selecting opaque validators''
      mechanism described in section 13.8.4.  The cache should
      distinguish between actual variant-IDs and opaque-validators in
      the variant-ID element of the entry key; a non-null
      opaque-validator in an entry key DOES match a null variant-ID
      in a lookup key.

   When a response is received that includes both a variant-ID in a CVal
   header, and a Content-Location header, then the replacement key is
   (content-location-URI, variant-ID), and the entry key for the
   response is (content-location-URI, variant-ID, sel-hdr-values).

   When a response is received that includes a Content-Location header
   but no variant-ID, then the replacement key is (content-location-URI,
   null), and the entry key for the response is (content-location-URI,
   null, sel-hdr-values).
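
   The key-construction cases of sections 13.12.2 and 13.12.3 can be
   collected into one function, sketched here in Python.  All names
   are hypothetical.  The response is assumed to be already parsed
   into a small record, since the CVal syntax of section 10.102 is not
   reproduced here, and sel-hdr-values is simplified to a mapping of
   lower-cased field-name to value.  The sketch does not record
   whether the second element is a real variant-ID or an opaque
   validator, which a real cache would need to track.  The final
   fallback also covers a response with Vary but no validator, a case
   the rules above do not address; treating it like a non-varying
   response is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ResponseInfo:
    variant_id: Optional[str] = None        # from CVal, if any
    opaque_validator: Optional[str] = None
    content_location: Optional[str] = None
    vary: List[str] = field(default_factory=list)


def sel_hdr_values(request_headers, vary):
    if not vary:
        return None                          # no Vary header: null
    lowered = {n.lower(): v for n, v in request_headers.items()}
    return {name.lower(): lowered.get(name.lower()) for name in vary}


def keys_for_response(request_uri, request_headers, resp):
    """Return (replacement_key, entry_key) as plain tuples."""
    sel = sel_hdr_values(request_headers, resp.vary)
    if resp.content_location is not None:
        # With or without a variant-ID, Content-Location supplies the URI.
        uri, ident = resp.content_location, resp.variant_id
    elif resp.variant_id is not None:
        uri, ident = request_uri, resp.variant_id
    elif resp.vary and resp.opaque_validator is not None:
        uri, ident = request_uri, resp.opaque_validator
    else:
        uri, ident = request_uri, None       # non-varying (or uncovered) case
    return (uri, ident), (uri, ident, sel)
```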

      |Question: should we insist that Vary: must be present if the |
      |variant-ID scheme is used?                                   |

      |Question: this description follows Koen's model, especially  |
      |that it does NOT allow the cache to be clever about          |
      |replacements and lookups if a response includes a Vary header|
      |but no variant-ID.  For example, it seems conceptually       |
      |feasible for a cache in this case to match all of the        |
      |request-header fields specified by the Vary header between   |
      |the ones stored with an existing cache entry and the ones in |
      |the current request, but as far as I can tell, Koen wants to |
      |prohibit this.                                               |

13.12.4 SLUSHY: Canonicalization of URIs
   When comparing two URIs to decide whether they match, a cache MUST
   use a case-sensitive octet-by-octet comparison of the entire URIs,
   with these exceptions:

      - Following the rules from section 3.2.2:

           * A port that is empty or not given is equivalent to
             port 80.

           * Comparisons of host names MUST be case-insensitive.

           * Comparisons of scheme names MUST be case-insensitive.

           * An empty abs_path is equivalent to an abs_path of "/".

      - Characters other than those in the reserved set and the
        unsafe set (see section 3.2) are equivalent to their ``"%"
        HEX HEX'' encodings.

   For example, the following three URIs are equivalent:

      http://abc.com:80/~smith/home.html
      http://ABC.com/%7Esmith/home.html
      http://ABC.com:/%7esmith/home.html
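
   One way to implement this comparison is to reduce each URI to a
   canonical string and compare the results octet by octet.  The
   Python sketch below does that for http URIs of the simple form
   scheme://host[:port][abs_path].  The function names are
   hypothetical, and the RESERVED and UNSAFE sets are my reading of
   the grammar in section 3.2 and should be checked against the draft.
   Escapes of reserved or unsafe characters are left untouched, since
   the rules above do not make them equivalent to anything else.

```python
import re

# Assumed contents of the reserved and unsafe sets (check section 3.2).
RESERVED = set(";/?:@&=+")
UNSAFE = {chr(c) for c in range(0x20)} | {chr(0x7F)} | set(' "#%<>')

_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def _decode_equivalent_escapes(text):
    def repl(match):
        ch = chr(int(match.group(1), 16))
        if ch in RESERVED or ch in UNSAFE:
            return match.group(0)      # not equivalent: keep as written
        return ch
    return _ESCAPE.sub(repl, text)


def canonical_uri(uri):
    scheme, sep, rest = uri.partition("://")
    if not sep:
        raise ValueError("expected an absolute URI: %r" % uri)
    slash = rest.find("/")
    if slash == -1:
        authority, path = rest, "/"    # empty abs_path is equivalent to "/"
    else:
        authority, path = rest[:slash], rest[slash:]
    host, _, port = authority.partition(":")
    if port == "":
        port = "80"                    # empty or missing port means 80
    return "%s://%s:%s%s" % (
        scheme.lower(), host.lower(), port, _decode_equivalent_escapes(path)
    )


def uris_match(a, b):
    return canonical_uri(a) == canonical_uri(b)
```

   With this sketch the three example URIs above all reduce to the
   same string, while URIs that differ only in the case of the path
   do not match.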

Received on Friday, 19 April 1996 01:35:07 UTC