- From: Amos Jeffries <squid3@treenet.co.nz>
- Date: Sat, 19 Jan 2013 21:55:03 +1300
- To: ietf-http-wg@w3.org
On 19/01/2013 8:58 a.m., James M Snell wrote:
> Just one more for the day... Looking at Cache-Control.. Currently the
> cache-control header consists of a list of named directives that
> optionally have associated values. The format is extensible which is
> great, but makes things a bit more difficult to optimize. Let's look
> at a few random examples...
>
>   Cache-Control: public (6 bytes)
>   Cache-Control: public, max-age=1600 (21 bytes)
>   Cache-Control: no-store, no-transform, must-revalidate (39 bytes)
>
> Let's see if we can do better.
>
> First off, let's assume that Cache-Control on requests can have a
> different encoding than Cache-Control on responses. For requests,
> let's make it:
>
> +----------+----------+---------------------+
> | no-cache | no-store | no-transform        |
> +----------+-----+----+---------+-----------+
> | only-if-cached |xxxx| max-age | max-stale |
> +-----------+----+----+---------+-----------+
> | min-fresh | num-ext | repeating ext block |
> +-----------+-----+---+---------+-----------+
>
> no-cache = 1 bit
> no-store = 1 bit
> no-transform = 1 bit
> only-if-cached = 1 bit
> xxxx = 4 reserved bits
>

I looked at encoding these headers for the network-friendly-01 draft.

I suggest we take a closer look at what those flags *mean* and
translate the meaning into bits, not a 1:1 mapping of the flags. There
are a couple like no-cache which become much clearer, and some missing
HTTP/1 flags which become visible, when we do that.

storage: read-only, write-only, revalidate

  HTTP/1           ->  HTTP/2
  must-revalidate  ->  store revalidate
  no-cache         ->  store revalidate
  no-store         ->  store read-only
  only-if-cached   ->  store read-only, revalidate
  max-age=0        ->  store write-only

We are missing a "store write-only + revalidate" option in HTTP/1.
Meaning fetch this object as a cache MISS but allow it to be stored for
future use. At face value "no-cache" leads people to assume that it
means the store is write-only, but the specification details do not
match the common assumption.
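The HTTP/1 -> HTTP/2 mapping above can be sketched as a small lookup
table. This is only an illustration of the idea, not a proposed API:
the names `Store`, `HTTP1_TO_STORE` and `store_flags` are made up for
this sketch, and OR-ing the flags of several directives together is an
assumption, since the table only defines single directives.

```python
from enum import Flag, auto


class Store(Flag):
    """Hypothetical storage semantics: read-only, write-only, revalidate."""
    NONE = 0
    READ_ONLY = auto()
    WRITE_ONLY = auto()
    REVALIDATE = auto()


# Direct transcription of the mapping table in the message.
HTTP1_TO_STORE = {
    "must-revalidate": Store.REVALIDATE,
    "no-cache": Store.REVALIDATE,
    "no-store": Store.READ_ONLY,
    "only-if-cached": Store.READ_ONLY | Store.REVALIDATE,
    "max-age=0": Store.WRITE_ONLY,
}


def store_flags(directives):
    """Combine the storage flags of several HTTP/1 directives.

    Unknown directives contribute nothing. Combining by bitwise OR is an
    assumption of this sketch.
    """
    flags = Store.NONE
    for directive in directives:
        flags |= HTTP1_TO_STORE.get(directive.strip().lower(), Store.NONE)
    return flags


# The combination with no single HTTP/1 directive of its own.
missing = Store.WRITE_ONLY | Store.REVALIDATE
print(missing in HTTP1_TO_STORE.values())
```

Under these assumptions no single HTTP/1 directive maps to
write-only + revalidate, which is the gap described above.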
store write-only + revalidate is missing as a single HTTP/1 flag. It
sort of makes sense as a client-driven force-refresh update which
forcibly revalidates the client's copy and only updates cached data IF
it matches that same copy. A sort of cross between must-revalidate and
only-if-cached.

> max-age = uintvar
> max-stale = uintvar
> min-fresh = uintvar
> num-ext = 1 byte
>

These ones are much clearer, but min-fresh is still a little obtuse in
its naming.

time-since-creation: min-age, max-age, max-stale

We can also go a little further. min-age as a question only makes sense
on requests, wanting an object of minimum age in response. On responses
it makes sense to use it as an answer saying this object is a minimum
of X age already. Which allows us to drop the Age: header entirely and
use the min-age Cache-Control bits to store the response's current age
value.

NOTE to Phillip in response to "Why do HTTP request messages have dates
in them anyhow? If they do not cause a state machine to behave
differently then lets get rid of them."
  All these age fields are timestamps relative to the Date: header on
the response. Which allows us to store values of up to 1 year offset
from the Date: epoch in 31 bits, with one bit for a valid/invalid value
marker. Caches can opt to use the sign bit of a 32-bit integer as that
marker for easy coding.

PS. This is another reason I'm in favour of the 1-year default caching
limitation. It would help us avoid wasting 4 bytes and doing 64-bit
calculations on max-age values which are usually short.

content-transform: yes, no
private : yes, no

> repeating ext block =
>
> +---------------------------+
> |TYP|XXXXXX|len(key)|key|val|
> +---------------------------+
>
> TYP = 2 bit type code
> 00 = Boolean, no val
> 01 = Numeric, val is uintvar
> 10 = Text, val is encoded text
> 11 = Reserved
> XXXXXX = Reserved Bits
> if TYP is 00, then val is omitted. The idea is that this is a boolean
> flag, like no-cache, no-store, etc. The key identifies the flag.
> Key is a text label.
> if TYP is 01, then val is uintvar.
> if TYP is 10, then val is 2-byte length followed by encoded text
>
> So if we look at examples, then,
>
>   Cache-Control: no-cache encodes as five-bytes
>   Cache-Control: only-if-cached, max-age=1600, encodes as seven-bytes
>
> Looking at the Cache-Control header for Responses we can do:
>
> +--------+---------+----------+-------------+
> | public | private | no-cache | no-transform|
> +--------+-+-------+----------+-----------+-+
> | no-store | must-revalidate  |proxy-reval|X|
> +----------+----------+-------+-----------+-+
> | max-age  | s-maxage | num-no-cache-headers|
> +----------+-------+--+---------------------+
> | no-cache-headers | num-private-headers    |
> +------------------+------------------------+
> |private-headers|num-ext|repeating ext block|
> +------------------+------------------------+
>
> Same idea,
>
> public = 1 bit
> private = 1 bit

-1 bit. These are two sides of a single boolean. We can add that to the
HTTP/2 specification as a 1-bit flag to prevent future bungling and
still be 1.1 compliant when it translates.

> no-cache = 1 bit
> no-transform = 1 bit
> no-store = 1 bit
> must-revalidate = 1 bit

-1 bit. Parameterless no-cache and must-revalidate are semantically
equivalent. The presence or absence of no-cache-headers below takes
care of the parametered no-cache cases where the semantics differ.

> proxy-reval = 1 bit
> X = reserved
> max-age = uintvar
> s-maxage = uintvar
> num-no-cache-headers = 1-byte
> no-cache-headers = null-byte separated list of header names
> num-private-headers = 1-byte
> private-headers = null-byte separated list of header names

private-headers and no-cache-headers are a bit of an overlap. I
personally would like to see them merged into one semantic field which
can be handled the same by caches. But we shall have to investigate
that first.

> Examples...
>
>   Cache-Control: public (encodes as 6 bytes)
>   Cache-Control: public, max-age=1600 (encodes as 6 bytes, saving 17 bytes)
>   Cache-Control: no-store, no-transform, must-revalidate (encodes as 6
> bytes, saving 33 bytes)
>
> So looking at these examples, it is definitely possible to save a lot
> of space but at the cost of quite a bit of encoding-complexity. I'm
> sure we could possibly do better but this provides a good starting
> point, and, it's bidirectionally compatible with 1.1. Whether or not
> it's worth the effort is a different question entirely.
>
> - James

With these alterations I end up with:

storage controls: 1 byte

  +-+-+---+---+
  |P|T|RWV|rwv|
  +-+-+---+---+

  P = private
  T = no-transform
  R = shared cache read-only
  W = shared cache write-only
  V = shared cache must-revalidate
  r = private cache read-only
  w = private cache write-only
  v = private cache must-revalidate

Caching heuristic age controls: 12 bytes

  +-+------+--------+--------+--------+
  |V|         min-age / Age:          |
  +-+------+--------+--------+--------+
  |V|            max-age              |
  +-+------+--------+--------+--------+
  |V|           max-stale             |
  +-+------+--------+--------+--------+

  V = invalid/unset.

A static 13 bytes for cache controls no matter which are set.

OR, a specified order for optional blocks with a byte up front flagging
which blocks are omitted => variable 2-13 bytes, with on average 6
bytes for just the store controls block and max-age block.

no-cache-headers and private-cache-headers, as said I'd like to see
merged. If not they can be split into a separate header each with the
same field-value format as Connection:.

Amos
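
The static 13-byte layout described above (one storage-controls byte
followed by three 32-bit age words) could be packed roughly as follows.
This is a sketch only: the function names are invented for
illustration, and the bit order within the storage byte (P as the most
significant bit, v as the least) and network byte order are
assumptions, since the message only gives the field order.

```python
import struct

UNSET = 0x8000_0000      # top bit set: field is invalid/unset (the "V" bit)
MAX_VALUE = 0x7FFF_FFFF  # largest value that fits in the remaining 31 bits


def pack_storage(private, no_transform, shared, local):
    """Build the |P|T|RWV|rwv| byte.

    shared and local are 3-bit integers: read-only=4, write-only=2,
    must-revalidate=1. The bit positions are an assumption of this sketch.
    """
    return (
        (int(bool(private)) << 7)
        | (int(bool(no_transform)) << 6)
        | ((shared & 0b111) << 3)
        | (local & 0b111)
    )


def pack_age(seconds):
    """Encode one age field as a 32-bit word; None means unset."""
    if seconds is None:
        return UNSET
    if not 0 <= seconds <= MAX_VALUE:
        raise ValueError("age does not fit in 31 bits")
    return seconds


def encode_cache_controls(storage, min_age=None, max_age=None, max_stale=None):
    """Static 13-byte form: 1 storage byte + 3 big-endian 32-bit words."""
    return struct.pack(
        "!BIII", storage, pack_age(min_age), pack_age(max_age), pack_age(max_stale)
    )


def decode_cache_controls(data):
    """Inverse of encode_cache_controls; unset ages come back as None."""
    storage, *ages = struct.unpack("!BIII", data)
    return storage, [None if age & UNSET else age for age in ages]


# no-transform, shared cache must-revalidate, max-age=1600
encoded = encode_cache_controls(pack_storage(False, True, 0b001, 0), max_age=1600)
print(len(encoded), decode_cache_controls(encoded))
```

One year is 31,536,000 seconds, which fits comfortably in the 31 value
bits, so the top bit is free to act as the unset marker.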
Received on Saturday, 19 January 2013 08:55:33 UTC