Another Cache-control: proposal

So far, we've included Cache-control: directives to allow servers
and clients to control transparency and security (privacy/authentication)
issues.

A lot of the "cache-busting" done by existing servers is for the
purpose not of maintaining transparency or security properties, but
simply of collecting demographic information.  Someone also pointed
out that it is done so that different advertising images appear on
the same page (i.e., each retrieval of the page sees a different ad).

During the discussions at last week's HTTP-WG meeting, it occurred
to me that we could also use Cache-control: to allow servers to
limit the number of times a cache entry is used, which would provide
the origin server semi-accurate demographic information without
destroying the usefulness of caches.  That is, by bounding the
number of uses, the server bounds the inaccuracy of its demographic
information.  It can also bound the number of times the same ad
is shown because of caching.

Specifically, I'd propose using:
	Cache-control: max-uses=NNNN
on responses, where NNNN is a positive integer.

Suppose the origin server sends
	Cache-control: max-uses=10
on a response.  The cache receiving that response could provide
it to the current requestor, and then use it 9 more times (subject
to the other caching constraints, such as Expires:), if each
response it generates includes
	Cache-control: max-uses=1
Or, the cache could return
	Cache-control: max-uses=5
in its responses, to allow another cache to make multiple uses,
but then it would only be able to do this twice.  (The heuristics
a cache uses to sub-allocate its max-uses value are beyond the
scope of the HTTP spec.)
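
Just to make this concrete, here is a rough sketch (in Python, not
part of the proposal) of how a cache might track and sub-allocate a
max-uses budget; the class, method, and attribute names are invented
for illustration, and the splitting heuristic is arbitrary:

    # A minimal sketch, assuming the cache keeps a per-entry counter of the
    # budget granted by the origin server; all names here are invented.
    class CacheEntry:
        def __init__(self, body, max_uses):
            self.body = body
            self.remaining = max_uses       # budget granted by the origin
            self.uses_served = 0            # uses not yet reported upstream

        def serve(self, requester_is_cache):
            if self.remaining <= 0:
                return None                 # budget exhausted; must revalidate
            # Give a downstream cache a share of the remaining budget, and an
            # ordinary client exactly one use.  (The split heuristic is
            # arbitrary, as noted above.)
            grant = max(1, self.remaining // 2) if requester_is_cache else 1
            self.remaining -= grant
            self.uses_served += 1           # count one use per response served
            return self.body, {"Cache-control": "max-uses=%d" % grant}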

If a cache uses up its max-uses count, then it has to do a
conditional GET to "recharge" its count from the server.

I'd also propose a complementary
	Cache-control: use-count=NNNN
on requests.  This would tell the origin server (or an inbound
cache) how many times the cache had used the cached entry specified
by a conditional GET.

In the case of multiple caches along a path, the inbound cache
does the obvious summation when it receives a use-count value from
another cache.
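
As a sketch of the "recharge" step (again in Python, with invented
names, and assuming the cache records the uses it has served plus
the use-counts reported to it by downstream caches):

    # A minimal sketch: when the budget is exhausted, do a conditional GET,
    # reporting our own uses plus those summed from downstream caches.
    def revalidate(entry, send_request):
        total = entry.uses_served + sum(entry.downstream_use_counts)
        response = send_request("GET", entry.url, {
            "If-Modified-Since": entry.last_modified,
            "Cache-control": "use-count=%d" % total,
        })
        if response.status == 304:          # entry unchanged; budget recharged
            # response.max_uses stands for the max-uses value parsed from the
            # 304's Cache-control header.
            entry.remaining = response.max_uses
            entry.uses_served = 0
            entry.downstream_use_counts = []
        return response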

Of course, if a cache never has to do a conditional GET before
it removes an entry from its cache, the server would not find out
about the uses since the last cache-to-server request.  The
max-uses directive at least provides a bound on how badly hits
are undercounted.

Perhaps we could define a "cooperative cache" as one that does a
HEAD on the resource (along with a "Cache-control: use-count"
header) when it removes it from the cache, just to let the
origin server know.
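
In the same sketch style, the eviction-time report might look like
this (the HEAD is purely advisory; nothing depends on its response):

    # A minimal sketch: before dropping an entry, tell the origin server about
    # any uses it has not yet heard of.  Names are invented, as above.
    def evict(entry, send_request):
        unreported = entry.uses_served + sum(entry.downstream_use_counts)
        if unreported > 0:
            send_request("HEAD", entry.url,
                         {"Cache-control": "use-count=%d" % unreported})
        # ...then remove the entry from storage as usual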

So how does an origin server know that the cache is willing to
obey max-uses?  Suppose that if the cache adds
	Cache-control: use-count=0
to its initial (non-conditional) GET, this is taken to mean
"I am willing to obey max-uses".  A server receiving this could
expect (but not in a legally binding sense!) that the cache
would comply.  (Alas, this does not quite work if there is
an HTTP/1.0 cache between the HTTP/1.1 cache and the origin
server, but perhaps the Forwarded: header solves this.)
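
The initial fetch, in the same sketch style (response.max_uses again
standing for whatever max-uses value, if any, the response carried):

    # A minimal sketch: send use-count=0 on the first, non-conditional GET to
    # advertise willingness to obey max-uses.
    def initial_fetch(url, send_request):
        response = send_request("GET", url, {"Cache-control": "use-count=0"})
        return CacheEntry(response.body, response.max_uses)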

There is no requirement that an origin server send the same
max-uses value to all caches.  For example, it might make sense
to send "max-uses=2" the first time one hears from a cache,
and then double the value (up to some maximum limit) each time
one gets a "use-count" from that cache.  The idea is that
the faster a cache uses up its max-uses quota, the more
likely it is to report a use-count value before removing
the cache entry.  Also, high and frequent use-counts imply
a correspondingly high efficiency benefit from allowing caching.
Again, the details of this heuristic would be outside the
scope of the HTTP spec.
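
For concreteness, that doubling heuristic might be sketched on the
server side like this (the ceiling and all names are invented):

    # A minimal sketch: grant max-uses=2 the first time a cache is seen, then
    # double the grant, up to an arbitrary ceiling, each time that cache
    # reports a use-count.
    MAX_GRANT = 128                         # arbitrary upper limit
    grants = {}                             # last grant, keyed by cache identity

    def next_grant(cache_id):
        previous = grants.get(cache_id)
        new = 2 if previous is None else min(previous * 2, MAX_GRANT)
        grants[cache_id] = new
        return new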

Comments?

-Jeff

Received on Thursday, 14 March 1996 01:59:43 UTC