Re: Hit-metering: to Proposed Standard? from Jeffrey Mogul on 1996-11-21 (ietf-http-wg@w3.org from October to December 1996)

From: Jeffrey Mogul <mogul@pa.dec.com>
Date: Wed, 20 Nov 96 16:08:02 PST
To: Ingrid Melve <Ingrid.Melve@uninett.no>
Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <9611210008.AA00535@acetes.pa.dec.com>

    My question is: why bother with usage-limiting?  (and implementing
    it in a caching mesh with multiple co-operating caches on each
    level in a "hierarchy" will be a pain). The purpose of bounding
    inaccuracy in a count is not achieved by adding usage-limiting, as
    the primary inaccuracy will be added by ill-behaving caches, users'
    inimitable inaccurate ways (check once per session, always check,
    never check), servers/network failure and so on (including all
    those that do hitmetering but do not honour the usage-limiting).
    The only useful thing is that it does limit the number of times a
    "well-behaved" cache server hands out the same ad. Is the
    complexity of usage-limiting worth it?

First of all, no proxy is required to implement usage-limiting.

Second, it is not clear that the additional implementation complexity
is particularly large.

Third, we realized that we could do a good job of implementing
usage-limiting with a mechanism that is very similar to the one
we chose for hit-metering, and since the usage-limiting aspect
is entirely optional, we chose to include it in the specification.

Finally, we agree that the usage-limiting aspect of the design
is not as clearly useful as the hit-metering aspect.  But the
draft hints at a few other useful things that it could be used
for.  For example, it could be used with a (still-to-be-defined)
prefetching mechanism so that a prefetched result could be cached
for one use.  The existing "proxy-revalidate" mechanism in HTTP/1.1
doesn't really work optimally for prefetched responses.
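
A minimal sketch of the "cached for one use" idea, in Python. The class
and field names are hypothetical and are not taken from the draft; the
sketch only shows a cache entry that stops being served once its use
limit is exhausted.

```python
class UsageLimitedEntry:
    """Hypothetical cache entry that may be served a bounded number of times."""

    def __init__(self, body, max_uses):
        self.body = body
        self.uses_left = max_uses  # assumed limit handed down with the response

    def serve(self):
        """Return the body if a use remains; otherwise None, meaning the
        cache must go back to its inbound server."""
        if self.uses_left <= 0:
            return None
        self.uses_left -= 1
        return self.body


# A prefetched response that may be used exactly once.
prefetched = UsageLimitedEntry(b"prefetched page", max_uses=1)
first = prefetched.serve()   # served from cache
second = prefetched.serve()  # limit exhausted
```

Under these assumptions the first request is answered from the cache and
the second forces a fresh request to the inbound server.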

    To illustrate the mesh problem: If I am a cache that gets handed 2
    usages through my first parent and 4 through my second parent, do I
    have to report back through the parent who gave me the usage-limit
    or can I freely choose to report back through my third parent (who
    has not connected to the origin server before and is likely to
    report 4 out of zero and 2 out of zero or generally confuse the
    origin server)? If I have to report through the appropriate parent,
    this requires me to store where I got the document from; and it
    heavily influences traffic patterns, robustness and the redundancy
    of my mesh.
    
The question is meaningless because usage-limiting and hit-metering
are independent, and usage-limiting does not require you to report
anything.

It is true that, for the case of hit-metering, if your proxy receives
otherwise identical responses from two different inbound servers, and
combines them according to the rules in section 13.2.6 of the HTTP/1.1
spec, then one has to decide which inbound server gets the reports.
But it seems reasonable to assume that any subsequent cache hit can
be mapped onto a "use" of one or the other of the responses.  I.e.,
when the cache combines the two responses, it can (arbitrarily) decide
to act as if the second one replaces the first, or it can act as if
the second one is ignored.  Once that choice is made, there is no
ambiguity about which inbound server receives the report.

As I said in my response to Ted Hardie, our specification probably
ought to say explicitly that the proxy needs to record the identity
of the immediate source of a response, and this is another example
where that is important.
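
To make the bookkeeping concrete, here is a rough Python sketch. All
names are hypothetical, and the "second response replaces the first"
policy is just one of the two arbitrary choices described above. Each
cache entry records its immediate source, and every hit is counted
against the source of the entry that was actually served.

```python
from collections import Counter


class MeteredCache:
    """Hypothetical proxy cache that remembers where each response came from."""

    def __init__(self):
        self._entries = {}      # url -> (body, immediate source)
        self._hits = Counter()  # (url, source) -> hits not yet reported

    def store(self, url, body, source, replace=True):
        """Store a response. With replace=False, a second response for the
        same URL is ignored and the first source keeps receiving reports."""
        if url in self._entries and not replace:
            return
        self._entries[url] = (body, source)

    def hit(self, url):
        """Serve from cache and charge the hit to the recorded source."""
        body, source = self._entries[url]
        self._hits[(url, source)] += 1
        return body

    def pending_report(self, source):
        """Hit counts that would be reported to the given inbound server."""
        return {url: n for (url, src), n in self._hits.items() if src == source}


cache = MeteredCache()
cache.store("/ad", b"banner", source="parent1")
cache.hit("/ad")                                  # charged to parent1
cache.store("/ad", b"banner", source="parent2")   # second response replaces first
cache.hit("/ad")                                  # charged to parent2
cache.hit("/ad")                                  # charged to parent2
report_parent1 = cache.pending_report("parent1")
report_parent2 = cache.pending_report("parent2")
report_parent3 = cache.pending_report("parent3")
```

With this policy a third parent that never supplied the response has
nothing to report, so the "4 out of zero" situation does not arise.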

-Jeff

Received on Thursday, 21 November 1996 17:02:08 UTC