Re: Comments on draft-mogul-http-hit-metering-00.txt

As promised, some replies to Koen's comments of 26 Nov 1996
(which were based on an earlier draft of the hit-metering
proposal).

    Section 3.1:
    
    >   When a proxy forwards a hit-metered or usage-limited response to a
    >   client (proxy or end-client) not in the metering subtree, it MUST
    >   omit the Meter header, and it MUST add "Cache-control:
    >   proxy-revalidate" to the response.
    
    I'd rather have the cache leave the Meter header in the response.
    With your system, upstream caches cannot tell a `mission critical'
    "Cache-control: proxy-revalidate", added to make a shopping cart
    application safe, apart from a `frivolous' "Cache-control:
    proxy-revalidate" added to get hit counts.

I suppose this could be made to work (by defining a Meter: response
directive as having no effect on a proxy if the response also contains
"Cache-control: proxy-maxage=0"), but it seems like a kludge (and
I'm not sure I've thought through all of the possible pitfalls).
In any case, I don't think it is really worth making this change
(which would add overhead to response messages).
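
Just to make the kludge concrete: under that alternative, a response
forwarded outside the metering subtree would look something like
this (the Meter value is elided; I'm not trying to reproduce the
draft's exact directive syntax here):

	HTTP/1.1 200 OK
	Cache-control: proxy-maxage=0
	Meter: ...

and the new rule would have to say that a proxy outside the subtree
ignores the Meter header whenever "proxy-maxage=0" is present,
rather than (as now) never seeing the Meter header at all.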

The main reason I don't think this change is worth making is that I
don't think it's appropriate to characterize this use of
"Cache-control: proxy-maxage=0" as "frivolous", especially when it
is used together with usage-limiting.

    Stated differently: your current system will encourage some people to
    ignore all "Cache-control: proxy-revalidate" headers, and this is a
    bad thing, because it will make the web an unsafe place.
    
I think the same statement could be made about any of the cache-control
mechanisms, since (in the current world) they could almost all be used
either for "non-frivolous" purposes or for gathering demographic info.
If the Web degenerates into a situation where proxy caches don't take
server directives seriously, then servers will simply disable caching
by other means.  Our expectation is that the hit-metering mechanism
will increase the likelihood that proxy operators will play by the
rules, not decrease it.

Maybe we need something like
	Cache-control: server-frivolity-level=0.73
    
Sorry, that was a totally frivolous proposal :-)
    
    Section 3.5:
    
    >      2. When it forwards a conditional HEAD on the resource
    >         instance on behalf of one of its clients.
    
    HTTP/1.1 does not define conditional HEADs, you will have to define
    them yourself.
    
It doesn't specifically define "conditional HEADs", but it does
describe "conditional methods" in section 13.3:

   In HTTP/1.1, a conditional request looks exactly the same as a normal
   request for the same resource, except that it carries a special
   header (which includes the validator) that implicitly turns the
   method (usually, GET) into a conditional.

Elsewhere, it specifies headers with language such as:
   The If-None-Match request-header field is used with a method to make
   it conditional.
These parts were intentionally written (after intense negotiations)
to not be overly specific about what method(s) could be made conditional.
(We seem to have overlooked the description of If-Modified-Since,
however.)
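
So, under that reading, a conditional HEAD is just a HEAD request
that carries a validator; something along these lines (the URL,
host, and entity tag are made up for illustration):

	HEAD /metered/page HTTP/1.1
	Host: example.com
	If-None-Match: "xyzzy"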

    In an earlier message, I outlined a cheap `BogoHits' system which
    would get you good _click_ counts, where one click was defined as
    one mouse click on a page link by an actual end user.

It would be nice to have a fully specified proposal to review,
as an Internet-Draft.
    
    Section 4.2:
    
    >Why max-uses is not a Cache-control directive
    
    I think your reasoning is flawed here: the Meter header can be used to
    negotiate on the honoring of max-uses no matter whether max-uses
    appears in the Cache-Control response header or in the Meter response
    header.

I'm not sure I would call the reasoning "flawed"; maybe it could
be called "leaving a few steps for the reader".  We could have
written the entire proposal around a new kind of hop-by-hop
Cache-control directive, to be carried in the existing
Cache-control header, but that would require a modification to the
HTTP/1.1 spec.  Right now, the HTTP/1.1 spec says (14.9):

   Cache directives must be passed through by a proxy or gateway
   application, regardless of their significance to that application,
   since the directives may be applicable to all recipients along the
   request/response chain. It is not possible to specify a cache-
   directive for a specific cache.

Our current proposal only requires the HTTP/1.1 specification to
include "proxy-maxage=0", which (as I've argued elsewhere) is needed
for many other reasons as well.
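
To make the contrast concrete: because Meter is a new header, the
proposal can make it hop-by-hop by listing it in a Connection
header, so a response along these lines (directive syntax is
illustrative, not quoted from the draft):

	HTTP/1.1 200 OK
	Meter: max-uses=10
	Connection: meter

delivers max-uses only to the next-hop proxy, which removes the
Meter header before forwarding; a max-uses carried in Cache-control
would, per the text quoted above, have to be passed through to
every recipient unless we changed the HTTP/1.1 spec.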
    
    Section 5.3.1:
    
    I believe your system will always count a page _pre_fetch done by a
    proxy as a hit, no matter whether the page is actually seen by a user
    later.  You need to fix this.
    
As I wrote in response to Ted Hardie, the entire issue of prefetching
in HTTP needs some more thought.  Most (if not all) researchers looking
at prefetching in other contexts have concluded that a prefetch request
ought to be distinguished as such in any case (e.g., to avoid preempting
demand fetches, and to avoid polluting any statistically-driven
prediction engines).  The hit-metering proposal might need to be
fixed, but the general problem is broader.
    
    I do not like the special treatment of varying resources, because of
    privacy, efficiency, and complexity reasons.  If negotiation is done
    with TCN, you will get good variant counts without all this
    complexity, because each variant has its own URL in TCN.
    
Unless we are sure that all varying resources will use TCN (not
at all assured at this point), the hit-metering proposal needs
to define what happens to a response with a Vary: header.
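
For instance, even without TCN a server can legitimately send
something like this (headers are illustrative):

	HTTP/1.1 200 OK
	Vary: Accept-Language
	Meter: ...

and the proposal has to say how the use counts relate to the
several variants that a cache may be storing under that one URL.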

    Even if cache busting turns out to be done mainly to get hit counts,
    one could still make a case against your proposal.  If we don't
    optimize unwanted origin server behavior, the unwanted behavior will
    disappear by itself eventually, because users will vote with their
    mouse-button and move on to faster sites.
    
Wishful thinking, I believe.  Users judge servers on their overall
utility, not just on the network-level aspects that we geeks are
so concerned with.  A server that manages to deliver value to users,
perhaps by using an inefficient technique, will probably gain users
over a faster site that doesn't make the users as happy (or a faster
site that charges for access, instead of selling ads).

-Jeff
