Re: Comments on draft-ietf-http-hit-metering-00.txt from Koen Holtman on 1997-02-23 (ietf-http-wg@w3.org from January to March 1997)

From: Koen Holtman <koen@win.tue.nl>
Date: Sun, 23 Feb 1997 19:33:32 +0100 (MET)
To: Jeffrey Mogul <mogul@pa.dec.com>
Cc: koen@win.tue.nl, http-wg@cuckoo.hpl.hp.com
Message-Id: <199702231833.TAA00177@wsooti08.win.tue.nl>

Jeffrey Mogul:
[...]
>We can quibble about whether the design does indeed provide
>adequate support for counting users of a page.  Perhaps the
>right statement would be
>   We believe that our design provides adequate support for
>   user-counting, within the constraints of what is feasible in the
>   current Internet, based on the following analysis.
[...]
>  We prefer to define "adequate" as "at least as
>accurate as is currently possible",

Ugh.  This `at least as accurate' is not very accurate at all.  And I can
think of several currently possible techniques, most of them involving
actual statistical methods, which would be more accurate.

Bottom line: I want you to stop making _any_ positive claims about the
relation between hits and users.

[...]
>    2) I feel that there is too much unnecessary cruft in the draft.  The
>    usage limiting stuff should be removed, and the special rules for
>    varying resources should probably also be removed.  
>    
>Some people seem to prefer hit-counting over usage-limiting; some
>people prefer the opposite.

Please name the people who want to limit usage.  I have only heard
testimonials about hit counting so far.  I do remember someone wanting to
limit the re-use of advertising gifs to once only, but that can already be
done with the existing caching primitives.

>  There is no clear consensus that one
>obviates the other.  Since both seem (to us) to be best served by
>slight variations of a single basic mechanism, we believe that it
>is appropriate to include both in the proposal.

Please rename the proposal `Simple Hit-Metering and Usage-Limiting for
HTTP', then.

>As for section 8, "Interactions with varying resources": this simply
>states the bare minimum necessary to make sensible use of the Vary
>mechanism as it is currently defined in the HTTP/1.1 RFC.

Section 8 is not minimalist; it maximises the info!

Smaller solutions, which still make sense to me, are:
 1) not counting for each combination of request headers, but for each
    entity tag only
 2) counting for each content-location only
 3) only one count for the entire resource
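
To illustrate the difference between these three options, here is a
minimal sketch in which each option is simply a different choice of
counter key.  The record layout, the sample data, and the name
count_hits are all hypothetical; none of this comes from the draft.

```python
from collections import Counter

# Hypothetical cache-hit records: (resource URI, entity tag, Content-Location).
hits = [
    ("/page", '"v1-en"', "/page.en.html"),
    ("/page", '"v1-en"', "/page.en.html"),
    ("/page", '"v1-fr"', "/page.fr.html"),
    ("/page", '"v2-fr"', "/page.fr.html"),
]

# One key function per option in the list above (names are assumptions).
KEY_FUNCTIONS = {
    "entity-tag": lambda uri, etag, location: (uri, etag),            # option 1
    "content-location": lambda uri, etag, location: (uri, location),  # option 2
    "resource": lambda uri, etag, location: uri,                      # option 3
}

def count_hits(records, granularity):
    """Return a Counter of hits, keyed at the chosen granularity."""
    key = KEY_FUNCTIONS[granularity]
    return Counter(key(*record) for record in records)

for name in KEY_FUNCTIONS:
    print(name, dict(count_hits(hits, name)))
```

With this sample data, option 1 keeps three counters, option 2 keeps
two, and option 3 keeps one, so each step down the list trades detail
for less state in the cache.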

[...]
>    I feel that the IETF should not sanction this form of hit metering
>    (by making it a proposed standard) _unless_ it can be shown that not
>    doing so will lead to an internet meltdown.  
>    
>Since neither you nor Roy attended the San Jose session in February,
>and (although I supplied them in machine-readable form) the slides
>I presented there have not been posted as part of the minutes, I will
>quote from them here:
>
>     o Cons (real or alleged) [of our proposal]
>       - Slight overhead on the wire
>         * This either pays off, or people won't use it
>       - Some storage overhead
>       - May reduce pressure on service authors to adopt more complex
>         proposals
>       - May not provide enough information to attract wide use
>     o Last two "cons" cannot both be true!

Both of the last two cons are bad.  If only one is true, that will be bad
enough.

I also note that you left out the `busting outside of the subtree' con,
which I find most significant.  To be honest, I do not know whether my talk
about frivolity before San Jose explained this con in an understandable way.
 
>
>To be specific, in this message, you yourself have stated
>       "Many people want web metrics better than what we have now,
>       but this draft does not provide such metrics."
>and
>       "if the draft is adopted, some people who will do cache busting
>       now will switch to the hit counting methods in the draft."
>You simply can't have it both ways.

Explain.  I don't see a contradiction.  The people in the second quote will
switch because of the speed improvement, not because of any demographics
improvement.

Anyway, what I'm worried about is the sentence I wrote after the quote above:

   "However,
   others who don't count anything now may start using the draft, and
   this leads to _more_ cache busting outside of the metering subtree."

[...]
>I wouldn't waste the WG's time discussing proposals about "statistical
>sampling" until such time as we have seen a specific proposal.

Neither will I.  As far as I am concerned, the choice is between approving hit
metering and not approving hit metering.  Not doing anything is sometimes
the most logical course of action.

>-Jeff

Koen.
Received on Sunday, 23 February 1997 10:38:17 UTC