Re: Comments on draft-mogul-http-hit-metering-01.txt

Jeffrey Mogul:
>
>Catching up on my old email ...
>
>Koen seems to have two major objections to the current
>draft of the hit-metering document:
>	(1) he believes that the introduction of hit-metering
>	will decrease, not increase, the amount of caching that
>	takes place.  I.e., it will increase, not decrease,
>	total traffic.

I don't believe that it will decrease caching; I'm just not sure which
way things will turn out.  And see my Apache module horror scenario in
one of my previous messages.

>	
>	(2) he objects to "_any_ positive claims about the
>	relation between hits and users".

Yes.

>
>Koen also has some minor objections, most of which I am happy
>to resolve.

OK.


>I'll address these objections point-by-point, but first I want
>to make one thing clear: where the difference between our
>positions is a matter of opinion which cannot be resolved
>by existing data, then a negative opinion about a proposed
>protocol specification (especially a fully optional extension)
>is not sufficient reason to kill the specification.

I agree that one opinion should not kill a spec.  If a lot of people
want the IETF to recommend this spec, it should go forward, even if
there is a vocal minority (me) which would not want to recommend it.

[..long analysis, which I agree with, removed..]

>So it basically comes down to making guesses about these hypotheses,
>about the worst-case, best-case, and likeliest scenarios, and
>(whether we act or not) taking a risk that we're making the wrong
>choice.  While Koen writes "Not doing anything is sometimes
>the most logical course of action", I don't think he has made
>a strong case that "not doing anything" about cache-busting is
>actually our best bet.

I agree that I have not made a strong case for doing nothing, but I
note that your case for doing something is not that strong either.
That is why I would prefer the draft to be an experimental RFC.


>On to Koen's other major complaint.
>
>Section 4 of the hit-metering draft is labelled "Analysis",
>and starts:
>   We recognize that, for many service operators, the single most
>   important aspect of the request stream is the number of distinct
>   users who have retrieved a particular entity. We believe that our
>   design provides adequate support for user-counting, based on the
>   following analysis.
>After a complaint from Koen, I revised the second sentence so that
>this now reads:
>   We recognize that, for many service operators, the single most
>   important aspect of the request stream is the number of distinct
>   users who have retrieved a particular entity. We believe that our
>   design provides adequate support for user-counting, within the
>   constraints of what is feasible in the current Internet, based on 
>   the following analysis.
>
>Note that this language is NOT a part of the specification per se,
>and is heavily qualified; we use phrases like "for many service
>operators", "we believe", "based on the following analysis".  This
>is NOT a statement of fact, it's an opinion, and clearly labelled
>as such.

I feel that your clearly labeled opinion is a misleading claim, which
you might get away with in an advertisement, but which has no place in
a standards track RFC.

Your redefinition of `adequate' in `We believe that our
design provides adequate support for user-counting' by

|We prefer to define "adequate" as "at least as
|accurate as is currently possible",

makes the claim even more ludicrous.

>
>Koen responds:
>    [I] can think of several currently possible techniques, most of
>    them involving actual statistical methods, which would be
>    more accurate.
>    
>    Bottom line: I want you to stop making _any_ positive claims about the
>    relation between hits and users.
>
>I'd surely like to see a well-defined description, including some
>analysis, of these other possible techniques, and perhaps James
>Pitkow's paper (when it becomes available) will shed some light.

I have given references to several such descriptions in the past.

>But I'm not interested in continuing a debate of the form "I
>know a better way to do this, but I'm not going to provide a
>detailed argument about why it is better."

I don't like the implication that I am conducting this form of debate.

[....]
>We also observe that existing techniques (either cache-busting or
>full caching) can, and usually do, give much worse approximations
>than hit-metering would.  Koen has not argued this point.

Other existing techniques (either setting unique cookies or letting
users authenticate themselves) can, and usually do, give much better
approximations than hit-metering would.  What exactly do you think you
are accomplishing with this line of argument?

>So, in a last attempt to satisfy Koen, I'll rewrite this again
>to make it clear that we are talking about an approximation:
>
>   We recognize that, for many service operators, the single most
>   important aspect of the request stream is the number of distinct
>   users who have retrieved a particular entity. We believe that our
>   design provides adequate support for approximate user-counting,
>   within the constraints of what is feasible in the current Internet,
>   based on the following analysis.

Nope, that does not satisfy me.  

You are right that users = hits * X.  But measuring this X is the hard
part.  X will be different for every page and even the average X for a
site will be different depending on the type of site.  Your design
does not provide any support for measuring X.  It therefore does not
even come close to providing adequate support (however you define
support) for user counting.
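
To make the point about X concrete, here is a minimal sketch with
invented numbers (the page names and counts are purely hypothetical,
not measurements from any site).  It computes X per page from
users = hits * X, then shows how far off a page's user estimate is if
a single site-wide X is applied to its hit count:

```python
# Hypothetical per-page counts: (hits reported, distinct users).
# Every number here is made up for illustration only.
pages = {
    "/index.html": (1000, 400),
    "/news.html": (1000, 900),
    "/logo.gif": (1000, 50),
}


def users_per_hit(hits, users):
    """Return X in the relation users = hits * X for one page."""
    return users / hits


# X differs per page even though the hit counts are identical.
x_by_page = {page: users_per_hit(h, u) for page, (h, u) in pages.items()}

# Hit-weighted average of the per-page X values.
total_hits = sum(h for h, _ in pages.values())
site_average_x = sum(u for _, u in pages.values()) / total_hits


def estimate_users(hits, x):
    """Estimate users from a hit count and an assumed X."""
    return hits * x


# Error made when the site-wide X is applied to each page's hit count.
errors = {
    page: estimate_users(h, site_average_x) - u
    for page, (h, u) in pages.items()
}

for page in pages:
    print(page, "X =", x_by_page[page], "estimate error =", errors[page])
```

A hit count alone gives the same figure for all three pages; only the
separately obtained user counts reveal X, and those are exactly what
the design does not supply.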

>Beyond that, it would be pointless to continue subjecting the
>working group to this debate, so I won't.  If anyone wants to
>discuss this offline, I'm willing to continue it that way.

I have no particular desire to discuss this offline.  It is clear that
we are not going to agree on this point.

>On to some minor objections:
[...]

Thanks for resolving most of these minor objections.

>-Jeff

Koen.

Received on Sunday, 9 March 1997 13:33:49 UTC