Re: New document on "Simple hit-metering for HTTP"

Erik Aronesty:
>
>Professionals (IE: Pathfinder) no longer report things like "10K hits
>per day" to clients who pay well.  

I'm confused.  Do you mean that you no longer report hits at all, or
that you no longer report only hits?

>They say "we have a large
>international audience" or "we get 40% of our hits from browsers which
>support Java".
>
>Information such as "User Agent" and the clients ip address (for
>demographics) are crucial to the log reporting in the sites I have
>worked on (albeit only 6 sites). 

This is very interesting...  I wrote earlier that we need to
distinguish between two kinds of demographic data:

1) Hit counts

2) User's Referer field, IP address, User-Agent field, ...

The proposed hit counting mechanism allows you to get 1) for all user
agents without cache busting, but not 2).  You seem to predict that
most advertising sites will want to have 2) in future.  That would
make the proposed hit counting mechanism pretty ineffective at
reducing cache busting.

On the other hand, if you gather 2) without cache busting now, and do
an extrapolation pass on the results by guessing the amount of hits
hidden by certain proxies, then the hit counting data would allow more
accurate extrapolations.  (In such an extrapolation pass, you would
assume that, as far as the headers are concerned, the requests relayed
by a proxy can be treated as a random sample of all requests made
behind the proxy.)
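
To make the extrapolation pass concrete, here is a rough sketch of what I
have in mind for a single proxy.  It is only an illustration: the function
name, its inputs, and the sample User-Agent strings are made up for this
example, and it assumes the proxy reports one number, the count of hits it
served from its cache.

```python
from collections import Counter


def extrapolate_by_user_agent(observed_agents, hidden_hits):
    """Estimate per-User-Agent hit totals behind one proxy.

    observed_agents: User-Agent values of the requests the proxy relayed
                     to the origin server (the "random sample").
    hidden_hits:     number of cache hits the proxy reported through hit
                     metering (hypothetical input for this sketch).
    """
    observed = Counter(observed_agents)
    sample_size = sum(observed.values())
    if sample_size == 0:
        # No relayed requests means no header sample to extrapolate from.
        return {}
    total_hits = sample_size + hidden_hits
    # Scale every observed count by the same factor, which is exactly the
    # random-sample assumption stated above.
    scale = total_hits / sample_size
    return {agent: count * scale for agent, count in observed.items()}


# 4 relayed requests plus 16 reported cache hits gives 20 hits in total.
estimates = extrapolate_by_user_agent(
    ["Mozilla/2.0", "Mozilla/2.0", "Mozilla/2.0", "Lynx/2.5"], 16
)
print(estimates)
```

Without the hit count, the scale factor itself would have to be guessed per
proxy; with it, only the random-sample assumption remains.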

So my main question is: do you use cache busting to gather the 2)
statistics, and would you stop using it if the hit count proposal is
implemented?

If not, then the hit counting proposal won't reduce cache busting
much, and we would be better off with a headers-summary mechanism like
the one you propose:

>Perhaps the hit-metering process should allow a proxy to forward some
>sort of a headers-only-summary during a period of relative inactivity. 
>The server should not care how long it has been since the proxy has last
>sent its summary.   The "Expires" header can then still be used to
>accurately reflect the duration of the validity of the document.

Koen.

Received on Friday, 9 August 1996 16:26:22 UTC