- From: Koen Holtman <koen@win.tue.nl>
- Date: Sun, 16 Feb 1997 20:52:10 +0100 (MET)
- To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
- Cc: Koen Holtman <koen@win.tue.nl>
I just read the new hit metering draft, and spent some time reading through the archived discussions on previous metering drafts.

I like the new material that was added in this draft. However, I note that the basic mechanism has not changed, nor have the claims made about it. So the problems I had with the previous draft still exist.

On a micro level:

I listed some (fixable) technical problems with the previous draft in

  http://www.ics.uci.edu/pub/ietf/http/hypermail/1996q4/0294.html

As far as I can see, most of these problems have not been fixed in the new draft. Did the message above somehow drop out of the author's editorial queue?

On a macro level:

1) Section 4 says:

  `We believe that our design provides adequate support for user-counting, based on the following analysis.'

I do not think it does (for a longer discussion, see the article linked above), and as long as this claim stays in, I won't support the draft. Many people want web metrics better than what we have now, but this draft does not provide such metrics. Vendors say they get pressure for better metrics from their customers; I don't think implementing this draft will make the pressure go away.

2) I feel that there is too much unnecessary cruft in the draft. The usage limiting stuff should be removed, and the special rules for varying resources should probably also be removed. The stickiness and header compression (header abbreviation) should largely be cut -- this stuff just generates a lot of code, and the efficiency savings in no way compensate for the efficiency loss due to the extra requests and the cache busting outside of the metering subtree. We'll have stickiness and header compression as a general mechanism in HTTP/2.0, or http-ng, or whatever. I see no reason to introduce this stuff for some specialised header beforehand.

3) To quote Roy Fielding:

>The other harm I mentioned is the implicit suggestion that "hit-metering"
>should be sanctioned by the IETF. It should not. Hit metering is a way for
>people who don't understand statistical sampling to bog down all requests
>instead of just those few requests needed to get a representative sample.
>Whether or not some ISP customers want it does not change the fact that
>it is damaging to the community as a whole, and it's a lot better to inform
>people on how not to be a "scum sucking pig" than it is to have a proposed
>standard on slightly-less piggish ways to be a pig.

I feel that the IETF should not sanction this form of hit metering (by making it a proposed standard) _unless_ it can be shown that not doing so will lead to an internet meltdown. I don't think this has been shown, and I think that the evidence so far is actually to the contrary.

I read through a lot of discussions about this in the archives. To summarise:

- for this discussion, cache busting means making the user agent do a conditional GET every time, after which the server usually sends a 304 (not modified).

- estimates of cache busting levels range from ~30% down to 0.0001% (this also depends on whether you count unintentional cache busting)

- other reasons for cache busting include
  - stupidity / laziness / inertia (CGI scripts and server side includes both lead to cache busting in the default case)
  - working around broken browser features
  - sites which require statefulness/authentication
  - showing a different ad each time
  - gathering hit count demographics
  - gathering demographics better than just hits (though the draft does have a mechanism for gathering more than hits, the cache efficiency of this mechanism is not much better than using plain cache busting in my assessment)

- it is unknown how large a fraction of cache busting is done only to get hit counts.

- it is unknown whether the people doing cache busting to get hit count demographics now can be educated to use friendlier statistical methods (the authors of the draft seem to assume that not many can be)

- if the draft is adopted, some people who do cache busting now will switch to the hit counting methods in the draft. However, others who don't count anything now may start using the draft, and this leads to _more_ cache busting outside of the metering subtree.

Due to this last point, we don't even know if the overall effect of implementing the draft will be good or bad!

Also, adopting the draft may slow the introduction of a better demographics system later. A better system does not necessarily have to be based on HTTP extensions either. However, I have no high hopes of a better system happening very soon, or at all. The social issues that need to be resolved are even more complex than the technical issues.

In summary: I don't support this draft going to proposed standard. I _might_ support it as an experimental RFC if 1) and 2) above are resolved.

Koen.
Received on Sunday, 16 February 1997 12:01:53 UTC