From: Jeffrey Mogul <mogul@pa.dec.com>
Date: Tue, 04 Mar 97 17:57:57 PST
To: Koen Holtman <koen@win.tue.nl>
Cc: http-wg@cuckoo.hpl.hp.com
Catching up on my old email ...

Koen seems to have two major objections to the current draft of the hit-metering document:

(1) he believes that the introduction of hit-metering will decrease, not increase, the amount of caching that takes place. I.e., it will increase, not decrease, total traffic.

(2) he objects to "_any_ positive claims about the relation between hits and users".

Koen also has some minor objections, most of which I am happy to resolve.

I'll address these objections point-by-point, but first I want to make one thing clear: where the difference between our positions is a matter of opinion which cannot be resolved by existing data, a negative opinion about a proposed protocol specification (especially a fully optional extension) is not sufficient reason to kill the specification.

To give an example from outside the HTTP-WG: suppose I were to state an opinion that MD5 is a bad choice for the default authentication algorithm in IPv6, because it is too slow for some environments ... and suppose someone else stated that MD5 is the right choice for a default, because it is fast enough for almost all environments, and there is no widely available alternative. No matter how much we debate this, we could not resolve it by debate, because the hypothesis (that MD5 is "too slow" for "too many" applications) can only be tested by trying it.

Getting back to the case at hand: Koen says his "main efficiency worry" is:

    If many people, who do not use cache busting now, start using
    hit metering, then caches _outside_ of the metering subtree will
    have to make many more revalidation requests, which means more
    RTT overheads and a slower web.

The reality of this worry depends on several sub-hypotheses:

(1) The availability of hit-metering will induce a large number of sites that do not now do cache-busting to use hit-metering. [Otherwise, hit-metering could not increase the amount of cache-busting in any subtree, metering or otherwise.]

(2) Hit-metering will be deployed in proxies in a pattern that will lead, in many cases, to a metering subtree that does not extend to, or near, the last hop to the clients. [Otherwise, there won't be many "caches outside the metering subtree(s)".]

(3) If both #1 and #2 hold, any improvement in caching within the metering subtree (which would be due to the use of hit-metering by servers that would otherwise use cache-busting anyway) is not enough to offset the new cache misses outside the metering subtree. In order to quantify this, one has to consider a lot of parameters, such as the fanout of responses within and outside the metering subtree, the network delays (bandwidth, speed-of-light, and queueing) within and outside the metering subtree, the sizes of caches inside and outside the subtree, and assumptions about the use of other protocol features (such as persistent connections). [In other words, this is pretty hard to quantify using a model; see the sketch below.]

(4) If #1, #2, and #3 all hold, cache operators won't react to the increased revalidation traffic by dropping out of the metering subtree (i.e., by disabling hit-metering), ultimately pushing the burden of handling the revalidations back to the origin servers (which would provide an incentive for them to stop the use of hit-metering).

I will happily grant that if these four hypotheses are all true (and more than just slightly true), then hit-metering would be a bad thing for the network.
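To illustrate how slippery #3 is, here is a deliberately crude sketch of such a model (in Python, for lack of better notation). Every input value is invented for illustration, and a credible model would also need the fanout, cache-size, and queueing parameters listed above:

    # Crude cost model for hypothesis #3; all inputs are invented.
    # Positive output means hit-metering added delay overall;
    # negative means it saved some.

    def net_extra_delay(requests, frac_metered, hit_rate,
                        reval_rtt, origin_rtt):
        # Former cache hits outside the metering subtree now carry
        # a revalidation round trip:
        cost = requests * (1 - frac_metered) * hit_rate * reval_rtt
        # Requests inside the subtree that a cache-busting server
        # would have forced to the origin become local cache hits:
        saving = requests * frac_metered * hit_rate * origin_rtt
        return cost - saving

    for frac_metered in (0.2, 0.5, 0.8):
        print(frac_metered,
              net_extra_delay(10000, frac_metered, 0.4, 0.1, 0.3))

The sign of the answer flips with the guesses one feeds in, which is exactly the point.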
But it should also follow that Koen must admit that if some or all of these sub-hypotheses (especially #1 and #2) are false, then hit-metering would not cause performance problems, and (if they are false in a big way) then hit-metering would provide a definite improvement.

Further, these sub-hypotheses are all beliefs about how large numbers of humans (server operators and proxy operators) will react in a complex and evolving environment. It would be foolish of me to assert that I could prove or disprove any of them with the information we have available today, and I frankly don't expect anyone else to be able to prove or disprove them (based on current information).

So it basically comes down to making guesses about these hypotheses, about the worst-case, best-case, and likeliest scenarios, and (whether we act or not) taking a risk that we're making the wrong choice. While Koen writes "Not doing anything is sometimes the most logical course of action", I don't think he has made a strong case that "not doing anything" about cache-busting is actually our best bet.

On to Koen's other major complaint. Section 4 of the hit-metering draft is labelled "Analysis", and starts:

    We recognize that, for many service operators, the single most
    important aspect of the request stream is the number of distinct
    users who have retrieved a particular entity.  We believe that
    our design provides adequate support for user-counting, based on
    the following analysis.

After a complaint from Koen, I revised the second sentence so that this now reads:

    We recognize that, for many service operators, the single most
    important aspect of the request stream is the number of distinct
    users who have retrieved a particular entity.  We believe that
    our design provides adequate support for user-counting, within
    the constraints of what is feasible in the current Internet,
    based on the following analysis.

Note that this language is NOT a part of the specification per se, and is heavily qualified; we use phrases like "for many service operators", "we believe", "based on the following analysis". This is NOT a statement of fact, it's an opinion, and clearly labelled as such.

Koen responds:

    [I] can think of several currently possible techniques, most of
    them involving actual statistical methods, which would be more
    accurate.  Bottom line: I want you to stop making _any_ positive
    claims about the relation between hits and users.

I'd surely like to see a well-defined description, including some analysis, of these other possible techniques, and perhaps James Pitkow's paper (when it becomes available) will shed some light. But I'm not interested in continuing a debate of the form "I know a better way to do this, but I'm not going to provide a detailed argument about why it is better."

As for making "positive claims" about the relation between hits and users: it would be ludicrous to suggest that there is no correlation between the number of hits seen when using hit-metering and the number of distinct users, and so I am not even going to accuse Koen of suggesting that. It is debatable whether the correlation is exact (i.e., that hit-metering gives a user-count with 0% error); in fact, as we state later in the analysis, hit-metering gives an approximation, and we clearly state that "there are some circumstances under which this approximation can break down." In other words, we are clearly making a claim about the quality of the approximation, nothing more.
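For concreteness, here is a toy illustration (the request stream is invented) of both the correlation and the approximation error:

    # Toy example, with invented data, of why metered hit counts are
    # a useful but inexact proxy for distinct-user counts.
    requests = ["alice", "bob", "alice", "carol", "bob", "alice"]

    hits  = len(requests)        # what hit-metering would report: 6
    users = len(set(requests))   # what the operator really wants: 3

    print(hits, users)
    # Repeat visits inflate the count; a cache outside the metering
    # subtree, whose hits go unreported, would deflate it.  Hence a
    # claim about the quality of an approximation, nothing more.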
We also observe that existing techniques (either cache-busting or full caching) can, and usually do, give much worse approximations than hit-metering would. Koen has not argued this point.

So, in a last attempt to satisfy Koen, I'll rewrite this again to make it clear that we are talking about an approximation:

    We recognize that, for many service operators, the single most
    important aspect of the request stream is the number of distinct
    users who have retrieved a particular entity.  We believe that
    our design provides adequate support for approximate
    user-counting, within the constraints of what is feasible in the
    current Internet, based on the following analysis.

Beyond that, it would be pointless to continue subjecting the working group to this debate, so I won't. If anyone wants to discuss this offline, I'm willing to continue it that way.

On to some minor objections:

    Please rename the proposal `Simple Hit-Metering and
    Usage-Limiting for HTTP'

That makes sense; done.

    >As for section 8, "Interactions with varying resources": this simply
    >states the bare minimum necessary to make sensible use of the Vary
    >mechanism as it is currently defined in the HTTP/1.1 RFC.

    Section 8 is not minimalist, it maximises the info!  Smaller
    solutions, which still make sense to me, are:
     1) not counting for each combination of request headers, but
        for each entity tag only
     2) counting for each content-location only
     3) only one count for the entire resource

Our section 8 is minimal in the complexity it imposes on the proxy implementation; we made no claims about how much (or how little) information it provides to the origin server, although it manifestly provides no more information than cache-busting would.

It seems to me that counting hits per-entity-tag is pretty much equivalent to counting them per Vary-defined variant. After all, if the server is using strong entity tags, then there is a one-to-one mapping between variant responses and entity tags. And it seems like a bad idea to use the same weak entity tag for two different variants of the same resource, since this makes it impossible for a proxy cache to reliably do conditional GETs.

An HTTP/1.1 cache must either not cache a response with a Vary header (in which case hit-metering does not apply), or it must comply with Vary, which means keeping at least a minimal amount of per-entry data. It does not require keeping per-content-location data, so basing hit-metering on content-location seems like a burden on cache implementors.

Keeping one count for the entire resource would be easy to implement, but it would probably encourage servers providing variant responses to use cache-busting instead of hit-metering. Which gets back to the original argument about whether or not they would use cache-busting, and so "see above".

    Like Ted, I _am_ concerned that introducing a single sticky
    header will reduce the usable lifetime of HTTP/1.x as a protocol
    suite.  You can only add so many features before you drown in
    feature interactions.  Some of this is already visible in your
    section 5.5.

Well, there's one simple requirement in section 5.5 (which describes how a non-caching proxy can forward Meter headers in both directions without creating inaccuracy) that depends on this issue. And if the protocol did have a general-purpose sticky-header mechanism, it would also create interactions in section 5.5 (additional ones, in fact). But the stickiness of the Meter request header is, admittedly, a fairly small optimization.
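To put that optimization in concrete terms, a minimal sketch, assuming (as the 8-bytes-per-request figure in the next paragraph implies) that the simplest Meter request header is a bare "Meter:" line, sent on every request when per-request but only on a connection's first request when sticky:

    # Minimal sketch of what stickiness saves on a persistent
    # connection.  The bare "Meter:" line (8 bytes with CRLF) is an
    # assumption; the other figures are the ones discussed below.

    METER_BYTES   = len("Meter:\r\n")   # 8 bytes per transmission
    MEAN_REQ_HDR  = 305.5               # bytes, from my proxy trace
    REQS_PER_CONN = 9.2                 # from Koen's 1996 analysis

    per_request = METER_BYTES / MEAN_REQ_HDR                  # ~2.6%
    sticky = METER_BYTES / (MEAN_REQ_HDR * REQS_PER_CONN)     # ~0.3%

    print("per-request overhead: %.2f%%" % (100 * per_request))
    print("sticky overhead:      %.2f%%" % (100 * sticky))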
The statistics from your 1996 message show a mean request header size of 200 bytes (if I interpret that message correctly). I looked at a recent proxy trace with several hundred thousand references, and found a mean request header size of 305.5 bytes. Your analysis also suggested that, using persistent connections, the mean client-server connection would carry 9.2 requests. It's reasonable to assume that a proxy-to-server persistent connection, which might be reused before it closes, would carry at least that many.

Making the Meter header per-request (rather than per-connection) would, in the simplest (and hopefully most common) case, involve sending 8 bytes per request. This is a 4% overhead using your 200-byte mean, or a 2.6% overhead using the mean that I measured. If the Meter request header is sticky, and assuming 9.2 requests per connection, then the overheads would drop to 0.43% or 0.29%, depending on which mean header size one uses. Whether or not the Meter header is sticky, the "Connection: Meter" header is per-connection, so that's a constant in this analysis; about 2 bytes per mean request, or a 1% overhead.

A 4% "extra" overhead is probably just below the threshold where we should begin to be concerned. This is a dangerous argument to make, though, if we make it 10 times independently. It might be OK to drop the sticky-header mechanism from the hit-metering draft, but I would caution against using this analysis to justify not adopting a general sticky-header mechanism for HTTP.

Anyway, I'll drop the sticky-header stuff from the hit-metering proposal IF nobody objects to the extra overhead in the requests, and if it leads to at least one of the current critics turning into a supporter.

Regarding prefetching, Koen wrote:

    I believe your system will always count a page _pre_fetch done
    by a proxy as a hit, no matter whether the page is actually seen
    by a user later.  You need to fix this.

I responded:

    [The] entire issue of prefetching in HTTP needs some more
    thought.  [...]  The hit-metering proposal might need to be
    fixed, but the general problem is broader.

to which Koen wrote:

    I am aware that there is a broad problem, but you need to fix
    this sub-problem nevertheless.  Just define some flag that the
    cache can include if it does a prefetch.  I'm aware that this
    does not solve the problem of prefetching outside of the
    metering subtree.

After giving this "more thought" (but not nearly enough), I think we probably *will* have to define some way to flag a request as a prefetch ... but I do NOT think that this flag should be specified in the hit-metering proposal. It would be a bad situation to end up with several different ways of flagging prefetches. I will add something to the hit-metering proposal to say that once a prefetch-flag is defined, a proxy that both uses hit-metering and generates prefetches MUST use that flag.

I'll address TCN in a separate message, since this isn't an issue where there is a real difference of opinion.

-Jeff
Received on Tuesday, 4 March 1997 18:12:55 UTC