- From: Jeffrey Mogul <mogul@pa.dec.com>
- Date: Tue, 16 Apr 96 15:55:09 MDT
- To: koen@win.tue.nl (Koen Holtman)
- Cc: http-caching@pa.dec.com
Since Jim wants me to nail down as much as possible of the caching stuff TODAY, and I need to have something concrete about variant-IDs to do this, I'm going to follow the plan described in this message. This is not subject to debate today. Once Jim issues his draft, then anyone who wants to can reopen the discussion. I'm happy to accept suggestions for minor corrections, but I don't have time to get drawn into philosophical arguments.

First, Koen started this thread with the following statement:

    I have argued before that the requirement

        Varying resources (even those that are transparently negotiated) MUST send responses which include variant-IDs.

    which is present in both Jeff's If-Valid/If-Invalid/Cval text and Roy's competing If-EID/Unless-EID/EID text must be dropped, because this requirement means unnecessary trouble for transparent content negotiation.

I cannot find this "requirement" in my drafts, and cannot recall having made it. I've always considered variant-IDs optional for the server, since they are meant solely for improving performance.

Roy (or perhaps someone before him) defined two basic kinds of negotiation:

    Preemptive negotiation, in which the client expresses preferences in a request on a resource, and the origin server chooses the most appropriate entity and returns that.

    Reactive negotiation, which is signalled somehow by the client [draft-ietf-http-v11-spec-01.txt says by an empty Accept header, but this may have changed]. The origin server in this case returns a description of the choices, and then the client chooses one and makes a specific request for that entity.

We can consider these as two different ways to select the right variant: "origin server selects" and "end-user client selects". Or, in another terminology, we could say that in preemptive negotiation, the server is the "selecting participant" and the client is a "non-selecting participant"; in reactive negotiation, the client is the "selecting participant".
In a world without caches or proxies, that's it.

Now we introduce *non-caching* proxies. I'll start by thinking of a model with a single proxy (i.e., not local to the client or origin server). As far as I can tell, there are two ways that this proxy can participate in the selection process:

    selection-transparent: the proxy does not participate in the selection process; it simply ships requests and responses back and forth between client and origin server.

    selecting-participant: the proxy uses reactive negotiation with the origin server, and preemptive negotiation with its client. That is, the selection point is moved part-way from the origin server to the end-user client.

One might be tempted to apply some sort of symmetry argument that says that a proxy could engage in preemptive negotiation with the origin server and reactive negotiation with the client, but I think this is a false symmetry: the proxy in this case would not necessarily have enough information about the resource to allow the end-user client to do reactive negotiation.

In other words, the point at which the selection is done (origin server, proxy, or client) must have full information about the available choices. My belief is that Koen intends this information to be conveyed by the Alternates header. I'm basing this belief on the draft-holtman-http-negotiation-00.txt document; I'm not sure if any of his more recent messages have changed this.

Now we introduce caching. Because caches store copies of specific entity instances, not of resources, a cache needs to use a specific entity identifier as a cache key; a URI for a varying resource is not a sufficient cache key.

We could construct an entity (variant) identifier in one of several ways:

    (1) a URI that is bound to a specific entity (variant).

    (2) a URI bound to the varying resource, plus a set of selection criteria that is guaranteed to completely determine the variant.

    (3) a URI bound to the varying resource, plus an opaque variant ID.
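
As a rough illustration only (nothing like this appears in any of the drafts), the three identifier shapes listed above could be modelled as cache keys along these lines. The class names, the `criteria_key` helper, and the example URI are all invented for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Tuple, Union


@dataclass(frozen=True)
class VariantUriKey:
    """Option (1): a URI bound to one specific variant."""
    variant_uri: str


@dataclass(frozen=True)
class CriteriaKey:
    """Option (2): resource URI plus selection criteria that fully determine the variant."""
    resource_uri: str
    criteria: Tuple[Tuple[str, str], ...]


@dataclass(frozen=True)
class VariantIdKey:
    """Option (3): resource URI plus an opaque variant-ID assigned by the origin server."""
    resource_uri: str
    variant_id: str


CacheKey = Union[VariantUriKey, CriteriaKey, VariantIdKey]


def criteria_key(resource_uri: str, criteria: Mapping[str, str]) -> CriteriaKey:
    # Sort the criteria so that the same set always yields the same key
    # (an assumed canonical form, chosen only for this sketch).
    return CriteriaKey(resource_uri, tuple(sorted(criteria.items())))


# Hypothetical cache: two variants of one resource coexist because the key
# is more specific than the resource URI alone.
cache: Dict[CacheKey, bytes] = {}
uri = "http://example.com/doc"
cache[VariantIdKey(uri, "en")] = b"English body"
cache[VariantIdKey(uri, "fr")] = b"French body"
```

Keying on the bare resource URI would have let the second store overwrite the first, which is the reason a URI for a varying resource is not a sufficient cache key.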
#1 is conveyed by the Alternates header, apparently, so it is available to (and usable by) the selecting participant. However, a client doing preemptive negotiation does not have the specific URI, so a cache located at such a client has to use some other means of identifying the variant (i.e., either #2 or #3).

#2 is, as far as I can tell, what Koen was trying to propose with his "structured variant IDs." There seems to be a generally negative reaction to this; people don't want to insist that proxy caches understand enough of the selection algorithm to make this the main mechanism in HTTP/1.1. (I think most people are willing to believe that something like this could be made optional.) There's also the problem that if the origin server uses a selection criterion that cannot be expressed using the Alternates header (e.g., "the user's birthdate is a prime number"), then this simply doesn't work.

#3 seems to work pretty well. In this approach, the origin server MAY (not MUST) provide a variant-ID with any entity-instance that it returns in a response. If a cache receives a variant-ID, it can do two things with it:

    (1) It can use it to replace an existing cache entry for the same variant. That is, it forms a cache key using the URI of the request and the variant-ID of the response. If this key matches the key of an existing cache entry, it can replace the existing entry with the new response (subject to all of the other rules on caching).

    (2) It can use it, together with a cache validator, in a conditional request to inform the server that it already has the associated entity-instance in its cache. The triple (URI, variant-ID, cache-validator) forms an identifier for a specific entity-instance (or set of instances, if the validator is weak). This allows the server to return 304 (Not Modified) knowing that the cache will understand which variant is being referred to.

Note that this mechanism is entirely orthogonal to the selection process.
That is, the variant-selection process does not use the variant-ID information; it is only used after the selecting participant decides on the appropriate variant, and then needs to know whether the entity body should be transferred or not.

Since the cache in this case may not know which variant is going to be selected, it should send all of its (variant-ID, validator) pairs for the resource, and let the selecting participant choose the right one. This is what the variant-set mechanism is used for. (Note that this adds some interesting flexibility to the variant selection algorithm; the selecting participant knows what is in the cache, and if there is no overriding selection criterion, it might "choose" a variant that is already in the requestor's cache, rather than an equally useful one that is not in that cache.)

If the origin server does not want to provide variant-IDs, it does not have to. However, in this case it becomes extremely hard (maybe impossible) for a non-selecting participant to do conditional retrievals, because it can't tell the selecting participant the precise criteria that led to the creation of the cache entry.

For the time being (i.e., in HTTP/1.1), only the origin server can assign variant-IDs, because otherwise we have no way to prevent two selecting-participant caches from assigning the same variant-ID to two different variants of a resource. (I think this was another purpose of Koen's structured variant-IDs; if the caches have full knowledge of the selection criteria, they could assign non-conflicting variant IDs by a canonical representation of the criteria used.)

However, Koen might like this compromise: we allow (but do not require) the origin server to embed variant-identification information in the opaque validator itself. (We do NOT allow the cache to look at this embedded information!) Such a validator is marked with the suffix "/S" (for "Selecting").
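
To make the variant-set exchange described above a little more concrete, here is a minimal sketch of how a selecting participant might handle a conditional request that carries the cache's (variant-ID, validator) pairs. It is an illustration under stated assumptions, not proposed protocol text: the `VARIANTS` table, the language-based `select_variant` rule, and the `respond` function are all hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical origin-server state: variant-ID -> (current validator, body).
VARIANTS: Dict[str, Tuple[str, bytes]] = {
    "en": ("v3", b"Hello"),
    "fr": ("v7", b"Bonjour"),
}


def select_variant(preferred_language: str) -> str:
    # Stand-in for the real selection algorithm. It does not consult the
    # variant-IDs held by the cache: selection is orthogonal to them.
    return preferred_language if preferred_language in VARIANTS else "en"


def respond(
    preferred_language: str,
    cached_pairs: Set[Tuple[str, str]],
) -> Tuple[int, str, Optional[bytes]]:
    """Return (status, variant-ID, body) for a conditional request."""
    variant_id = select_variant(preferred_language)
    validator, body = VARIANTS[variant_id]
    if (variant_id, validator) in cached_pairs:
        # The cache already holds this exact entity-instance, so no body
        # is sent; the variant-ID tells the cache which entry is meant.
        return 304, variant_id, None
    return 200, variant_id, body
```

For example, `respond("fr", {("en", "v3"), ("fr", "v7")})` yields a 304 for the "fr" entry, while `respond("fr", {("fr", "v6")})` yields a 200 with the full body because the cached validator is stale.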
With the "/S" compromise, if a cache has received a variant with an opaque validator but without a variant-ID, it can still perform a conditional retrieval on the resource. However, the origin server will only provide a 304 (Not Modified) response if it is using this kind of opaque validator; otherwise, it must treat the request as unconditional. And an intermediate cache can respond to this kind of conditional request (one without a variant-ID) if it has this kind of "Selecting" validator, and if it exactly matches the validator of one of the cache's entries for the resource.

Stay tuned for a draft from Jim (once he gets one from me ...)

-Jeff
Received on Tuesday, 16 April 1996 23:13:13 UTC