- From: Koen Holtman <koen@win.tue.nl>
- Date: Fri, 17 May 1996 01:20:52 +0200 (MET DST)
- To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
- Cc: Koen Holtman <koen@win.tue.nl>
The current 03 draft has the following model of how a generic resource
`contains' its multiple representations:

   A generic resource (one subject to content negotiation) may be
   bound to more than one entity. Each of these entities is called a
   "variant" of the resource.

(quoted from Section 16.5)

This model necessitated the introduction of the `resource entity'
concept:

   resource entity
      A specific representation, rendition, encoding, or presentation
      of a network data object or service, either a plain resource or
      a specific member of a generic resource. A resource entity might
      be identified by a URI, or by the combination of a URI and a
      variant-ID, or by the combination of a URI and some other
      mechanism. An plain resource MUST be bound to a single resource
      entity at any instant in time.

We needed resource entities because _the resource entity is the unit
of caching, expiration, and revalidation_. `Cache slots' are assigned
to resource entities. The sequence of responses from a plain resource
has the same caching rules associated with it as do the sequences of
responses from the different variants of a generic resource.

---------

In Paris, the editorial group discussed possible ways to get rid of
this `resource entity' concept in order to simplify the draft. Below,
I will outline a way of getting rid of it that does not require any
changes to the mechanisms defined in the spec, only to the language
used to define these mechanisms. This model, by the way, closely
resembles the model in the content negotiation draft.

Core of the model
-----------------

The basic underlying idea is that a resource, if it binds to entities,
can bind to only one entity at a time. More specifically:

 - a plain resource, if it can generate 200 responses, binds to
   exactly one entity at every point in time.

 - a generic resource binds to no entities at all. Instead, it binds
   to multiple plain resources, which in turn bind to entities.

The picture below is an example.
Here, each arrow represents a `binds to' relation:

                      -----> plain resource ---------> entity 1
                     /
                    /
 generic resource -------> plain resource ---------> entity 2
 http://x.org/paper\
                    \
                      -----> plain resource ---------> entity 3

Note that only the generic resource above is identified by a URI; the
plain resources in the picture above do not have their own URIs. The
1.1 draft says:

   resource
      A network data object or service that can be identified by a URI
                                            ^^^
      (section 7.2). At any point in time, a resource may be either a
      plain resource, which corresponds to only one possible
      representation, or a generic resource.

so there can exist resources which are _not_ uniquely identified by a
URI. This is the loophole which allows the model to work.

In this model, a generic resource is a `portal' through which variant
resources are accessed. When a generic resource is accessed, the model
of `what happens' is as follows:

 1) a request on the generic resource is received

 2) using the request, the server chooses one of the (plain) variant
    resources bound to the generic resource

 3) the server internally redirects the request to the chosen variant
    resource, i.e. it generates a response message as if the request
    was done directly on the variant resource

 4) the server _may_ add a variant-ID, which identifies the chosen
    variant resource, to the response message from step 3)

 5) the response message is sent to the client.

If a cache receives a request on a generic resource, it will have to
either reproduce the five steps above, in particular step 2, or
forward the request (possibly with an If-NoMatch header) towards the
origin server.

The important thing to note here is that the variant resources are
plain resources, so in this model, _the plain resource is the unit of
caching, expiration, and revalidation_. This eliminates the need to
talk about resource entities, which are the unit of caching,
expiration, and revalidation in the 03 draft.

So how do variant-IDs fit in?
-----------------------------

(Note that the semantics for variant-IDs described below are
_identical_ to those defined now in the 03 draft. Only the words of
the description differ.)

Though variant resources are not identified uniquely by a URI, the
service author _may_ use variant-IDs to give each of the variant
resources a unique identifier, being the tuple
(request-URI,variant-ID):

                      -----> plain resource ---------> entity 1
                     /       (http://x.org/paper,"en")
                    /
 generic resource -------> plain resource ---------> entity 2
 http://x.org/paper\       (http://x.org/paper,"fr")
                    \
                      -----> plain resource ---------> entity 3
                             (http://x.org/paper,"ps.en")

This unique identification of variant resources has two advantages:

 - it allows the use of the If-NoMatch header by a cache to optimize
   access to the generic resource

 - it allows cache memory management to be more efficient

Note that variant-IDs are thus only an efficiency device; they are not
needed for correctness. But caches themselves are also nothing more
than efficiency devices, so this is nothing new.

A lazy server may choose not to generate variant-IDs, in which case
there is only a many-to-one mapping from request header sequences to
variant resources:

                      -----> plain resource ---------> entity 1
                     /       (http://x.org/paper,req-headers-xyz)
                    /        (http://x.org/paper,req-headers-pyz)
                   /
 generic resource -------> plain resource ---------> entity 2
 http://x.org/paper\       (http://x.org/paper,req-headers-pqr)
                    \
                      -----> plain resource ---------> entity 3
                             (http://x.org/paper,req-headers-abz)
                             (http://x.org/paper,req-headers-pbz)

Finally, there could be variant-IDs for only _some_ of the variant
resources:

                      -----> plain resource ---------> entity 1
                     /       (http://x.org/paper,"en")
                    /
 generic resource -------> plain resource ---------> entity 2
 http://x.org/paper\       (http://x.org/paper,req-headers-pqr)
                    \
                      -----> plain resource ---------> entity 3
                             (http://x.org/paper,req-headers-abz)
                             (http://x.org/paper,req-headers-pbz)

So how do Content-Location headers fit in?
------------------------------------------

If a response from a generic resource contains a Content-Location
header, this can be seen as a statement by the author of the generic
resource that the chosen variant resource has a URI that uniquely
identifies it. For example, the response

   HTTP/1.1 200 OK
   ETag: "3420";"en"
   Content-Location: paper.en.html
   Content-Language: en
   ....

evokes the following image:

                      -----> plain resource ---------> entity 1
                     /       (http://x.org/paper,"en")
                    /        http://x.org/paper.en.html
 generic resource
 http://x.org/paper

But note that, at least under plain 1.1, the cache is _not_ allowed to
just serve entity 1 (if still fresh) if a request on
http://x.org/paper.en.html is made, because this would allow spoofing.
The Content-Location header has to be treated as purely informational.

It is intended that the http-wg will discuss, after the 1.1 draft is
out, appropriate restrictions under which a cache _can_ serve entity 1
if a request on http://x.org/paper.en.html is made.

Some random remarks
-------------------

 - Whether a resource is generic or plain is a binary property. A
   resource may change from being generic to plain, and the other way
   around, at any point in time. All variant resources bound to a
   generic resource must be plain.

 - Renaming `generic resources' to `negotiated resources' is
   considered to be a good idea by some.

 - Renaming `entity tags' to `entity identifiers' is considered to be
   a good idea by some.

If people like this model, I am willing to draft language for the 04
spec.

Koen.
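
As an illustration only, the five-step access model and the
(request-URI,variant-ID) identification described above could be
sketched in Python roughly as follows. Everything named here is an
assumption made for the sketch and appears in no draft: the classes
PlainResource and GenericResource, the selection function
choose_by_language, the helper cache_key, and the "variant_id" field on
the response. Selecting on Accept-Language is just one possible
selection rule.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class PlainResource:
    """A plain resource: bound to exactly one entity at any point in time."""
    entity: bytes
    variant_id: Optional[str] = None  # optional; only an efficiency device


@dataclass
class GenericResource:
    """A generic resource: bound to plain variant resources, never to entities."""
    uri: str
    variants: List[PlainResource]
    # Hypothetical selection function (step 2 of the model).
    choose: Callable[[Dict[str, str], List[PlainResource]], PlainResource]

    def handle(self, request_headers: Dict[str, str]) -> dict:
        # Step 1: a request on the generic resource is received.
        # Step 2: choose one of the bound variant resources.
        variant = self.choose(request_headers, self.variants)
        # Step 3: answer as if the request was made on the variant itself.
        response = {"status": 200, "body": variant.entity}
        # Step 4: the server may add a variant-ID to the response.
        if variant.variant_id is not None:
            response["variant_id"] = variant.variant_id
        # Step 5: the response is sent to the client.
        return response


def choose_by_language(headers: Dict[str, str],
                       variants: List[PlainResource]) -> PlainResource:
    """Assumed rule: match Accept-Language against the variant-ID."""
    wanted = headers.get("Accept-Language")
    for variant in variants:
        if variant.variant_id == wanted:
            return variant
    return variants[0]  # fall back to the first variant


def cache_key(uri: str, request_headers: Dict[str, str],
              response: dict) -> Tuple:
    """Key under which a cache could store the chosen plain resource."""
    if "variant_id" in response:
        # Unique identification: the tuple (request-URI, variant-ID).
        return (uri, response["variant_id"])
    # Lazy server: only the request headers distinguish the variants,
    # so several keys may map to the same variant resource.
    return (uri, tuple(sorted(request_headers.items())))


paper = GenericResource(
    uri="http://x.org/paper",
    variants=[PlainResource(b"Hello", "en"), PlainResource(b"Bonjour", "fr")],
    choose=choose_by_language,
)
request = {"Accept-Language": "fr"}
reply = paper.handle(request)
key = cache_key(paper.uri, request, reply)
```

In this sketch the cache slot belongs to the chosen plain resource, not
to a separate `resource entity'; without a variant-ID the cache still
works, it merely stores the same variant under several header-based
keys.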
Received on Thursday, 16 May 1996 16:24:28 UTC