RE: Deprecate Content-Location? (was RE: "Variant" language in Content-Location (Issue 109))

Julian Reschke wrote:
> Brian Smith wrote:
> > Julian Reschke wrote:
> >> So if I had to rewrite this I would make it just:
> >>
> >>   The Content-Location entity-header field may be used 
> >>   to supply the resource location for the entity
> >>   enclosed in the message. 
> >>   In the case where a resource has multiple entities 
> >>   associated with it, and those entities actually have
> >>   separate locations by which they might be individually
> >>   accessed, the server SHOULD provide a Content-Location
> >>   for the particular variant which is returned.

> > 1. The message might not contain an entity (e.g. if the 
> > request is a HEAD or if the response code is 204 No Content),
> > so "entity enclosed in the message" is not really correct.
> That's true (and the same text is in 2616). Care to propose 
> something else?

If a response contains an entity with a Content-Location header, then the
Content-Location identifies a resource, and the response entity is a
representation of that resource. In responses to GET and HEAD requests the
Content-Location identifies the selected variant. 

> > 2. An entity is not the same thing as a variant, so you cannot 
> > substitute "entity" for "variant". A variant is identified by 
> > (Resource-URI, Vary) or (Resource-URI, Content-Location) 
> > and an entity is identified by (Resource-URI, ETag) or
> > (Resource-URI, Last-Modified, Content-Location). In other
> > words, there is a one-to-many relationship between variants
> > and entities (see the section that talks about 
> > invalidating cache entries based on Content-Location).
> The current plan for Issue 109 is to get rid of the term 
> "variant", and to make "representation" and "entity" 
> equivalent 
> (<>). There's a
> *separate* issue for "requested variant", which is issue 69 
> (<>).

I know. What I am trying to say is that plan doesn't work because "entity"
and "variant" mean two different things. I think "entity" could replace
"representation" but it cannot replace "variant." The language that
describes cache invalidation based on Content-Location makes it clear that
Content-Location isn't supposed to identify a specific entity. Otherwise,
Content-Location would be functionally equivalent to ETag. If you think
about it in VCS terms, then a variant is a branch of a resource, and an
entity is a particular version of a variant. At any particular time, there
can be many variants of a resource, and each variant has one entity that
describes it at that point in time. Over time, a variant will be described
by multiple entities as the variant changes.
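
To make the analogy concrete, here is a rough Python sketch of that model.
The class names, URIs, and entity tags are made up for illustration and do
not come from RFC 2616; it only shows that a variant (keyed by
Content-Location) outlives the individual entities (keyed by ETag) that
describe it.

```python
from dataclasses import dataclass, field


@dataclass
class Variant:
    """One 'branch' of a resource, named here by its Content-Location."""
    content_location: str
    etags: list = field(default_factory=list)  # entity history, oldest first

    def publish(self, etag):
        # A new entity (a new 'version') now describes this variant.
        self.etags.append(etag)

    @property
    def current_entity(self):
        return self.etags[-1]


@dataclass
class Resource:
    """A resource with any number of variants (hypothetical model)."""
    uri: str
    variants: dict = field(default_factory=dict)

    def variant(self, content_location):
        # Look up the variant, creating it on first use.
        return self.variants.setdefault(
            content_location, Variant(content_location))


doc = Resource("/doc")
doc.variant("/doc.en.html").publish('"en-1"')
doc.variant("/doc.fr.html").publish('"fr-1"')
doc.variant("/doc.en.html").publish('"en-2"')  # the English variant changes

# Still two variants, each described by exactly one entity right now.
current = {loc: v.current_entity for loc, v in doc.variants.items()}
```

After the third call there are still two variants, but the English one has
been described by two entities over time.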

I don't really see the motivation for i109. If the specification is going to
become much clearer by such a change, then I am all for it. But, if "entity"
is going to lose its precise meaning or be overloaded with multiple meanings
then how is that an improvement?

> > 3. The RFC 2616 language says that Content-Location SHOULD 
> >    be provided in every case. Your change only says
> >    Content-Location should be required in the special case
> >    that begins "especially" in RFC 2616. So, it doesn't mean
> >    the same thing.
> Yes, that's intentional and reflects what RFC 2068 said. I 
> think the change in RFC 2616 was incorrect (overzealous 
> conversion to BCP 14 keywords).
> So do you think that servers SHOULD return a Content-Location 
> even when only one entity is associated?

IF Content-Location is going to live on then I agree that the RFC 2068-like
definition makes more sense. Basically, if the response contains a Vary
header, then the entity should have a Content-Location header.

> > 4. There are important cases where the server SHOULD NOT 
> > include the Content-Location header. In reality, a server
> > can't safely include the Content-Location header unless it
> > knows that all relative URI references in the entity
> > representing the variant will resolve to the same URIs
> > that they would resolve to if the Content-Location was not 
> > provided (because almost all clients ignore Content-Location when 
> > resolving relative URIs.) Basically, it is only safe to send the 
> > Content-Location header in limited circumstances. This is 
> > contradictory to the "server SHOULD provide a 
> > Content-Location" advice.
> That sounds like a separate and new issue.

I am not sure that Content-Location can even be defined in a useful way due
to this problem. The reason I bring it up here is that "Servers SHOULD NOT
include a Content-Location header in the response entity" reflects the lack
of support for Content-Location in most clients, but that is the opposite of
your suggested change.
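
Here is a small Python sketch of the relative-URI problem from point 4. The
URIs are invented for illustration, and safe_to_send() is a hypothetical
helper, not something any server actually implements; it only checks
whether a client that ignores Content-Location and one that uses it as the
base URI would end up at the same place.

```python
from urllib.parse import urljoin

# Hypothetical request and response values.
request_uri = "http://example.com/docs/report"
content_location = "http://example.com/docs/variants/en/report.html"
relative_ref = "images/chart.png"

# A client that ignores Content-Location resolves against the request URI.
ignoring = urljoin(request_uri, relative_ref)

# A client that honours Content-Location uses it as the base URI instead.
honouring = urljoin(content_location, relative_ref)


def safe_to_send(request_uri, content_location, refs):
    """True if every reference resolves identically under both bases."""
    return all(
        urljoin(request_uri, ref) == urljoin(content_location, ref)
        for ref in refs
    )
```

With the values above the two clients disagree about where the image
lives, so safe_to_send() returns False for that reference. It returns True
when every reference is an absolute path such as "/images/chart.png", or
when the Content-Location sits in the same directory as the request URI.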

> > 5. What does "might be individually accessed" mean? As far as cache 
> > validation is concerned (which is the only time Content-Location is 
> > used in the protocol itself), the Content-Location doesn't
> > have to be accessible.
> > That whole condition can be removed since it is meaningless.
> What's the point in supplying a Content-Location, if nobody 
> can access it?

Caches use Content-Location to identify the variant (not the entity; that is
what ETag does) so that they can invalidate multiple cache entries at once.
That is the only use of Content-Location in the protocol itself. Since the
cache doesn't need to issue a request to invalidate those entries, the
resource at Content-Location doesn't have to be accessible. The only other
use of Content-Location in HTTP is to resolve relative URIs in the response
entity, and the resource doesn't have to be "accessible" for that either.
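
As a rough sketch of the cache side, assuming a toy cache that keys entries
on the request URI plus the selecting request headers: the Cache class and
its store() method are hypothetical, and the rule is simplified (for
instance, it does not compare Date values). The point is only that
invalidation is a local lookup and nothing is ever fetched from the
Content-Location URI.

```python
class Cache:
    """Toy cache, for illustration only."""

    def __init__(self):
        # (request_uri, selecting_headers) -> (content_location, etag)
        self.entries = {}

    def store(self, request_uri, selecting_headers, content_location, etag):
        # Find every entry for this request URI that names the same variant
        # (same Content-Location) but carries a different entity tag.
        stale = [
            key for key, (loc, tag) in self.entries.items()
            if key[0] == request_uri and loc == content_location and tag != etag
        ]
        # Drop them all at once; no request is sent to content_location.
        for key in stale:
            del self.entries[key]
        self.entries[(request_uri, selecting_headers)] = (content_location, etag)
        return stale


cache = Cache()
cache.store("/doc", "Accept-Language: en", "/doc.en.html", '"en-1"')
cache.store("/doc", "Accept-Language: en-GB", "/doc.en.html", '"en-1"')
cache.store("/doc", "Accept-Language: fr", "/doc.fr.html", '"fr-1"')

# A newer entity for the English variant arrives via a third request.
removed = cache.store("/doc", "Accept-Language: en-US", "/doc.en.html", '"en-2"')
```

Both older English entries are dropped in one step, while the French entry
is untouched because it names a different variant.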

If the "might be individually accessed" condition is important then it needs
to be clarified so that implementations know what they are supposed to do to
meet the condition. But I don't think the condition adds any value, which is
why I suggest that it simply be removed.

But, again, I think we should first decide if the use of Content-Location
should ever be encouraged at all given how problematic it is.

- Brian

Received on Wednesday, 6 August 2008 13:10:43 UTC