RE: i69: Clarify "Requested Variant" [was: New "200 OK" status codes, PATCH & PROPFIND] from Brian Smith on 2008-02-14 (ietf-http-wg@w3.org from January to March 2008)

From: Brian Smith <brian@briansmith.org>
Date: Thu, 14 Feb 2008 08:56:42 -0800
To: "'HTTP Working Group'" <ietf-http-wg@w3.org>
Message-ID: <004c01c86f2a$93ec0fd0$6501a8c0@T60>
Julian Reschke wrote:
> Brian Smith wrote:
> > Accept-* refer always to the response entity, not to the selected
> > representation. (Rather, the Accept-* headers are relevant for
> > selecting a representation only when a representation is returned in
> > the response
> 
> Disagreed.
>
> > entity.) If I have a DELETE with an Accept-Encoding,
> > that means that I want the status message for the request
> > to be encoded a certain way; it doesn't mean that I want
> > to delete the representation that has that encoding. In
> > order to delete a specific representation of a resource,
> 
> It seems to me that questions like these are the ones we want 
> to answer in the context of this issue.
> 
> Does it make a difference in practice? How do you distinguish 
> between request headers that affect the selected 
> representation, and those that only affect status messages?

I base my statement on "The Accept request-header field can be used to
specify certain media types which are acceptable for the response" and
many other similar statements in RFC 2616 sections 14.1-14.5. Clearly, the intent is
that the Accept-* headers always specify the client's preferences
regarding the response entity. 

Is this set of proposals intended to define how to edit a specific
representation of a content-negotiated resource? I think it is pretty
clear how it works when each separately-editable representation has its
own URI (see below about Content-Location). Why isn't that good enough? 

Example: A resource http://example.org/foo has two representations: one
application/pdf, and one text/html. The client wishes to modify the PDF
representation. However, it cannot display PDF content in its error
dialog boxes, so it wants the status message to come back as text/html.
Under this proposal, the client would have to specify "Accept:
application/pdf" in order to edit the PDF representation. However, this
also indicates that it wants a PDF response entity, when really it
doesn't want that at all. If it has "Accept: text/html" then, under this
proposal, the server may interpret that as a request to edit only the
HTML representation. 
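
To make the conflict concrete, here is a rough Python sketch of the two
readings of the same Accept header. Nothing in it comes from the spec or
from the proposal text: the function names and the variant table are made
up for illustration, and q-values and wildcards are ignored.

```python
from typing import Optional

# Assumed for illustration: the two representations of http://example.org/foo
VARIANTS = {"application/pdf", "text/html"}

def target_under_proposal(accept: str) -> Optional[str]:
    """Reading in the proposal: Accept selects the representation to edit."""
    for item in accept.split(","):
        media_type = item.split(";")[0].strip()
        if media_type in VARIANTS:
            return media_type
    return None

def response_type_wanted(accept: str) -> str:
    """Reading in section 14.1: Accept describes the response entity."""
    return accept.split(",")[0].split(";")[0].strip()

accept = "text/html"  # the client only wants readable error messages
print("edited variant under the proposal:", target_under_proposal(accept))
print("response entity the client wants: ", response_type_wanted(accept))
```

With one header, the client cannot ask for "edit the PDF" and "answer me
in HTML" at the same time; whichever value it sends, one of the two
functions gives the wrong answer.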

It is also confusing what "selected representation" means with POST. Is
the "selected representation" a representation of the resource at the
Request-URI, or is the "selected representation" a representation of the
resource that was created? It seems like you always intend for it to be
the former, but there are many applications that expect the Accept-*
headers to refer to the latter when Location=Content-Location.

Also, if I POST with "Accept: image/jpg;q=1.0, image/png;q=0.9" to an
AtomPub collection, should I expect the AtomPub collection to return a
"406 Not Acceptable" response, since the collection does not have a JPEG
or a PNG representation?
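
As a rough sketch of why I ask: if the server matched Accept against the
collection's own representations, the check below fails. The helper names
are hypothetical, the collection is assumed to offer only
application/atom+xml, and wildcards such as */* are not handled.

```python
from typing import Dict, Iterable

def parse_accept(value: str) -> Dict[str, float]:
    """Map each listed media type to its q-value (default 1.0)."""
    prefs: Dict[str, float] = {}
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        prefs[parts[0]] = q
    return prefs

def acceptable(accept: str, available: Iterable[str]) -> bool:
    """True if any available type is listed with q > 0 (wildcards ignored)."""
    prefs = parse_accept(accept)
    return any(prefs.get(media_type, 0.0) > 0 for media_type in available)

accept = "image/jpg;q=1.0, image/png;q=0.9"
collection_types = ["application/atom+xml"]  # assumed: the feed's only type
print(acceptable(accept, collection_types))
```

If the same header is read as describing the response entity instead, the
only question is whether the server can produce a status message in one of
those types.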

> > That is why having separate Content-Locations for variants
> > is a SHOULD-level requirement in HTTP.
> 
> I don't think it is; all I can see is a "SHOULD send 
> Content-Location when separate URI exists".

I stand corrected. In practice, however, this is basically a requirement
for resources with distinctly editable representations, because the only
defined content negotiation mechanism is for response entities, and the
client would not be able to determine ahead of time whether the server
supports this new proposed mechanism.

> > As another example, let's say I POST an image to a photo book 
> > application, where the photo book application returns the 
> > URL of the photo in the Location header of a 201 response, and 
> > returns a hyperlink to it in the response entity. In my POST, I 
> > have "Accept: text/plain;q=0.9, text/html;q=1.0" to state I 
> > prefer the hyperlink (status message) to be in HTML, or plain 
> > text as a fallback. However,  my "Accept" header for the image
> > is not going to be the same; it would  be
> > "Accept: image/jpg;q=1.0, image/png;q=0.9" if I POSTed a JPEG, 
> > "Accept: image/png;q=1.0, image/jpg;q=0.9" if I posted a PNG.
> > (The intent here is to always prefer to get back something as 
> > close to what I posted as possible.)
> 
> But the URI you POST to, and the photo URI are distinct; so I 
> don't see why that's surprising.

I misunderstood what you intended "selected representation" to mean for
POST. In particular, I thought "selected representation" referred to the
representation whose ETag is returned in a "201 Created" response.

> > As another example, I can POST a Japanese document, where 
> > the request has a "Content-Language: jp". But, I prefer English 
> > language status messages, so I have "Accept-Language: en". With 
> > this proposal, the server should return the ETag for an English 
> > representation even though I am explicitly manipulating a 
> > Japanese representation. That is counter-intuitive. Again,
> > I set up my client so that it always attempts to set its 
> > "Accept-*" headers to match the "Content-*" headers of whatever 
> > was posted, so that I am most likely to retrieve the same 
> > representation that I posted.
> 
> OK, so tying the ETag to Location for 201 would fix this. 
> That may be the right thing to do, but I'd still like to 
> leave the definition of "selected representation" as simple 
> as possible.

ETag and Location are already tied together: "A 201 response MAY contain
an ETag response header field indicating the current value of the entity
tag for the requested variant just created." But, if multiple variants
were created (e.g. compressed and uncompressed), then the client doesn't
know which variant the ETag refers to. That was why I thought that
"selected representation" meant something different for POST.

> > It would be better to say that servers should not return an
> > ETag in a response unless there is only one variant
> > ("selectable representation"?) for that resource, or unless 
> > Content-Location=Location, which means the response entity 
> > is both the status message and the selected representation.
> > Otherwise, the client needs to choose which ETag it is
> > interested in by sending a subsequent GET or HEAD request with 
> > headers that the server can use to select a representation.
> 
> Nice in theory. In practice, many resources have at least two 
> representations (the raw one, and with Content-Encoding: gzip).

PROPOSAL: When Location=Content-Location, the ETag must be the one for
the representation returned as the response entity. Otherwise, the
server may return the ETag for any representation of the resource; the
client cannot determine which representation that ETag is for, and
should do a HEAD or GET to retrieve the ETag for the representation it
is interested in, if it needs a specific one.
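
Seen from the client side, the rule could be sketched roughly as follows.
This is only an illustration of the proposal, not text from any spec. The
function name is made up, the header mapping is assumed to use exactly
these key spellings, and the two URIs are compared as plain strings.

```python
from typing import Mapping, Optional

def etag_for_response_entity(headers: Mapping[str, str]) -> Optional[str]:
    """Return the ETag only when the proposal ties it to the response entity."""
    location = headers.get("Location")
    content_location = headers.get("Content-Location")
    etag = headers.get("ETag")
    if etag is not None and location is not None and location == content_location:
        return etag
    # Otherwise the ETag may belong to any representation; the client
    # should do a HEAD or GET for the representation it cares about.
    return None

created = {
    "Location": "http://example.org/photos/1",
    "Content-Location": "http://example.org/photos/1",
    "ETag": '"abc"',
}
print(etag_for_response_entity(created))
```

A client that gets None back treats whatever ETag was sent as unusable for
conditional requests until it has done the follow-up HEAD or GET.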

> > If I POST an image to an AtomPub collection, the ETag in the "201 
> > Created" response must be the one for the entry just created (in the
> > Location: header), not the feed document at the Request-URI.
> 
> Understood. So minimally in this case, the ETag is not for 
> the selected representation.

> > My concern is that the "hypothetical GET" that is part of 
> > the proposal needs to be done against the Location: URI,
> > not the Request URI. 

> Hm, why would it need to be done against anything except the 
> Request-URI? Me confused :-)

I was thinking "selected representation" referred to the representation
of the newly-created resource that had the ETag given in response. If
that is not what "selected representation" refers to, then maybe there
needs to be another new term coined.

> 
> > proposal, when doing this hypothetical GET, the server would have to
> > consider some of the request headers, but it should ignore the If-*
> > headers, right? You have to partition request headers into ones that
> > are available for consideration when selecting a representation, and
> > which ones should/must not be used when selecting a representation.
> 
> Which one would not be used for the selection?

Again, this comes from my misunderstanding of what was meant by
"selected representation" for POST.

- Brian
Received on Thursday, 14 February 2008 16:56:54 UTC