RE: i69: Clarify "Requested Variant" [was: New "200 OK" status codes, PATCH & PROPFIND] from Brian Smith on 2008-02-14 (ietf-http-wg@w3.org from January to March 2008)

From: Brian Smith <brian@briansmith.org>
Date: Thu, 14 Feb 2008 06:30:22 -0800
To: "'Julian Reschke'" <julian.reschke@gmx.de>
Cc: "'HTTP Working Group'" <ietf-http-wg@w3.org>
Message-ID: <003c01c86f16$225d4280$6501a8c0@T60>
Julian Reschke wrote:
> Brian Smith wrote:
> > If Content-Location = Location (for 201), then resource being 
> > manipulated is the same as the status message (the resource 
> > describes its own status). For some methods (e.g. PUT),
> 
> The representation sent back (not the resource) is the same as the 
> status message.

To be clear, "representation" always implies "representation of the
resource," right?

> > Location is implicitly the Request-URI. If Content-Location <> 
> > Location, then the status message is a separate resource from what 
> > was created, and a representation of that status message is 
> > available at <Content-Location>. You cannot make "status message" 
> > and "representation" mutually exclusive. Status messages can also 
> > have multiple representations if there is a Content-Location header 
> > (using normal content negotiation at the Content-Location URI).
> > And, a server cannot return a representation of the manipulated 
> > resource *instead* of a status message; if it returns a 
> > representation of the manipulated resource, that representation *is*
> > the status message.
> 
> Yes. Does this really conflict with Mark's proposal though?

The proposal was that those methods "carry a 'status message,' not a
representation." That "not a representation" phrase implies that the
status message cannot be a representation of the resource.

> >> PROPOSAL: Remove 'requested variant' from terminology and define 
> >> 'selected representation' as (roughly): "The representation that 
> >> would be returned by the server if the request method were GET, 
> >> taking into consideration selecting headers, as specified by the Vary
> >> header's payload."
> > 
> > If there are Accept-* headers, then those headers apply to the 
> > response body (status message), not to the POSTed/PUT/DELETed 
> > resource. For example, "Accept-Encoding: deflate" on a POST means 
> > that the server can return a compressed status message.
> 
> Well, it applies to the response, be it a representation or a status 
> message.

The response must always be a status message. The response may also be a
representation of the resource. The response can never be a
representation of the resource without also being a status message.

> > This proposal seems to be saying that those headers are used to 
> > select two representations of two different things at once: a 
> > representation of the status message, and a representation of the 
> > resource(s) that are being manipulated. The assumption is that the 
> > client will always send the same Accept-* headers for the 
> > hypothetical GET that it sent during its actual POST/PUT/DELETE. I 
> > don't think that assumption is valid.
> 
> I think it is in practice.
> 
> If content negotiation occurs, it's the server's responsibility to 
> make sure that PUT and DELETE behave in a sane way. That can be done 
> by disallowing them (just supporting them on more specific URIs), or 
> by doing the "right" thing (DELETE without Accept-Encoding also 
> removes a - for instance - gzipped variant).

Accept-* headers always refer to the response entity, not to the selected
representation. (Rather, the Accept-* headers are relevant for selecting
a representation only when a representation is returned in the response
entity.) If I have a DELETE with an Accept-Encoding, that means that I
want the status message for the request to be encoded a certain way; it
doesn't mean that I want to delete the representation that has that
encoding. In order to delete a specific representation of a resource,
each representation needs to have its own Content-Location. That is why
having separate Content-Locations for variants is a SHOULD-level
requirement in HTTP.
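
A minimal sketch of that reading of Accept-Encoding on a DELETE, where the
header only affects how the status message is encoded. The function name
delete_response and the status text are hypothetical, and q-values in
Accept-Encoding are ignored for brevity:

```python
import zlib


def delete_response(accept_encoding, status_text="Deleted.\n"):
    """Build the response to a DELETE (hypothetical helper).

    Accept-Encoding only decides how the status message is encoded;
    it plays no part in choosing which representation gets deleted.
    """
    body = status_text.encode("utf-8")
    headers = {"Content-Type": "text/plain; charset=utf-8"}
    # Simplification: strip parameters and ignore q-values entirely.
    codings = [c.split(";")[0].strip().lower() for c in accept_encoding.split(",")]
    if "deflate" in codings:
        # HTTP's "deflate" coding is the zlib format, which zlib.compress emits.
        body = zlib.compress(body)
        headers["Content-Encoding"] = "deflate"
    return 200, headers, body


status, headers, body = delete_response("deflate")
print(status, headers.get("Content-Encoding"))
```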

As another example, let's say I POST an image to a photo book
application, where the photo book application returns the URL of the
photo in the Location header of a 201 response, and returns a hyperlink
to it in the response entity. In my POST, I have "Accept:
text/plain;q=0.9, text/html;q=1.0" to state I prefer the hyperlink
(status message) to be in HTML, or plain text as a fallback. However, my
"Accept" header for the image is not going to be the same; it would be
"Accept: image/jpeg;q=1.0, image/png;q=0.9" if I POSTed a JPEG, and
"Accept: image/png;q=1.0, image/jpeg;q=0.9" if I POSTed a PNG. (The
intent here is to always prefer to get back something as close to what I
posted as possible.)
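
To make the difference concrete, here is a rough sketch of q-value
selection run once for the status message and once for the image. The
helpers parse_accept and choose are hypothetical, and wildcards such as
"*/*" are not handled:

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        q = 1.0  # q defaults to 1.0 when absent
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def choose(header, available):
    """Return the most preferred available type, or None if none matches."""
    for media_type, q in parse_accept(header):
        if q > 0 and media_type in available:
            return media_type
    return None


# Preferences sent with the POST: they describe the status message.
status_type = choose("text/plain;q=0.9, text/html;q=1.0",
                     ["text/plain", "text/html"])
# Preferences a later GET for the image might carry after POSTing a PNG.
image_type = choose("image/png;q=1.0, image/jpeg;q=0.9",
                    ["image/jpeg", "image/png"])
print(status_type, image_type)  # text/html image/png
```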

As another example, I can POST a Japanese document, where the request
has a "Content-Language: ja". But, I prefer English language status
messages, so I have "Accept-Language: en". With this proposal, the
server should return the ETag for an English representation even though
I am explicitly manipulating a Japanese representation. That is
counter-intuitive. Again, I set up my client so that it always attempts
to set its "Accept-*" headers to match the "Content-*" headers of
whatever was posted, so that I am most likely to retrieve the same
representation that I posted.

It would somewhat make sense if the server did its negotiation based on
the Content-* headers, not Accept-* headers. But, you cannot put
Content-* headers in a Vary header; Content-* are all entity headers,
and Vary: can only list request headers. And, this proposal says that
the server needs to list all the headers it used to select a variant in
the Vary header.

It would be better to say that servers should not return an ETag in
a response unless there is only one variant ("selectable
representation"?) for that resource, or unless
Content-Location=Location, which means the response entity is both the
status message and the selected representation. Otherwise, the client
needs to choose which ETag it is interested in by sending a subsequent
GET or HEAD request with headers that the server can use to select a
representation.
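
As a sketch only, the rule suggested above could be written as a
predicate. The name may_return_etag and its parameters are made up for
illustration; for PUT, the caller would pass the Request-URI as the
location, since Location is implicit there:

```python
def may_return_etag(variant_count, location, content_location):
    """Return True when an ETag in the response would be unambiguous.

    variant_count: number of variants the server holds for the resource
    location: effective Location URI of the manipulated resource
    content_location: Content-Location of the response entity, or None
    """
    if variant_count == 1:
        # Only one variant exists, so the ETag can only refer to it.
        return True
    # The response entity is both the status message and the
    # selected representation.
    return content_location is not None and content_location == location


print(may_return_etag(3, "/photos/1", "/photos/1"))  # True
print(may_return_etag(3, "/photos/1", None))         # False
```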

> > Also, the hypothetical GET would have to happen on the Location URI,
> > not the Request URI, right?
> 
> That's tricky. I'd prefer this to only be based on Request-URI and 
> request headers.

If I POST an image to an AtomPub collection, the ETag in the "201
Created" response must be the one for the entry just created (at the
Location: URI), not the one for the feed document at the Request-URI.

> > When I POST with an If-Unmodified-Since or If-Match to an AtomPub 
> > collection, doesn't that mean "Process this new entry only if the 
> > collection hasn't been modified"? In other words, If-* headers must 
> > refer to the resource that is available via a GET at the 
> > Request-URI--in the case of AtomPub, the collection feed.
> 
> I think this is what the proposal says.

My concern is that the "hypothetical GET" that is part of the proposal
needs to be done against the Location: URI, not the Request URI. In the
proposal, when doing this hypothetical GET, the server would have to
consider some of the request headers, but it should ignore the If-*
headers, right? You have to partition request headers into ones that are
available for consideration when selecting a representation, and ones
that should/must not be used when selecting a representation.
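
One possible shape for that partition, as a sketch. Which headers count
as "selecting" is an assumption here (the four Accept-* request headers),
and the function name partition_headers is hypothetical:

```python
# Assumption: only these request headers take part in selection.
SELECTING = {"accept", "accept-charset", "accept-encoding", "accept-language"}


def partition_headers(headers):
    """Split request headers into (selecting, ignored) dicts.

    Everything outside SELECTING, including the If-* conditional
    headers, is left out of the hypothetical GET.
    """
    selecting, ignored = {}, {}
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        target = selecting if name.lower() in SELECTING else ignored
        target[name] = value
    return selecting, ignored


selecting, ignored = partition_headers({
    "Accept-Language": "en",
    "If-Match": '"abc"',
    "Content-Language": "ja",
})
print(sorted(selecting), sorted(ignored))
```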

- Brian
Received on Thursday, 14 February 2008 14:30:34 UTC