RE: i107 from Brian Smith on 2008-07-31 (ietf-http-wg@w3.org from July to September 2008)

From: Brian Smith <brian@briansmith.org>
Date: Thu, 31 Jul 2008 17:37:16 -0500
To: "'Henrik Nordstrom'" <henrik@henriknordstrom.net>
Cc: "'HTTP Working Group'" <ietf-http-wg@w3.org>
Message-ID: <04ADD9A7E7B8456BAAD47F536ED95331@T60>
Henrik Nordstrom wrote:
> On tor, 2008-07-31 at 12:14 -0500, Brian Smith wrote:
> > I assume by "the resource" you mean the resource at the original 
> > request URI, not the resource at the Content-Location URI.
> 
> Content-Location IS the specific URI to this resource. Or if 
> you like the non-negotiated direct access URI to a specific 
> variant of a resource.

> > RFC 2616 doesn't say that the client may delete any of the variants
> > of http://example.org/foo.html by sending a DELETE request to
> > http://example.com/foo.html.
> 
> To me it's exactly what it says. Quote from p3 6.7 Content-Location:
> 
>    The Content-Location value is not a replacement for the original
>    requested URI; it is only a statement of the location of 
>    the resource corresponding to this particular entity at the
>    time of the request.

example.org says that the entity that it is returning in the response is
available at the time of the request at http://example.com/foo.html, which
is not even the same domain. RFC 2616 doesn't say that must be a true
statement; in fact, in another part the specification makes it quite clear
that we shouldn't make any such assumptions.
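
To make the cross-domain point concrete, here is a minimal Python sketch of
the kind of check a cautious client might apply before acting on a
Content-Location value. The function names are hypothetical and of my own
choosing; RFC 2616 does not define or require this check, and passing it
would still not prove the statement true.

```python
from urllib.parse import urljoin, urlsplit

# Default ports, so that "http://example.org" and "http://example.org:80" compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(uri):
    """Return the (scheme, host, port) triple of an absolute URI."""
    parts = urlsplit(uri)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def content_location_is_same_origin(request_uri, content_location):
    """Hypothetical client-side check: does Content-Location stay on the
    origin that served the response? Content-Location may be a relative
    URI, so it is resolved against the request URI first."""
    resolved = urljoin(request_uri, content_location)
    return origin(resolved) == origin(request_uri)
```

With the URIs from this thread, content_location_is_same_origin(
"http://example.org/foo.html", "http://example.com/foo.html") is False.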

>    Future requests MAY specify the Content-Location URI as 
>    the request-URI if the desire is to identify the source
>    of that particular entity.

That *doesn't* say that editing the resource at the Content-Location URI
will have any effect on any variant of the resource identified by the
original request URI. 

> > an extension to HTTP. In any case, there may be more than 
> > one variant of the resource at the Content-Location URI,
> > so Content-Location doesn't even address a specific
> > variant of *any* resource.
> 
> No. The Content-Location URI is not supposed to be a 
> negotiated resource. If it is then the server implementation is wrong.

The specification doesn't say that anywhere, although I agree that some of
the wording hints at it. In practice, protocols like AtomPub attach extra
meaning to the case where Location equals Content-Location, so
Content-Location will often reference a content-negotiated resource (again,
because of Vary: Accept-Encoding), and that negotiation is beyond the
application's control (again, because of things like encoding proxies or
mod_deflate).

> If the server can not provide unique URIs to each variant 
> then no Content-Location should be returned.

The specification says only the converse of that:
   
   A server SHOULD provide a Content-Location for the
   variant corresponding to the response entity; especially in
   the case where a resource has multiple entities associated
   with it, and those entities actually have separate locations
   by which they might be individually accessed, the server
   SHOULD provide a Content-Location for the particular
   variant which is returned.

In particular, the specification never says that the server SHOULD NOT or
MUST NOT provide a Content-Location header when the predicate is false.
AtomPub and other similar servers will often return a Content-Location
header that doesn't uniquely identify a variant as I mentioned above.

> > It is unrealistic to assume that we never PUT to negotiated 
> > resources because many (on some servers, all) resources are 
> > negotiated as  "Vary: Accept-Encoding" at least.
> 
> Indeed. The catch is that implementations of dynamic 
> content-encoding hasn't really followed the entity variant 
> model of HTTP that well.. so the Content-Location model fails 
> for such setups. And clients have to work around it by not 
> using Accept-Encoding if they want to be able to update the
> resource.

The Content-Location model fails because it requires a trust model that HTTP
leaves undefined.

> > If a server changes (or
> > deletes) the uncompressed variant of the resource upon a PUT
> > (or delete) but leaves the compressed variant as-is, then I 
> > would say that server is definitely wrong, unless there is
> > some explicit indicator in the request that says that only
> > some specific variants are to be modified. But, HTTP doesn't
> > provide any such indicator.
> 
> Try it and you will find it's exactly what servers using 
> separate storage for both the compressed and identity encoded 
> variants is doing. But in such setups you often MUST use the 
> Content-Location URIs on updates or negotiation will get 
> disabled for the resource.

Negotiation getting disabled is acceptable even if it is sub-optimal.
Continuing to serve an old variant alongside a new one is not okay. 

> If you think of Content-Language then it makes more sense.

I disagree. If the goal is to edit each language variant independently, then
you need some mechanism to identify the individual variants as separate
resources. HTTP itself doesn't define any mechanism for doing that; you need
some extension. That extension might simply say "Editing the resource at the
Content-Location returned for variant X of resource Y will update that
variant of resource Y," with a trust model added. But it is not something
that exists in HTTP now.

> Content-Encoding is a bit of a special case here and not a good
> Content-* header to base any reasoning on, especially not 
> considering how it's most commonly implemented..

Again, I disagree. The specification doesn't give any special treatment to
Content-Encoding. I understand where you are coming from, but I don't think
anybody here wants to define a whole set of rules for Content-Encoding
separate from every other entity header. (Right?) That is why I use
Content-Encoding instead of Content-Language in all my examples. It is the
most common case by far. It is also the one that causes the most problems
when one tries to define a mechanism for editing individual variants,
because any mechanism that doesn't allow the server to keep the unencoded
and encoded variants in sync by default can be shown to be obviously
defective.

But, it is easy to show that the same holds for Content-Type too. Let's say
I have a resource http://example.org/my-girlfriend with image/png and
image/jpeg variants, and I issue this request:

   PUT /my-girlfriend HTTP/1.1
   Content-Type: image/jpeg
   Accept: image/jpeg
   ...

It would be very surprising for the server to return a picture of my old
girlfriend for this request:

   GET /my-girlfriend HTTP/1.1

while simultaneously returning a picture of my new girlfriend for this
request:

   GET /my-girlfriend HTTP/1.1
   Accept: image/jpeg

That is why I am so strongly advocating a requirement that PUT and DELETE
requests on a resource should operate on all variants of the resource by
default, unless the request somehow explicitly indicates otherwise. At the
very least, servers must be *allowed* to operate that way, if they are not
*required* to do so.
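
As a rough illustration of the default I am arguing for, here is a minimal
Python sketch, assuming a hypothetical in-memory store that keeps one
representation per media type. The class and method names are made up for
this example and do not come from any real server; choosing a variant by
exact media type is a deliberate simplification of real negotiation.

```python
from dataclasses import dataclass, field


@dataclass
class NegotiatedResource:
    """Hypothetical model of one resource with several stored variants."""

    # Maps a media type such as "image/png" to that variant's body.
    variants: dict = field(default_factory=dict)

    def put(self, media_type, body):
        # A PUT replaces the resource as a whole: every previously stored
        # variant is dropped, so no stale variant can be served afterwards.
        self.variants = {media_type: body}

    def delete(self):
        # A DELETE likewise removes every variant of the resource.
        self.variants = {}

    def get(self, accept=None):
        # With an Accept value, return the exactly matching variant (or None).
        if accept is not None:
            return self.variants.get(accept)
        # Without one, return whichever variant is stored first (or None).
        return next(iter(self.variants.values()), None)
```

After put("image/jpeg", ...) on a resource that previously had image/png and
image/jpeg variants, a GET with or without "Accept: image/jpeg" sees only the
new picture, and the old image/png variant is gone.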

Also, keep in mind that there is no requirement for the server to return a
Vary header field on a negotiated response unless the response is cacheable.
Thus, the client generally has no way of knowing which resources are
content-negotiated at all.

Regards,
Brian
Received on Thursday, 31 July 2008 22:38:02 UTC