RE: i107 from Yves Lafon on 2008-08-01 (ietf-http-wg@w3.org from July to September 2008)

From: Yves Lafon <ylafon@w3.org>
Date: Fri, 1 Aug 2008 08:26:41 -0400 (EDT)
To: Brian Smith <brian@briansmith.org>
Cc: 'Henrik Nordstrom' <henrik@henriknordstrom.net>, 'HTTP Working Group' <ietf-http-wg@w3.org>
Message-ID: <Pine.LNX.4.64.0808010804220.19931@ubzre.j3.bet>
On Thu, 31 Jul 2008, Brian Smith wrote:

>
> Henrik Nordstrom wrote:
>> On tor, 2008-07-31 at 12:14 -0500, Brian Smith wrote:
>>> I assume by "the resource" you mean the resource at the original
>>> request URI, not the resource at the Content-Location URI.
>>
>> Content-Location IS the specific URI to this resource. Or if
>> you like the non-negotiated direct access URI to a specific
>> variant of a resource.

+1 (and btw, editing http://www.example.com/bar/ is quite painful, as most 
servers won't give a CL with the "non-negotiated" URI (in this case it is 
not really conneg)).

>>> RFC 2616 doesn't say that the client may delete any of the
>> variants of
>>> http://example.org/foo.html by sending a DELETE request to
>>> http://example.com/foo.html.
>>
>> To me it's exactly what it says. Quote from p3 6.7 Content-Location:
>>
>>    The Content-Location value is not a replacement for the original
>>    requested URI; it is only a statement of the location of
>>    the resource corresponding to this particular entity at the
>>    time of the request.
>
> example.org says that the entity that it is returning in the response is
> available at the time of the request at http://example.com/foo.html, which
> is not even the same domain. RFC 2616 doesn't say that must be a true
> statement; in fact, in another part the specification makes it quite clear
> that we shouldn't make any such assumptions.

Not the same domain... Is that very different if the URI in CL was 
http://example.com/bar/foo.hml ? Are you sure that if the URIs are in the 
same domain, they are controlled by the same group of people? The 
same-domain assumption works somehow in practice, but is in theory plain wrong.

>
>>    Future requests MAY specify the Content-Location URI as
>>    the request-URI if the desire is to identify the source
>>    of that particular entity.
>
> That *doesn't* say that editing the resource at the Content-Location URI
> will have any effect on any variant of the resource identified by the
> original request URI.

Nothing is said about _editing_ the resource. Note that there is an issue 
with validators as well (ETag being a response header and not an entity 
header; see the attempts at structured ETags [RFC 2295], etc.).

>>> an extension to HTTP. In any case, there may be more than
>>> one variant of the resource at the Content-Location URI,
>>> so Content-Location doesn't even address a specific
>>> variant of *any* resource.
>>
>> No. The Content-Location URI is not supposed to be a
>> negotiated resource. If it is then the server implementation is wrong.
>
> The specification doesn't say that anywhere, although I agree that some of
> the wording kind of hints to it. In practice protocols like AtomPub that
> place extra meaning when Location=Content-Location mean that
> Content-Location will often reference a content-negotiated resource (again,
> because of Vary: Accept-Encoding) which is beyond the application's control
> (again, because of things like encoding proxies or mod_deflate).

(let's digress on the use of Accept-Encoding/Content-Encoding when 
TE/Transfer-Encoding is better suited)

Suppose you want to edit http://example.com/bar/foo.html. It is served using 
mod_deflate, so an ETag of "aofidrjfoer"-gzip is added (yuck, but at least 
it's different from the non-compressed version, even if this is not valid),
and you have a nice Content-Encoding: gzip.
Now edit, (wash, rinse,) save... Without a real CL, you will save a 
compressed version to an originally non-compressed resource. Not a big 
deal, as the server should DTRT: store the compressed content and adjust 
its metadata. But some other people, editing directly on the file system, 
will find it surprising to edit binary data instead of HTML.

OK, so add a CL in this case? Then the PUT will be done on 
http://example.com/bar/foo.html.gz, so you didn't change the original 
one and you now have two different documents (with different content) served 
using conneg on Accept-Encoding.
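
To make the two outcomes concrete, here is a minimal sketch in Python. It is not how mod_deflate or any real server works: the dictionary store and the put() helper are hypothetical, and case 1 assumes a naive server that does not DTRT and simply writes the bytes it receives.

```python
import gzip

ORIGINAL = b"<p>old text</p>"
EDITED = b"<p>new text</p>"

def put(store, path, body):
    """Naive PUT (hypothetical): write the request body to the path as received."""
    store[path] = body

# Case 1: no Content-Location, so the gzip-encoded edit is PUT to the original URI.
no_cl = {"/bar/foo.html": ORIGINAL}
put(no_cl, "/bar/foo.html", gzip.compress(EDITED))

# Case 2: Content-Location pointed at foo.html.gz, so the edit is PUT there instead.
with_cl = {"/bar/foo.html": ORIGINAL, "/bar/foo.html.gz": gzip.compress(ORIGINAL)}
put(with_cl, "/bar/foo.html.gz", gzip.compress(EDITED))

# Case 1: someone opening foo.html on disk now finds gzip bytes, not HTML.
print(no_cl["/bar/foo.html"][:2] == b"\x1f\x8b")
# Case 2: the two stored documents no longer have the same content.
print(gzip.decompress(with_cl["/bar/foo.html.gz"]) != with_cl["/bar/foo.html"])
```

Both checks print True: in the first case the file on disk is binary, in the second the .gz document has drifted away from the original.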

I am still wondering why support for TE/Transfer-Encoding is so weak...
(end of digression ;) )

>> If the server can not provide unique URIs to each variant
>> then no Content-Location should be returned.
>
> The specification says only the converse of that:
>
>   A server SHOULD provide a Content-Location for the
>   variant corresponding to the response entity; especially in
>   the case where a resource has multiple entities associated
>   with it, and those entities actually have separate locations
>   by which they might be individually accessed, the server
>   SHOULD provide a Content-Location for the particular
>   variant which is returned.
>
> In particular, the specification never says that the server SHOULD NOT or
> MUST NOT provide a Content-Location header when the predicate is false.
> AtomPub and other similar servers will often return a Content-Location
> header that doesn't uniquely identify a variant as I mentioned above.
>
>>> It is unrealistic to assume that we never PUT to negotiated
>>> resources because many (on some servers, all) resources are
>>> negotiated as  "Vary: Accept-Encoding" at least.
>>
>> Indeed. The catch is that implementations of dynamic
>> content-encoding haven't really followed the entity variant
>> model of HTTP that well... so the Content-Location model fails
>> for such setups. And clients have to work around it by not
>> using Accept-Encoding if they want to be able to update the
>> resource.
>
> The Content-Location model fails because it requires a trust model that HTTP
> leaves undefined.

Well, what kind of trust model would work? Are you trusting the headers and 
content you receive from an origin server, if not sent using HTTPS?

> Content-Encoding instead of Content-Language in all my examples. It is the
> most common case by far. It is also the one that causes the most problems
> when one tries to define a mechanism for editing individual variants,
> because any mechanism that doesn't allow the server to keep the unencoded
> and encoded variants in sync by default can be shown to be obviously
> defective.

As I said above, the main issue is that CE is used when TE is really meant.
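
A rough sketch of the difference, in Python. The header values, the "v1" validator and both helper functions are made up for illustration, and chunk framing is omitted. The point is only that Content-Encoding produces a different entity, which needs its own ETag, while Transfer-Encoding is a property of the message and leaves the entity and its validator alone.

```python
import gzip

BODY = b"<p>hello</p>"
ETAG = '"v1"'  # hypothetical validator for the uncompressed entity

def respond_with_content_encoding(body):
    # The gzip bytes are a distinct variant, so a strong ETag has to differ
    # and the response has to vary on Accept-Encoding.
    headers = {"Content-Encoding": "gzip", "ETag": '"v1-gzip"',
               "Vary": "Accept-Encoding"}
    return headers, gzip.compress(body)

def respond_with_transfer_encoding(body):
    # Compression applies to this message only; the entity is still the
    # uncompressed document, so its ETag is unchanged.
    headers = {"Transfer-Encoding": "gzip, chunked", "ETag": ETAG}
    return headers, gzip.compress(body)

ce_headers, ce_payload = respond_with_content_encoding(BODY)
te_headers, te_payload = respond_with_transfer_encoding(BODY)
```

With TE, a client that edits and PUTs the document back is always dealing with one entity and one validator, which is what editing needs.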

> But, it is easy to show that the same holds for Content-Type too. Let's say
> I have a resource http://example.org/my-girlfriend with image/png and
> image/jpeg variants, and I issue this request:
>
> PUT /my-girlfriend HTTP/1.1
> Content-Type: image/jpeg
> Accept: image/jpeg
> ...
>
> It would be very surprising for the server to return a picture of my old
> girlfriend for this request:
>
>   GET /my-girlfriend HTTP/1.1
>
> while simultaneously returning a picture of my new girlfriend for this
> request:
>
>   GET /my-girlfriend HTTP/1.1
>   Accept: image/jpeg
>
> That is why I am so strongly advocating a requirement that PUT and DELETE
> requests on a resource should operate on all variants of the resource by
> default, unless the request somehow explicitly indicates otherwise. At the
> very least, servers must be *allowed* to operate that way, if they are not
> *required* to do so.

The existence of variants is orthogonal to the fact that a server has 
content-negotiated resources.

If you have /my-gf.png and /my-gf.jpeg, your intent is probably to do 
fancy transcoding to update one when the other is edited, because you 
know that there is a semantic link between the two URIs, and the link is 
not the fact that /my-gf is a conneg resource. In your example, doing a 
PUT on /my-gf, /my-gf.png or /my-gf.jpeg should in theory lead to the same 
thing (updating all versions), but that is way beyond what HTTP gives you; 
it's server-side logic for this particular set of resources.
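
As a sketch of what such server-side logic could look like (Python; the VARIANTS table, the transcode() stand-in and the default of image/png when no Accept is given are all assumptions for illustration, nothing HTTP defines):

```python
# Hypothetical table of URIs the server knows to be renditions of one picture.
VARIANTS = {"/my-gf.png": "image/png", "/my-gf.jpeg": "image/jpeg"}
NEGOTIATED = "/my-gf"

def transcode(body, src_type, dst_type):
    """Stand-in for real image conversion; it only labels the payload."""
    if src_type == dst_type:
        return body
    return dst_type.encode() + b" rendition of " + body

def put(store, uri, content_type, body):
    """A PUT on the negotiated URI or on any variant URI refreshes every variant."""
    if uri != NEGOTIATED and uri not in VARIANTS:
        raise KeyError(uri)
    for variant_uri, variant_type in VARIANTS.items():
        store[variant_uri] = transcode(body, content_type, variant_type)

def get(store, uri, accept="image/png"):
    """Variant URIs bypass negotiation; /my-gf picks a variant by Accept."""
    if uri in VARIANTS:
        return store[uri]
    if uri != NEGOTIATED:
        raise KeyError(uri)
    for variant_uri, variant_type in VARIANTS.items():
        if variant_type == accept:
            return store[variant_uri]
    raise LookupError("no acceptable variant")

store = {"/my-gf.png": b"old picture", "/my-gf.jpeg": b"old picture"}
put(store, "/my-gf", "image/jpeg", b"new picture")
```

After the PUT, a GET on /my-gf with or without Accept: image/jpeg is derived from the new picture. The link between the URIs lives in the VARIANTS table, which is application knowledge, not something conneg provides.
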
Cheers,

-- 
Baroula que barouleras, au tiéu toujou t'entourneras.

         ~~Yves
Received on Friday, 1 August 2008 12:35:24 UTC