Re: [Fwd: Re: PUT vs strong ETags]

Wilfredo Sánchez Vega wrote:
> On Dec 7, 2005, at 5:49 PM, Jim Whitehead wrote:
> 
>> I don't have a firm read on whether the Apache HTTPd behavior is 
>> non-compliant. However, I strongly feel that its current behavior is 
>> very bad for clients.
>>
>> What clients really need is for the ETag to not change unless the 
>> content changes.
> 
>   Well, I'd argue that what they *really* need is for the ETag to change 
> if the content changes.

+1

>   While I agree that changing the ETag for otherwise-same content is 
> inconvenient and best avoided, unless it's happening a lot (in this case 
> it happens at most once), I don't see that as tragic.

I note that I have changed my own opinion several times now. One possible 
interpretation of Apache's behaviour is that in the case where content 
indeed changes several times within a second, a client may *want* not to 
have to re-sync until the resource is stable again. This is achieved by 
the current implementation.
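
For reference, here is a rough sketch of the behaviour as I understand it 
from this thread. It is not httpd's actual code: the function name, the 
inode/size/mtime tag layout and the one-second window are assumptions 
made for illustration only.

```python
def etag_from_stat(inode, size, mtime, request_time):
    """Hypothetical helper: build an ETag from stat() data alone.

    Assumption: mtime has one-second granularity, so a file touched
    within the last second could change again without its mtime
    (and hence the tag) changing. In that window the tag is weak.
    """
    tag = '"%x-%x-%x"' % (inode, size, int(mtime))
    if request_time - mtime < 1.0:
        # Resource may still be changing: do not promise byte equality.
        return "W/" + tag
    # Resource has been stable for at least a second: strong tag.
    return tag


# Response to a PUT: the file was written a moment ago, so the tag is weak.
put_tag = etag_from_stat(1, 2, 100, 100.5)
# A GET two seconds later sees the same opaque value, now strong.
get_tag = etag_from_stat(1, 2, 100, 102.0)
```

Under these assumptions the opaque part of the tag is identical in both 
responses; only the W/ prefix differs, which is the weak->strong 
transition being discussed.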

>   Note that this situation only arises during the one-second span of 
> time that a resource is modified.  That does mean that it's almost 
> always going to yield the "temporary" ETag on a PUT request, but rarely 
> on a GET.
> 
>   As long as httpd is using a filesystem timestamp to compute the ETag, 
> this is going to be unavoidable.  An MD5 hash would be a great ETag, but 
> it, and anything that involves opening the file, is far more expensive 
> to compute.  I think the trade-off here is a reasonable one, given the 
> data we have available.  The other option is to punt and not emit an 
> ETag, since we arguably don't have accurate enough information.  But I 
> think that would be worse.
> 
>   Anyway, whether it changes or not is unrelated to the issue I'm 
> angling for, which is whether the use of a weak etag is wrong.
> 
>> So, if an ETag starts out weak, it should stay weak. If it starts 
>> strong, it should stay strong. The weak->strong transition is a 
>> problem. I don't think clients care particularly about weak vs strong 
>> etags. In fact, if the server does some amount of background 
>> processing on the content, but returns a semantically equivalent 
>> representation then the server should be using weak etags 
>> consistently, AFAIK. For example, if a CalDAV server receives an 
>> event, bursts it out into its database, then reconstructs a slightly 
>> different XML representation that is semantically equivalent, it 
>> should only be using weak etags in this case to represent the fact 
>> that there are minor tweaks to the representation of the resource.

I think I disagree here. A few days ago I asked about this on the HTTP 
mailing list, and the consensus seems to be:

- just because a server accepts a PUT and returns a (strong) ETag 
doesn't mean that it didn't rewrite the contents

- the ETag is *not* for the entity body returned with PUT, but for the 
entity you would get upon a subsequent GET/HEAD

- and yes, RFC2616 needs to be clarified

(see thread 
<http://lists.w3.org/Archives/Public/ietf-http-wg/2005OctDec/thread.html#13>)

>   OK, that's a vote for "the weak etag is wrong".

My take is that the current draft's requirements for strong ETags and for 
ETags returned upon PUT are questionable. I think it has been 
demonstrated that

1) in some cases, weak ETags are just fine, and that

2) a requirement to return an ETag upon PUT will *need* to also clarify 
what that means

The latter optimally would be an erratum to RFC2616, which will require 
a discussion over on the HTTP mailing list, with proposed text changes 
and consensus among the readers of the list (I'm *not* volunteering to 
do this because of other priorities).
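
As background for point 1), here is a small sketch of the two comparison 
functions that RFC2616 (section 13.3.3) defines for entity tags. The 
function names are mine, and the parsing is deliberately naive (it 
assumes a well-formed tag with an optional W/ prefix).

```python
def is_weak(etag):
    # A weak entity tag carries the "W/" prefix.
    return etag.startswith("W/")


def opaque(etag):
    # The quoted opaque value, without any weakness prefix.
    return etag[2:] if is_weak(etag) else etag


def strong_match(a, b):
    # Strong comparison: identical tags, and neither one is weak.
    return not is_weak(a) and not is_weak(b) and a == b


def weak_match(a, b):
    # Weak comparison: opaque values identical; either may be weak.
    return opaque(a) == opaque(b)
```

If-Match uses the strong comparison, so a weak ETag returned from a PUT 
will not match in a later conditional PUT, although it still works for 
weak validation such as If-None-Match on a full-body GET.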

Best regards, Julian

Received on Thursday, 8 December 2005 13:04:25 UTC