RE: ETags and concurrency control

From: Henrik Nordstrom <henrik@henriknordstrom.net>
Date: Fri, 02 May 2008 19:21:13 +0200
To: Brian Smith <brian@briansmith.org>
Cc: "'Robert Siemer'" <Robert.Siemer-httpwg@backsla.sh>, "'Pablo Castro'" <Pablo.Castro@microsoft.com>, atom-protocol@imc.org, "'HTTP Working Group'" <ietf-http-wg@w3.org>
Message-Id: <1209748873.17447.28.camel@henriknordstrom.net>

On ons, 2008-04-30 at 21:46 -0700, Brian Smith wrote:

> AFAICT, the issue with i101 is that some servers do not generate
> *useful* weak ETags; Apache, at least, seems to just generate weak
> ETags that it will never match (which is inefficient but not totally
> broken).

This is a bug in Apache, not in the specifications. There is an open bug
report on this, and it has been acked by several Apache developers. But
it's a very minor one, based on the simple fact that Apache, when using
the default filesystem-based store, cannot guarantee that no other
process is messing with the filesystem content in the same sub-second
window. Additionally, current releases allow the administrator to tune
this, enabling the server to generate strong ETags immediately (this
must only be enabled if one knows there are no other processes modifying
the files in "bad ways").

Other Apache backends using weak etags can handle them fine.

> To me, "semantically equivalent" is something that is definitely
> vague. However, the use of strong ETags for range requests makes
> things clearer: If you can guarantee that your server will always
> generate a byte-for-byte identical representation (usable for range
> requests) for a given ETag, use a strong one; otherwise, if you think 
> that generating an ETag makes sense at all, use a weak one.

Exactly.

And if you can guarantee (beyond reasonable doubt) that no two
"significantly different" objects share the same ETag, always send an
ETag. You should only skip the ETag if you strongly suspect that you may
assign the same ETag to two versions which are significantly different.

The typical cases for strong/weak ETags are:

Strong:
- Guaranteed octet equality.

Weak:

- Dynamically generated content from a versioned / controlled or static
source, where some parameters such as the generation timestamp, or the
exact formatting of the response, may differ from request to request
based on parameters which for some reason cannot be included in the
ETag generation algorithm used by the server.

- When updates aren't critical and it doesn't matter at all if someone
uses a version which is somewhat dated, e.g. a pageview counter image.
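
As a rough illustration of the two cases above (a sketch only; the
function names and the render() helper are hypothetical and not taken
from any server): a strong ETag can be derived from the exact octets
sent, while a weak one can be derived from the version of the underlying
source, so it stays stable when only incidental details such as a
generation timestamp change.

```python
import hashlib


def strong_etag(body: bytes) -> str:
    """Strong validator: changes whenever any octet of the body changes."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def weak_etag(source_version: str) -> str:
    """Weak validator: tied to the source version, not the exact octets."""
    return 'W/"' + source_version + '"'


def render(source_version: str, generated_at: str) -> bytes:
    """Hypothetical dynamic page whose footer carries a generation timestamp."""
    text = f"content r{source_version}\ngenerated {generated_at}\n"
    return text.encode("ascii")


# Two responses built from the same source version, one second apart.
first = render("42", "2008-05-02T19:21:13")
second = render("42", "2008-05-02T19:21:14")

# The weak ETag depends only on the source version, so it is stable...
same_weak = weak_etag("42") == weak_etag("42")
# ...but the octets differ, so a strong ETag has to differ as well.
same_strong = strong_etag(first) == strong_etag(second)
```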

> For example, mod_deflate should never return a strong ETag because the
> entity it generates is dependent on the system configuration. With a
> strong ETag, mod_deflate needs to ensure that the ETag changes
> whenever the mod_deflate configuration changes and whenever the
> system's zlib changes; with a weak ETag, it could continue to ignore
> these little details (like it does now when generating strong ETags.)

With this reasoning pretty much nothing except for a static file store
can return strong ETags. I disagree. Servers like mod_deflate SHOULD use
the system configuration parameters as input to their ETag generation
algorithm if these are likely to change during the lifetime of an
object.
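
A minimal sketch of what I mean, assuming a compression filter that
knows the ETag of the uncompressed entity. The function name and the
choice of inputs are illustrative assumptions, not what mod_deflate
actually does:

```python
import hashlib
import zlib


def deflated_etag(source_etag: str, level: int,
                  zlib_version: str = zlib.ZLIB_RUNTIME_VERSION) -> str:
    """Derive an ETag for the compressed variant of an entity.

    Everything that can change the compressed octets goes into the hash:
    the ETag of the uncompressed entity, the compression level, and the
    version of the zlib library doing the work.
    """
    material = "|".join([source_etag, "deflate", str(level), zlib_version])
    digest = hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]
    return '"' + digest + '"'
```

With this, changing the compression level or upgrading zlib yields a new
ETag for the same source entity, so the tag can remain strong.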

> In fact, I would say that weak ETags should be the default choice, and
> strong ETags should only be used when the application has specifically
> ensured that there is a one-to-one correspondence between the ETag and
> the byte stream that comprises the entity--in other words, only use
> strong ETags when you could support range requests (whether you
> support range requests or not). The restriction against using weak
> ETags in PUT and DELETE requests forces applications to use strong
> ETags in situations where they are not guaranteeing this one-to-one
> correspondence.

To this I can agree.
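
For reference, the restriction comes from the two comparison functions
in RFC 2616 section 13.3.3, with If-Match required to use the strong
one. A small sketch (the helper names are mine):

```python
def is_weak(etag: str) -> bool:
    """True if the entity tag carries the W/ weakness prefix."""
    return etag.startswith("W/")


def opaque_tag(etag: str) -> str:
    """Return the quoted opaque part, without any W/ prefix."""
    return etag[2:] if is_weak(etag) else etag


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: identical tags, and neither may be weak."""
    return not is_weak(a) and not is_weak(b) and a == b


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: opaque parts identical; either tag may be weak."""
    return opaque_tag(a) == opaque_tag(b)


# A weak tag can never satisfy a strong comparison, even against itself,
# which is why it is of no use in If-Match on PUT or DELETE.
put_allowed = strong_match('W/"v7"', 'W/"v7"')
```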

> The alternative would be to deprecate weak ETags and discourage their
> use, weaken the definition of strong ETag to match what weak ETags
> were originally for, and then say that strong ETags are only guaranteed
> to have a one-to-one correspondence with entities if the server
> supports range requests for that resource. That seems to be the
> effective result of the proposed i101 resolution.

This won't fly, as the server has no control over who makes ranges of
the response. It may well be an intermediary proxy between the client
and the server.

Regards
Henrik
Received on Friday, 2 May 2008 17:22:01 GMT
