- From: Jamie Lokier <jamie@shareable.org>
- Date: Fri, 2 May 2008 19:55:40 +0100
- To: Henrik Nordstrom <henrik@henriknordstrom.net>
- Cc: Werner Baumann <werner.baumann@onlinehome.de>, ietf-http-wg@w3.org
Henrik Nordstrom wrote:
> Wait a minute here. The "weak" ETags generated by Apache isn't that
> weak. For Apache in default configuration to generate the same weak ETag
> for two different versions the following need to all be true
>
> - The update was done in-place by overwriting parts of the file,
>   preserving the same inode number.

Or the inode is recycled when the old file is deleted and a new one
created.

> - The update MUST be within the same sub-second as the previous update

Or the clock moves backwards due to a correction, or someone writes a
similar file using a timestamp-preserving copy (like cp -p, rsync -t).

NB: Both of these break strong Etags for updates _not_ in the last
second too - Apache's algorithm is not watertight.

> - The update MUST NOT change the file size.

Quite common, when editing a file of the same name.

> - The inode change timestamp must also not change by the update.

If the modification time is in the same second, you can be quite
confident the change timestamp will be in the same second too.

> This can practically only happen if there is other processes updating
> the file content directly outside Apache.

That's probably the most common way files in Apache are updated.

> The reason why Apache sends weak ETag on content modified in the last
> second is because the default configuration assumes there will be other
> processes running on the server "randomly" overwrite parts of published
> files many times within the same second.

Or non-randomly. Every way I've seen files updated and published
through Apache other than WebDAV (FTP, rsync, scp) can trigger these
problems occasionally.

Clearly, it does screw up range requests. But also: after doing an
update, then you run a client to GET the file, perhaps to verify it's
serving the right content, you expect to get the file you have just
updated. If weak Etags are used in caching, this is not guaranteed any
more.
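
To make the failure mode above concrete, here is a minimal sketch. It is
not Apache's actual code: it only assumes a tag built from inode, size
and modification time, marked weak when the file was modified less than
a second before the request. The names apache_style_etag and demo are
hypothetical, made up for this illustration.

```python
import os
import tempfile
import time


def apache_style_etag(path, now=None):
    """Hypothetical inode-size-mtime tag, loosely modelled on the scheme
    discussed in this thread; not Apache's real implementation."""
    st = os.stat(path)
    now = time.time() if now is None else now
    # Hex inode, size, and mtime in microseconds.
    tag = '"%x-%x-%x"' % (st.st_ino, st.st_size, st.st_mtime_ns // 1000)
    # Assumption: the tag is weak if mtime is within one second of "now".
    weak = (now - st.st_mtime) < 1.0
    return "W/" + tag if weak else tag


def demo():
    """Overwrite a file in place with same-size content and restore its
    old timestamp, as a timestamp-preserving copy would."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "page.html")
        with open(path, "wb") as f:
            f.write(b"version one")
        old = time.time() - 3600  # pretend the file is an hour old
        os.utime(path, (old, old))
        before = apache_style_etag(path)

        # In-place update: same inode, same size, different bytes.
        with open(path, "r+b") as f:
            f.write(b"version two")
        # Preserve the old mtime, like cp -p or rsync -t.
        os.utime(path, (old, old))
        after = apache_style_etag(path)

        with open(path, "rb") as f:
            content = f.read()
        return before, after, content
```

Under these assumptions, demo() returns two identical strong tags even
though the content has changed, which is the "not watertight" case: the
update is not in the last second, yet the validator does not change.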

> For normal HTTP use where updates is done using PUT, or nearly all
> normal edits the above isn't true and Apache may just as well send a
> strong ETag without any loss of guarantee.

That's right. I believe it can be configured to do so, if you can
confirm all these guarantees.

Dynamically generated content from databases, blogs, wikis etc. can
also use strong Etags in the same way for precise cache validation, if
you can confirm the tags precisely.

So can backends serving ordinary files modified by other processes, if
you can use something like Linux's F_LEASE or inotify to be informed of
file changes synchronously. But these aren't the simplest of
configurations.

-- Jamie
Received on Friday, 2 May 2008 18:56:15 UTC