Re: i22: ETags on PUT responses

Werner Baumann wrote:
> The use case:
> A caching WebDAV-client, that has no special knowledge about the server, 
> except that it complies to the standards.
> 
> The client uses PUT to store a file on server and holds its own copy in 
> the cache. It gets a strong etag in response from the server and 
> associates it with the cached file. The client now relies, that it can 
> use the cached file as if it was retrieved from the server via GET, for 
> any practical purpose, without causing confusion or damaging data 
> integrity.

Right now (IMHO), neither HTTP nor WebDAV gives you that guarantee.

CalDAV adds a restriction that gives you that guarantee, while XCAP 
requires the opposite behavior. This is a problem when you want to use 
common libraries to access HTTP resources, or if a server wants to 
implement multiple HTTP extensions for the same URI.

That's the motivation for my proposal.

> If the client can rely on this, the server SHOULD send the etag. This 
> will be an enhancement over the current situation. If the client cannot 
> rely on this, the server MUST NOT send an etag, because the client's use 
> of the cached file might cause confusion and data loss.

That rules out lots of interesting use cases (when the storage is not 
binary-based, such as in XCAP (XML), CalDAV (calendar store), AtomPub 
(can be based on many mechanisms...), JCR...).

> Examples besides delete:
> - may the client display the stored file to the user, or will the user 
> be confused, when direct access to the server shows different content?

Of course that's hard for the server to know. Will out-of-date 
revision control info confuse the user? Will reformatting of XML confuse 
the user? Will re-sorting lines in an ICS file confuse the user?

> - may the client edit the cached file and then upload it with a 
> conditional PUT, using that etag?

That's not a problem, as long as the entity transformation done by the 
server falls into the category described in 
<http://greenbytes.de/tech/webdav/draft-reschke-http-etag-on-write-08.html#rfc.section.1.2.p.11> 
-- are there examples of other transformations you can think of?
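
For illustration, the conditional PUT discussed here could be built roughly 
like this. It is only a sketch: the function name, URL and etag value are 
made up, and the request is constructed but not sent.

```python
import urllib.request


def build_conditional_put(url: str, body: bytes, etag: str) -> urllib.request.Request:
    """Build (but do not send) a PUT that only succeeds if the server's
    current entity still carries the given etag.

    Hypothetical helper for illustration; not part of any WebDAV library.
    """
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        # If-Match makes the write conditional on the etag from the earlier PUT.
        headers={"If-Match": etag},
    )


# Example values are assumptions, not taken from a real server.
req = build_conditional_put("http://example.org/dav/file.txt", b"edited", '"abc123"')
```

If the entity changed on the server in the meantime, the server is expected 
to answer 412 (Precondition Failed) instead of overwriting it.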

> - strong validators allow range-requests, even for PUT. Will this work?

Unlikely if the content was rewritten. A server can protect itself from 
that by generating the strong etag in a manner that guarantees that a 
range request will fail.

> Take examples A.3, A.4 in 
> <http://greenbytes.de/tech/webdav/draft-reschke-http-etag-on-write-08.html>, 
> where a server inserts content. Will it work, when the client inserts 
> the same content into the cached copy and then does a conditional PUT, 
> or will this end up in inserting the same content twice? If it works, 
> the etag is an improvement, if not it causes data corruption.

If the client has out-of-band knowledge of how to do the transform locally, 
it would work. However, it seems unwise to build a system like that, 
unless you have close coupling between client and server, such as in 
Subversion.

How would the absence of the ETag help here, btw?

> What I am missing in this statement from Julian
>  > - the presence of E in Rs does not necessarily imply that the body
>     sent with PUT was stored octet-by-octet
> is the guarantee, that the cached body is equal to what the server 
> stores for any practical use.

The problem is to define what "any practical use" means. I'm totally 
with you that what RFC2616 has to offer here isn't sufficient for 
authoring resources on servers that *do* entity transforms. Thus the 
proposed extension.

> I am maintaining a caching WebDAV-file-system (davfs2), that presents 
> WebDAV-resources to none-webdav-aware applications as a local file 
> system. What would the responsible behaviour of davfs2 be, when it gets 
> an etag in response to a PUT, but has no guarantee, that the cached body 
> is identical to the server version for any practical use? Shall it 
> present the cached body to the application and risk data corruption? It 
> must throw away the cached body and use GET (unconditional) to get the 
> real thing.

Unless it has knowledge that the cached copy is indeed identical.
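
As a rough sketch of that decision on the client side (the function and the 
"server_stores_verbatim" flag are hypothetical; where that knowledge would 
come from is exactly the open question):

```python
from typing import Optional


def cache_action_after_put(etag: Optional[str], server_stores_verbatim: bool) -> str:
    """Decide what a caching client does with its local copy after a PUT.

    Returns "keep" only when the client has a strong etag AND separate
    knowledge that the server stored the body octet-by-octet; otherwise
    "refetch". Hypothetical policy, not actual davfs2 behaviour.
    """
    has_strong_etag = etag is not None and not etag.startswith("W/")
    if has_strong_etag and server_stores_verbatim:
        return "keep"
    # The etag alone says nothing about whether the body was rewritten.
    return "refetch"
```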

> At the moment, davfs2 is not that responsible for the sake of 
> efficiency. But I would be very glad to get a reliable strong etag in 
> the PUT-response. An unreliable etag would make things worse.

Right now you just don't know. There are HTTP servers out there that 
happily return a strong etag although the content was rewritten, and 
RFC2616 (IMHO) allows that.
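
A toy model of such a server, to make the problem concrete. Everything in it 
is assumed for illustration: the class, the line-sorting transform and the 
SHA-1 based etag are not taken from any real implementation.

```python
import hashlib


class ToyStore:
    """Hypothetical store that rewrites content on PUT and still
    returns a strong etag for what it stored."""

    def __init__(self) -> None:
        self._entities = {}

    def put(self, path: str, body: bytes) -> str:
        # Assumed transform: sort the lines, similar to re-sorting an ICS file.
        stored = b"\n".join(sorted(body.splitlines())) + b"\n"
        self._entities[path] = stored
        return self.etag(path)

    def get(self, path: str) -> bytes:
        return self._entities[path]

    def etag(self, path: str) -> str:
        # Strong etag derived from the stored octets, not from the octets sent.
        return '"' + hashlib.sha1(self._entities[path]).hexdigest() + '"'


store = ToyStore()
sent = b"b\na\n"
etag = store.put("/x", sent)
fetched = store.get("/x")
# The etag is valid for `fetched`, yet the client's cached `sent` differs from it.
```

A client that files "sent" under that etag now holds a body that a GET would 
never return.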

BR, Julian

Received on Sunday, 6 January 2008 13:21:07 UTC