Re: Summary of ETag related issues in RFC2518bis from Lisa Dusseault on 2005-12-20 (w3c-dist-auth@w3.org from October to December 2005)

From: Lisa Dusseault <lisa@osafoundation.org>
Date: Tue, 20 Dec 2005 10:01:22 -0800
To: "Dan Brotsky" <dbrotsky@adobe.com>
Cc: <w3c-dist-auth@w3.org>, "Geoffrey M Clemm" <geoffrey.clemm@us.ibm.com>
Message-Id: <e9dc8bd57812bb9eee7cf4d6e559cca9@osafoundation.org>
So when is it OK for a client doing multiple edits to do several PUTs 
in a row without intermediate GET requests?

Take the example of a source code file that is being edited, where the 
source control server expands keywords in comments in the file.  
The client software could issue a PUT and get back an ETag, then, when 
later changes are made, issue another PUT with an If-Match header 
carrying the ETag just received.  Each time, the server could expand the 
keywords, and no harm is done by the client "losing" the server's changes.
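
To make that flow concrete, here is a rough sketch in Python.  It is 
only an illustration under assumptions: KeywordExpandingServer is a 
hypothetical in-memory stand-in rather than any real server, the "$Id$" 
keyword and the SHA-1 based ETag are made up for the example, and the 
status codes follow ordinary HTTP usage (201, 204, 412).

```python
import hashlib


class KeywordExpandingServer:
    """Hypothetical in-memory stand-in for a server that rewrites content on PUT."""

    def __init__(self):
        self.body = None
        self.etag = None

    def put(self, body, if_match=None):
        # A stale If-Match value means the resource changed since the client last saw it.
        if if_match is not None and if_match != self.etag:
            return 412, None
        created = self.body is None
        # Server-side rewrite: the same expansion is applied on every PUT.
        self.body = body.replace("$Id$", "$Id: expanded-by-server $")
        # In this sketch the ETag describes the stored (expanded) content.
        self.etag = '"' + hashlib.sha1(self.body.encode("utf-8")).hexdigest() + '"'
        return (201 if created else 204), self.etag


server = KeywordExpandingServer()

# First edit: plain PUT, remember the ETag from the response.
status1, etag1 = server.put("/* $Id$ */\nint x = 1;\n")

# Second edit: no GET in between, guarded by If-Match with the last ETag.
status2, etag2 = server.put("/* $Id$ */\nint x = 2;\n", if_match=etag1)

# A writer still holding the older ETag is refused.
status3, _ = server.put("/* $Id$ */\nint x = 3;\n", if_match=etag1)
```

The client never reads back the expanded text, yet each PUT is still 
protected against concurrent changes by the ETag from the previous 
response.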

In this case, multiple PUTs without an intermediate GET are OK -- the 
server is prepared to make the same changes on each PUT and doesn't 
really need the client to re-synchronize its changes.  There are 
other cases, such as some CalDAV cases, where the server adds an internal 
event identifier or alternate address to the event.  I'd also bet that 
there are clients that already do multiple PUT requests without 
intermediate GETs, especially if the client holds a lock.  But if in 
some cases the server needs the client to do a GET between two 
successive PUTs because the changes are important to preserve, how can 
the server accomplish that?

I believe there is a way without any additional mechanisms:

  - If the server is making changes that can be overwritten without 
harm, or if the server is making no changes, it can return an ETag in 
response to PUT and the client doesn't have to do a GET unless it later 
sees a different ETag.

  - If the server is making changes that must be preserved, then the 
server can respond to the initial PUT with a throwaway ETag, then 
immediately update the ETag of the resource to a new and more permanent 
value.  Now the client will be forced to recognize that there are new 
changes to be synched -- just as if another client had made the change 
in that period of time.  Most clients would already be compliant with 
this.
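
A rough sketch of the second case, again in Python and again only an 
illustration: PreservingServer, the counter-based ETags and the 
"SERVER-ID" line are all hypothetical names invented for this example.  
The point is just that the ETag handed back by PUT is already stale when 
the client receives it, so an If-Match PUT fails until the client has 
done a GET.

```python
import itertools


class PreservingServer:
    """Hypothetical server whose PUT-time changes must not be overwritten."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.body = None
        self.etag = None

    def _next_etag(self):
        return '"v%d"' % next(self._ids)

    def put(self, body, if_match=None):
        if if_match is not None and if_match != self.etag:
            return 412, None
        # ETag reported to the client; it is never the resource's real ETag.
        throwaway = self._next_etag()
        # Server change that must be preserved (added only if missing).
        if "SERVER-ID:" not in body:
            body += "SERVER-ID: 42\n"
        self.body = body
        # Immediately move the resource on to its more permanent ETag.
        self.etag = self._next_etag()
        return 204, throwaway

    def get(self):
        return 200, self.etag, self.body


server = PreservingServer()
_, etag = server.put("SUMMARY: lunch\n")

# A second PUT without a GET fails, as if another client had written.
status_blind, _ = server.put("SUMMARY: dinner\n", if_match=etag)

# The client re-synchronizes, keeps the server's addition, and retries.
_, fresh_etag, current = server.get()
merged = current.replace("lunch", "dinner")
status_retry, _ = server.put(merged, if_match=fresh_etag)
```

Note that this sketch hands out a throwaway ETag on every PUT; a server 
could just as well do so only when it actually changed something.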

If we decided to make this kind of recommendation, we'd also have to 
specify whether it's OK to do this while the client is holding a LOCK.

Lisa

On Dec 19, 2005, at 9:09 PM, Dan Brotsky wrote:

>
> Geoff,
>
> I don't follow your reasoning here when you say "the client will
> incorrectly conclude that the text it sent with the PUT is
> what would be retrieved by the GET."  It seems like there are three
> cases:
>
> 1. The server modifies the value "on the way up", that is, before
> returning from the PUT.  (This is typically how a version control
> system would expand keywords, as part of the checkin.)  In this case
> the value that would eventually be retrieved by GET is known and thus
> its etag can be returned, even if that etag is a timestamp.
>
> 2. The server returns before modifying the value, but knows that it
> will do so.  In this case a synthetic value for the etag can be
> generated and returned, as long as the server takes steps to make sure
> that etag is returned with the eventual GET and all GETs requested
> before the modifications are complete are blocked (e.g., with "server
> busy").  This etag can still be a timestamp, by the way, and can even
> be a timestamp of the checkin, as long as the server associates that
> time with the eventual result (which version control systems also
> typically do).
>
> 3. The server returns before modifying the value, and doesn't know that
> a modification will take place.  (For example, the "type" of the file
> is later changed so that the file undergoes keyword expansion later.)
> In this case, at the time the file is modified by the server, it should
> assign a new etag, because indeed the etag returned at the time of the
> PUT should not match what a client would eventually GET.  But before
> that later modification is done, the etag is correct.
>
> In no case does a client ever assume that "the text it sent with the
> PUT is what would be retrieved by the GET."  That's not what the etag
> is for.  The etag is to reassure the client that the value on the
> server *has not changed since the PUT completed*.  No guarantees are
> issued that the value doesn't change as part of the PUT; that would be
> a part of the PUT semantics for that server and is outside the scope
> of WebDAV.
>
>     dan
>
>
>
> ________________________________
>
> 	From: w3c-dist-auth-request@w3.org
> [mailto:w3c-dist-auth-request@w3.org] On Behalf Of Geoffrey M Clemm
> 	Sent: Monday, December 19, 2005 19:47
> 	To: w3c-dist-auth@w3.org
> 	Subject: Re: Summary of ETag related issues in RFC2518bis
> 	
> 	
>
> 	Jim:
> 	
> 	What about the point made by an earlier poster, namely that
> 	a server is allowed to modify the content stored by a PUT,
> 	so that a GET following the PUT might return different content
> 	than was PUT (the earlier poster gave the example of a server
> 	that expands RCS keywords on PUT).
> 	
> 	In this case (i.e. the server modifies the content stored by
> 	the PUT), if the server returns the etag that would be returned
> 	on a GET, and the client requests a GET with an If-None-Match
> 	header with the etag returned by the PUT, the client will
> 	incorrectly conclude that the text it sent with the PUT is
> 	what would be retrieved by the GET.
> 	
> 	So unless we are going to disallow servers from modifying the
> 	content stored from a PUT (note that our server does not do this,
> 	so I am speaking as a neutral party here :-), we pretty much
> 	have to have PUT return the entity tag of the content that was
> 	PUT, not what would be returned by the GET.
> 	
> 	Then a client that wants to continue modifying a resource to
> 	which it has just done a PUT, would need to do a GET with
> 	an If-None-Match call following the PUT, to handle servers
> 	that do this kind of rewriting on PUT.
> 	
> 	Note that this is just a single GET, not to be confused with
> 	the "polling" scenario described in "promotion from weak to
> 	strong etag" thread.
> 	
> 	Cheers,
> 	Geoff
> 	
> 	
> 	Jim wrote on 12/19/2005 09:11:02 PM:
> 	>
> 	> Julian,
> 	>
> 	> Thanks for making this more clear -- you're right, there is a
> 	> significant issue here.
> 	>
> 	> > The question here is whether an ETag returned upon PUT is for the
> 	> > entity the client sent (1), or for the entity the server would send
> 	> > upon a subsequent GET (2).
> 	> >
> 	> > There are cases where both will not be the same, so this needs to
> 	> > be clarified. In case of (2), a client will need a subsequent GET
> 	> > if it's planning to use the ETag for subsequent GET/Range requests.
> 	> >
> 	>
> 	> I think option #2 is the best one here (the Etag returned by PUT is
> 	> the one a subsequent GET would retrieve).
> 	
> 	
>
>
Received on Tuesday, 20 December 2005 18:01:33 UTC