RE: ETags and concurrency control from Brian Smith on 2008-05-01 (ietf-http-wg@w3.org from April to June 2008)

From: Brian Smith <brian@briansmith.org>
Date: Wed, 30 Apr 2008 23:46:33 -0700
To: "'Pablo Castro'" <Pablo.Castro@microsoft.com>, "'Adrien de Croy'" <adrien@qbik.com>
Cc: "'Henrik Nordstrom'" <henrik@henriknordstrom.net>, "'Robert Siemer'" <Robert.Siemer-httpwg@backsla.sh>, <atom-protocol@imc.org>, "'HTTP Working Group'" <ietf-http-wg@w3.org>
Message-ID: <005f01c8ab57$18b15b40$0202a8c0@T60>

Pablo Castro wrote:
> 
> Putting the caching issue aside for a second

Once you add the caching issue back in, the use of weak ETag equivalence for the types of things you describe below doesn’t make sense. When multiple representations share a common weak ETag, that means there is no useful difference between them, so it doesn't matter which representation the client has, and it doesn't matter which representation is returned to the client. In the situation you describe below, the client is likely to be very interested in those secondary bits of information, even if it doesn't intend to (or can't) update them.

> think about the 
> case of side-effecting operations. You have some version of a 
> resource you got from a previous GET. Now you want to update 
> it only if the version on the server is consistent with the 
> version you originally got (arguably you made your 
> modification decisions based on that state). At that point 
> you'd issue a PUT or DELETE with an if-match header 
> containing the value you got in the ETag. If an aspect of the 
> resource that does not affect its semantics (as defined by 
> the author of the service) has changed, a weak ETag would 
> allow the PUT to go through successfully even if the resource 
> wasn't bit-by-bit identical.

Agreed.

> On the Astoria side of things we ran into this often. When 
> people expose resources made of data coming from databases, 
> not all the columns in a record matter (some are redundant, 
> some are de-normalized data just carried around, some are 
> non-comparable such as blobs or xml cells). You still want 
> concurrency checks on the rest in those cases.

The data that is redundant, denormalized, or difficult to compare is still semantically significant; otherwise, why would the service return it in the first place? All of that data needs to be taken into consideration when generating the ETag, whether it is weak or strong.
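
As a rough illustration of that point, here is a minimal Python sketch that derives an ETag from every field in the returned representation, so a change to a denormalized column still yields a new ETag. The function name make_etag, the column names, and the choice of a truncated SHA-256 over canonical JSON are all assumptions made for the example, not anything Astoria is known to do.

```python
import hashlib
import json


def make_etag(representation: dict) -> str:
    """Build a quoted ETag from every field the service returns (hypothetical helper)."""
    # Canonical JSON: sorted keys so field order never changes the hash.
    canonical = json.dumps(representation, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f'"{digest[:16]}"'


# Hypothetical record: category_name is denormalized, notes_xml is "hard to compare".
row = {"id": 7, "name": "Widget", "category_name": "Tools", "notes_xml": "<n>a</n>"}

# Only the denormalized column changes, yet the representation differs.
changed = dict(row, category_name="Hardware")

etag_before = make_etag(row)
etag_after = make_etag(changed)
```

Because the two representations differ in something the client can see, the two ETags differ as well, which is what a cache or a conditional request needs in order to tell them apart.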

It sounds like you are trying to solve the "partial update" problem that is often discussed on rest-discuss and elsewhere. IMO, the best solution to that for the situation you describe is to give the updatable part of the resource its own URI with its own separate ETag(s) so you can PUT and DELETE at that URI using normal HTTP semantics. After all, the client and the server already need to have a shared understanding of what is updatable and what isn't. That would make your use of PUT and DELETE fit in very well with commonly accepted practice. Doing otherwise seems likely to result in Astoria making it way too easy to create services that simply don't work (especially in conjunction with caching proxies).
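
A minimal sketch of what that could look like, in Python, with an in-memory store standing in for the server. The Store class, the /products/7/editable URI, and the field names are hypothetical, and If-Match handling is reduced to a single entity tag (no "*" and no lists), so treat it as an illustration of the idea rather than a complete implementation.

```python
import hashlib
import json


class Store:
    """Toy origin server: each URI has its own body and its own ETag."""

    def __init__(self):
        self._bodies = {}

    @staticmethod
    def _etag(body):
        # ETag covers the whole body stored at this URI.
        canonical = json.dumps(body, sort_keys=True).encode("utf-8")
        return '"' + hashlib.sha256(canonical).hexdigest()[:16] + '"'

    def get(self, uri):
        body = self._bodies[uri]
        return 200, self._etag(body), body

    def put(self, uri, body, if_match=None):
        current = self._bodies.get(uri)
        if if_match is not None:
            # Precondition fails if nothing is stored or the ETag has moved on.
            if current is None or self._etag(current) != if_match:
                return 412, None
        self._bodies[uri] = body
        return (200 if current is not None else 201), self._etag(body)


store = Store()
editable_uri = "/products/7/editable"  # hypothetical URI for the updatable part only
store.put(editable_uri, {"name": "Widget", "price": 10})

# Client A and client B both read the same version.
_, etag, _ = store.get(editable_uri)

# Client A updates first; the ETag it holds still matches.
status_ok, new_etag = store.put(editable_uri, {"name": "Widget", "price": 12}, if_match=etag)

# Client B retries with the stale ETag and is rejected.
status_stale, _ = store.put(editable_uri, {"name": "Widget", "price": 11}, if_match=etag)
```

The read-only columns would live at the parent resource's URI with their own ETag, so changes to them never interfere with concurrency checks on the updatable part.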

Cheers,
Brian

Received on Thursday, 1 May 2008 06:47:10 UTC