
Re: Idempotent partial updates

From: Adrien de Croy <adrien@qbik.com>
Date: Thu, 01 Mar 2012 11:34:56 +1300
Message-ID: <4F4EA810.6040500@qbik.com>
To: Amos Jeffries <squid3@treenet.co.nz>
CC: ietf-http-wg@w3.org

On 28/02/2012 2:56 p.m., Amos Jeffries wrote:
> On 28.02.2012 13:54, Adrien de Croy wrote:
>> I wasn't talking so much about partial updates using PUT, but the
>> general reliance on the concept of idempotence at all.
>> Various features in HTTP, such as pipelining and recovering from
>> closed connections, rely on certain methods being idempotent.  I would
>> propose one can never assume any method is idempotent, since that is
>> up to the server-side implementation.
> It is required for cacheability assumptions on the response in absence 
> of explicit "Cache-Control: public". Since almost no sites explicitly 
> send "public" we can't exactly erase that property without causing a 
> whole lot of pain to a lot of network administrators.
> An HTTP bandwidth increase of between 20% and 50% on certain Tier-1 
> network pipes is at stake and not something to play around with.
> Intermediary admins already have to violate the specs to a certain 
> degree, ignoring no-cache to reduce bandwidth costs wrongly imposed by 
> broken server libraries which insist on sending no-cache and no-store 
> on static content.

Last time I sampled Cache-Control response headers (over a couple of 
million hits crawling sites), I found the large majority use it to 
prevent caching, and very few to enable it.  It's a shame.

So moving from a naive HTTP/1.0-style cache to a compliant HTTP/1.1-style 
cache actually resulted in a huge reduction in cache utility.  
Without ignoring Cache-Control directives, as you say, it's hard to get 
more than a 10% effective bandwidth benefit from caching, which frankly 
is not worth the pain.

But at least Cache-Control, as you say, is explicit, and doesn't rely on 
any assumption, based purely on the method, about whether it is indeed 
safe to retry.
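For what it's worth, the method-based assumption clients fall back on is just a spec-level table (RFC 7231 semantics); a sketch, with names of my own invention:

```python
# What the spec promises about idempotence -- which, as argued above,
# says nothing about what a given server-side implementation actually does.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}

def spec_says_retriable(method):
    """True if HTTP semantics alone would permit automatically retrying
    a request after a dropped connection.  A guess, not a guarantee."""
    return method.upper() in IDEMPOTENT_METHODS
```

A client pipelining requests or replaying them after a close has nothing better than this table to go on, which is the whole problem.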

So I guess non-compliant websites end up seeing the effect of this when 
clients assuming idempotence of a method cause unwanted side effects on 
the site, and that's the only real incentive web developers have to fix 
the non-idempotence.

I wonder whether this could be better explained to web devs.  I can just 
imagine the blank looks: "idem-what?"  Or indeed whether it's really only 
a problem of my imagining :)

>> Whether in practice PUT is usually actually idempotent or not, I don't
>> have any information.
>> But my feeling is there is a disconnect between HTTP and those making
>> websites in this regard.
> IME the disconnect is happening around about the library/framework/SDK 
> layer. 

Yeah, I agree.  Like CMS packages that take the easiest route to dealing 
with potential caching issues instead of using good design.  Even 
revalidation is preferable to regeneration and retransmission.
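The revalidation point can be shown in a few lines (a toy server-side sketch of ETag handling per conditional-request semantics; `render_body` and the tuple shape are illustrative, not any framework's API):

```python
def respond(request_headers, current_etag, render_body):
    """Conditional GET sketch: when the client's cached ETag still
    matches, send an empty 304 instead of regenerating and
    retransmitting the whole body."""
    if request_headers.get("If-None-Match") == current_etag:
        return 304, {"ETag": current_etag}, b""
    return 200, {"ETag": current_etag}, render_body()
```

Even if the CMS can't compute freshness lifetimes, a cheap ETag comparison like this saves the retransmission, which is the point.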



> The documentation for those is sparse on HTTP details, and some of 
> their defaults are actually causing violations of the HTTP specs. This 
> thread on PUT with partials is a case in point. PHP's insistence on 
> sending prohibitive cache-controls by default has also led to many 
> broken web apps. ASP has similar problems. The litany of 
> implementation compliance bugs and side effects is _long_.

Adrien de Croy - WinGate Proxy Server - http://www.wingate.com
WinGate 7 is released! - http://www.wingate.com/getlatest/
Received on Wednesday, 29 February 2012 22:35:24 UTC

This archive was generated by hypermail 2.3.1 : Tuesday, 1 March 2016 11:11:00 UTC