Re: Our old friends, weak ETags

In message <7D2A809E-F3E0-4128-AFEA-8EEEBF3AC26E@mnot.net>, Mark Nottingham writes:

>Strictly speaking, I don't think this is a problem for caches; following 
>the rules for reusing a stored response 

I can confirm:  We have had a heck of a time figuring out what to
do about ETags when Varnish gzips objects to save bandwidth.

I think the poster-boy example we came up with was this:

Request:
	(routed directly to server)
	GET /bla, range=0-10kbyte, A-E=gzip
Response:
	10 kbyte of uncompressed bla, E-tag=foo

Request:
	(routed to cache, which holds gzip'ed copy of object)
	GET /bla, range=10-20kbyte, A-E=gzip, If-Match=foo
Response:
	(fetches object from server (E-tag=foo))
	(gzips to save bandwidth)
	10 kbyte of gzip'ed bla, [...]

It is our interpretation that current semantics demand that the
cache modify the ET when it does the gzip'ing of the object.

But it is not at all obvious how it can safely do so, since that
ET can end up back on the origin server if the third request
gets routed directly to the server.

Absent knowledge about how the server produces ETs, the
cache has no way of knowing how to produce a non-colliding one.
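
To make the collision concern concrete, here is a minimal sketch in
Python.  Everything in it is hypothetical: the suffix scheme in
cache_variant_etag() is just one thing a cache might try, not what
Varnish does, and origin_precondition() stands in for a server that
only knows the ETs it minted itself.

```python
def cache_variant_etag(origin_etag: str, coding: str) -> str:
    """Hypothetical cache-side scheme: tag the gzip'ed variant by suffixing."""
    return f"{origin_etag}-{coding}"


def origin_precondition(current_etag: str, if_match: str) -> int:
    """Hypothetical origin check: If-Match passes only on an exact match."""
    return 200 if if_match == current_etag else 412


# The cache rewrites the origin's tag for its gzip'ed copy.
derived = cache_variant_etag("foo", "gzip")

# Case 1: the derived tag reaches the origin, which never issued it,
# so the precondition fails even though the object is unchanged.
status_unknown_tag = origin_precondition("foo", derived)

# Case 2: the origin happens to use the same string for some other
# version of the object, so the precondition passes by accident.
status_collision = origin_precondition("foo-gzip", derived)
```

Either outcome is wrong, and the cache cannot rule out the second
one without knowing how the server picks its ETs.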

My conclusion was that ETs were specified wrong with respect to
range and gzip:  ETs should always refer to the complete underlying
object's bits in raw form, and not be affected by range or compression.

If that were the case, the client would not send A-E=gzip on
the second request, having gotten uncompressed bits back on
the first, and there would be no issue.
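
A rough sketch of what "ET refers to the raw bits" would mean, again
in Python and again with made-up names (raw_etag, respond).  The hash
is only a stand-in for whatever the server really uses, and slicing
the raw bytes before compressing is a simplification for illustration,
not a statement of how HTTP orders ranges and codings.

```python
import gzip
import hashlib


def raw_etag(raw: bytes) -> str:
    """Hypothetical validator derived only from the complete raw object."""
    return hashlib.sha256(raw).hexdigest()[:16]


def respond(raw: bytes, byte_range=None, coding=None):
    """Return (etag, body); the etag ignores both range and coding."""
    etag = raw_etag(raw)
    body = raw if byte_range is None else raw[byte_range[0]:byte_range[1]]
    if coding == "gzip":
        body = gzip.compress(body)
    return etag, body


bla = bytes(range(256)) * 4

etag_full, _ = respond(bla)
etag_range, part = respond(bla, byte_range=(0, 10))
etag_gzip, packed = respond(bla, byte_range=(10, 20), coding="gzip")
```

All three responses carry the same ET, so a cache that gzips on the
fly never has to invent one of its own.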

I'd love to be told there is a simpler solution...

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

Received on Friday, 20 July 2012 07:49:04 UTC