Re: improved caching in HTTP: new draft from Chris Drechsler on 2014-05-23 (ietf-http-wg@w3.org from April to June 2014)

From: Chris Drechsler <chris.drechsler@etit.tu-chemnitz.de>
Date: Fri, 23 May 2014 15:52:48 +0200
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
CC: ietf-http-wg@w3.org
Message-ID: <537F52B0.5080209@etit.tu-chemnitz.de>

Hi Poul-Henning,

thank you for your comments! My answers are below:

On 23.05.2014 11:46, Poul-Henning Kamp wrote:
> In message <537F016C.3030407@etit.tu-chemnitz.de>, Chris Drechsler writes:
>
>> 1) The Etag is only consistent within one domain. The SHA-256 hash value
> in the Cache-NT header identifies the transferred representation
>> absolutely independent of the used URLs (and therefore across domains).
>
> I don't think this would work as you suspect.
>
>
> Case 1:  SHA-256 input is the body of the object.
> -------------------------------------------------

No, not exactly. The SHA-256 value is computed over the representation 
of the resource that the server would send to the client in a 
successful 200 OK response, before Content-Encoding and/or 
Transfer-Encoding are applied (both can be undone).
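
A minimal sketch of this computation, assuming gzip is the only content 
coding in play. The function name representation_digest is made up for 
illustration, and how the digest is serialized into the Cache-NT header 
is not shown here:

```python
import gzip
import hashlib


def representation_digest(body: bytes, content_encoding: str = "identity") -> str:
    """Return the SHA-256 hex digest of the body with any gzip coding undone."""
    if content_encoding == "gzip":
        # Undo the content coding so the hash covers the plain representation.
        body = gzip.decompress(body)
    elif content_encoding != "identity":
        # Assumption: other codings are out of scope for this sketch.
        raise ValueError(f"unsupported content coding: {content_encoding}")
    return hashlib.sha256(body).hexdigest()


plain = b"<html>hello</html>"
digest_plain = representation_digest(plain)
digest_gzip = representation_digest(gzip.compress(plain), "gzip")
```

Both calls hash the same unencoded bytes, so the two digests are equal 
regardless of whether the body travelled gzip-encoded or not.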

> That means you cannot generate this header until you have generated the
> entire object, ruling out progressive and optimistic delivery.

Right, you need the entire object. For "static" content this is no problem.

> It's also not enough I believe, you may need to stick some of the
> HTTP headers into the hash too, to get the expected behaviour.
>
> Transfer-encoding, Content-type ?

Why do you think so? Can you explain it in more detail?

As I see it: caching should/must ensure that the client gets exactly 
what the origin server has sent. The SHA-256 identifies the content in 
the body of the response coming from the origin server. The cache does 
the following:
- it uses the locally stored body (which matches the hash value)
- it applies Content-Encoding, Content-Range and Transfer-Encoding if 
  present in the header of the origin server's response
- it sends the result together with the header from the origin server 
  to the client

So the client gets exactly the same response as the one that came from 
the origin server.
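
The cache steps above could be sketched roughly as follows. This is an 
illustration only: the in-memory store, remember() and 
serve_from_cache() are hypothetical names, only gzip Content-Encoding 
is handled, and Content-Range and Transfer-Encoding are left out:

```python
import gzip
import hashlib

# Hypothetical in-memory store: SHA-256 hex digest -> unencoded body.
store: dict[str, bytes] = {}


def remember(body: bytes) -> str:
    """Store an unencoded body under its SHA-256 digest and return the digest."""
    digest = hashlib.sha256(body).hexdigest()
    store[digest] = body
    return digest


def serve_from_cache(origin_headers: dict[str, str], digest: str):
    """Build the client response from the stored body and the origin's header."""
    body = store.get(digest)
    if body is None:
        return None  # cache miss: the body has to come from the origin
    if origin_headers.get("Content-Encoding") == "gzip":
        # Re-apply the content coding announced in the origin's header.
        body = gzip.compress(body)
    # The origin's header is passed through unchanged.
    return origin_headers, body


digest = remember(b"<html>hello</html>")
headers = {"Content-Type": "text/html", "Content-Encoding": "gzip"}
response = serve_from_cache(headers, digest)
```

Here the client receives the origin's header together with a payload 
that decodes to the stored representation.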

> This may not matter in anybody's actual reality, but I think it would
> violate some of the many decorative rococo features in the RFCs.
>
>
> Case 2:  You allow the content owner to define what goes into the SHA256
> ------------------------------------------------------------------------

No, I don't allow the content owner to define what goes into the SHA256. 
It's clearly defined how the hash value should be computed.

> Now you just lost the "globally unique" property because everybody
> and his nephew are going to use SHA256("FOOBAR" + URL) because
> it's the cheapest.  A lot of them will not even understand why
> "FOOBAR" would be necessary.
>
> You need to construct the object identifier from a FQDN name and a
> site-controlled nonce:
>
> 	Cache-Key: FQDN ":" <nonce_max_128_hex_chars>
>
> Which FQDN the origin organization decides to use is entirely their
> own choice, as long as it's theirs to control.  Likewise, the nonce
> can be anything they care to use.
>
> And that however opens the door wide to cache-poisoning, since you
> cannot really check that the FQDN is legit, without starting some
> kind of X.509 certificate song and dance...
>
>
> I don't dispute the validity of the use-case you're trying to handle,
> but I have a hard time seeing it pay off.
>

Chris

Received on Friday, 23 May 2014 13:53:31 UTC