Re: Content-Integrity header from Ludin, Stephen on 2012-07-10 (ietf-http-wg@w3.org from July to September 2012)

From: Ludin, Stephen <sludin@akamai.com>
Date: Tue, 10 Jul 2012 13:33:52 -0500
To: Amos Jeffries <squid3@treenet.co.nz>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Message-ID: <CC21C046.5A7F4%sludin@akamai.com>

I really like the idea of placing the Digest in the chunk trailers.  Being
able to calculate these digests on the fly and not buffer the entire
message is critical in my opinion.
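
For illustration, a minimal Python sketch of the on-the-fly approach: the
hash is updated as each block is framed, and the digest goes out in the
trailer after the last chunk.  The function name, the choice of SHA-256 and
the "sha-256=<base64>" value syntax are assumptions made for this example;
nothing in this thread defines the header's format.

```python
import base64
import hashlib
from typing import Iterable, Iterator


def chunked_with_digest_trailer(blocks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield an HTTP/1.1 chunked body whose trailer carries a digest.

    The hash is updated as each block passes through, so the full
    message is never buffered.
    """
    hasher = hashlib.sha256()
    for block in blocks:
        if not block:
            continue  # a zero-length chunk would end the body early
        hasher.update(block)
        # chunk-size in hex, CRLF, chunk data, CRLF
        yield b"%x\r\n" % len(block) + block + b"\r\n"
    digest = base64.b64encode(hasher.digest())
    # Last chunk, then the trailer field (hypothetical value syntax),
    # then the blank line that ends the message.
    yield b"0\r\n" + b"Content-Integrity: sha-256=" + digest + b"\r\n\r\n"
```

A sender doing this would presumably also announce the field in a Trailer
header up front, as HTTP/1.1 expects for trailer fields.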

Another concept that I have been playing with is providing digests on
individual chunks using chunk-extension.  The rationale for this is
very large objects.  With per-chunk digests the client would have the
ability to re-request a specific corrupted section of an object using a
range request rather than the entire object.  This can have enormous
perceived performance and reliability benefits for consumers of things
such as software download and large media files.
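
A rough Python sketch of the per-chunk idea follows.  The ";sha256=<hex>"
chunk-extension name, both function names and the use of SHA-256 are
hypothetical, chosen only for illustration.  It also assumes the chunk
framing itself survives, so a corrupted chunk keeps its original length
and the offsets of later chunks stay valid.

```python
import hashlib
from typing import Iterable, List, Tuple


def encode_chunk(data: bytes) -> bytes:
    """Frame one chunk with a hypothetical ";sha256=<hex>" chunk-extension."""
    ext = hashlib.sha256(data).hexdigest().encode("ascii")
    return b"%x;sha256=%s\r\n" % (len(data), ext) + data + b"\r\n"


def corrupted_ranges(chunks: Iterable[Tuple[bytes, str]]) -> List[str]:
    """Return a Range header value for each chunk whose digest mismatches.

    `chunks` holds (data, advertised_hex_digest) pairs in body order.
    Offsets are counted over the decoded body, not the chunked framing.
    """
    ranges = []
    offset = 0
    for data, advertised in chunks:
        if data and hashlib.sha256(data).hexdigest() != advertised:
            # HTTP byte ranges are inclusive at both ends.
            ranges.append("bytes=%d-%d" % (offset, offset + len(data) - 1))
        offset += len(data)
    return ranges
```

The client would then re-request only the listed ranges instead of the
whole object.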

I was working on a draft to propose this, but I didn't feel it was well
baked enough to share.  If there is interest in this type of functionality
I will polish it up and post it.

One issue to point out is that for these types of "frame"-based integrity
checks I generally feel like we are reinventing the content integrity
portion of SSL/TLS.  Though I see the value in being able to do this apart
from SSL, it raises the question: at what point do you just switch over to
SSL to get the desired functionality?

-stephen



On 7/9/12 4:00 PM, "Amos Jeffries" <squid3@treenet.co.nz> wrote:

>On 10.07.2012 07:08, HAYASHI, Tatsuya wrote:
>> +1
>>
>> I know that this is demanded.
>> When I discuss http-authentication and phishing,
>> it is requested by many people.
>> It is a difficult problem.
>> ex) proxy...
>>
>> I think that it is good to do this discussion now.
>>
>> ---
>> Tatsuya
>>
>> On Sat, Jul 7, 2012 at 8:23 AM, James M Snell wrote:
>>> In general, I'm +1 on the general idea albeit with a few caveats...
>>>
>>> 1. To minimize complexity, only a single Content-Integrity header
>>> should be used. I don't want, as Roy points out, to have to iterate
>>> through a bunch of unsupported header values looking for the one I
>>> want. Just as it makes very little sense for an implementor to
>>> provide
>>> multiple Last-Modified, Etag and Content-Type headers in a single
>>> message; there should be only a single Content-Integrity statement
>>> and
>>> I either understand it or I don't.
>
>Either the client advertises what it supports (opening itself to
>middleware erasing options they can't modify), or the server uses
>multiple algorithms in hopes that the middleware cannot violate them
>all.
>It makes perfect sense to have several levels of integrity check. MD5,
>SHA1, AES in one response and allow the client to validate the strongest
>it can handle.
>
>There is also an arguable case for middleware wanting to add its own
>hash to inform the client essentially "this is what I got given". So the
>point of manipulation can be back-traced when the more secure end-to-end
>checks fail.
>
>If you want end-to-end integrity, don't stop at half measures.
>Particularly at half measures which can be corrupted.
>
>
>>>
>>> 2. The performance impact of calculating the digest needs to be
>>> carefully considered. I'd rather not be required to buffer a full
>>> representation in memory all the time just to calculate a header
>>> value. I know it's largely unavoidable, but perhaps there's some
>>> currently elusive solution that can be considered. For instance,
>>> allowing Content-Integrity to appear as a trailer at the end of a
>>> chunked response.
>
>As has been said Trailers happen here.
>
>>>
>>> 3. Something needs to be said about what happens if the
>>> Content-Integrity check fails. For instance, if a request containing
>>> Content-Integrity is sent to the server and the server detects that
>>> the signature is invalid, what should happen? what must happen?
>>> Likewise, how are intermediaries expected to treat the
>>> Content-Integrity header given that any intermediary is able to
>>> modify
>>> the payload at any time?
>
>This is going to be most useful on request/responses sent with
>"no-transform" of course.
>If the integrity is only an MD5 or SHA1 hash, which middleware can edit
>easily, there is no end-to-end integrity, just hop-by-hop integrity.
>
>
>Also, there has to be a mutual secret between origin server and client.
>Without that, when integrity is compromised the transforming hop will
>simply erase or replace the Content-Integrity header value. A secret key
>unknown to that middleware is required to make the integrity hash break
>when it tries this.
>
>AYJ
>
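
Amos's point about a shared secret can be illustrated with a keyed hash.
This is a minimal Python sketch that assumes HMAC-SHA256 as the keyed
construction; the thread does not name one, and the function names are
hypothetical.  A plain digest can be recomputed by any hop that rewrites
the body, while a keyed digest only verifies for a holder of the secret.

```python
import hashlib
import hmac


def plain_digest(body: bytes) -> str:
    """Unkeyed hash: any intermediary can recompute it after editing."""
    return hashlib.sha256(body).hexdigest()


def keyed_digest(secret: bytes, body: bytes) -> str:
    """Keyed hash (HMAC-SHA256, an assumption for this sketch)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_keyed(secret: bytes, body: bytes, advertised: str) -> bool:
    """Check the advertised value using a constant-time comparison."""
    return hmac.compare_digest(keyed_digest(secret, body), advertised)
```

If middleware alters the body and substitutes a value computed without the
secret, verify_keyed() returns False at the client, which is the "break"
described above.  Simply erasing the header is a separate case that the
client would have to treat as a failure by policy.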
Received on Tuesday, 10 July 2012 18:34:41 UTC