Re: Content-Integrity header from Yutaka OIWA on 2012-07-11 (ietf-http-wg@w3.org from July to September 2012)

From: Yutaka OIWA <y.oiwa@aist.go.jp>
Date: Wed, 11 Jul 2012 14:04:54 +0900
To: "Ludin, Stephen" <sludin@akamai.com>
Cc: Amos Jeffries <squid3@treenet.co.nz>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Message-ID: <CAMeZVwt6PoZSaWT-4PmZ8cSO7JzuetfzW7BgiCOp4jE=xsMvwA@mail.gmail.com>
Another reason to provide per-chunk integrity is to let
clients use the content progressively while it is still arriving.
Progressive handling is common in current Web systems,
especially for large resources such as
text (HTML, PDF), images, and movies.
Without HTTP-level integrity protection, this works well.
(TLS can check the integrity of a partial stream.)

However, if integrity protection is provided by
either a trailer or a header, clients must first store all
received data without further processing, check its integrity,
and only then decide whether to accept it.

For this purpose, chunk-based integrity protection is
useful even for static content, for which hash values can be
pre-computed.
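
As a rough illustration of the progressive case, here is a minimal
Python sketch. The names split_with_digests and verified_chunks are
hypothetical, SHA-256 is just an assumed choice, and no wire format is
implied; it also assumes the digests themselves reach the client in a
trustworthy way, which a bare hash sent next to the data does not give.

```python
import hashlib
from typing import Iterable, Iterator, List, Tuple


def split_with_digests(data: bytes, chunk_size: int) -> List[Tuple[bytes, str]]:
    """Sender side: pre-compute one SHA-256 digest per fixed-size chunk."""
    pairs = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        pairs.append((chunk, hashlib.sha256(chunk).hexdigest()))
    return pairs


def verified_chunks(stream: Iterable[Tuple[bytes, str]]) -> Iterator[bytes]:
    """Receiver side: hand out each chunk as soon as its digest matches."""
    for index, (chunk, digest) in enumerate(stream):
        if hashlib.sha256(chunk).hexdigest() != digest:
            raise ValueError(f"integrity check failed at chunk {index}")
        # The chunk is usable now; nothing later in the stream is needed.
        yield chunk
```

With a single whole-body digest in a header or trailer, the receiver
could not release any data until the last byte had arrived.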

The "re-inventing the wheel" issue is still alive :-)

P.S.
Personally, I feel that the value of integrity protection at
this moment lies more in protection against intentional content-forging
attacks than against unintentional communication failures
(except premature termination of streams).
To that extent, re-requesting a broken chunk is of no interest
to me personally.  This may need discussion, because it may affect
the design of inter-chunk chaining of integrity signatures.
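
To make "inter-chunk chaining" concrete, here is one possible sketch in
Python, not a proposal. It assumes a secret shared by origin and client
(as Amos notes below), HMAC-SHA-256, and a per-chunk "is last" flag; the
names chain_tags, verify_chain and INITIAL are hypothetical. Each tag
covers the previous tag, so reordering is detected, and the flag lets
the receiver detect premature termination.

```python
import hashlib
import hmac
from typing import Iterable, Iterator, List, Tuple

INITIAL = b"\x00" * 32  # assumed starting value of the chain

Frame = Tuple[bytes, bool, bytes]  # (chunk, is_last, tag)


def _tag(key: bytes, prev: bytes, is_last: bool, chunk: bytes) -> bytes:
    # prev is always 32 bytes and the flag is 1 byte, so the input is unambiguous.
    flag = b"\x01" if is_last else b"\x00"
    return hmac.new(key, prev + flag + chunk, hashlib.sha256).digest()


def chain_tags(key: bytes, chunks: List[bytes]) -> List[Frame]:
    """Sender side: tag each chunk over (previous tag, last-flag, chunk)."""
    frames, prev = [], INITIAL
    for i, chunk in enumerate(chunks):
        is_last = i == len(chunks) - 1
        prev = _tag(key, prev, is_last, chunk)
        frames.append((chunk, is_last, prev))
    return frames


def verify_chain(key: bytes, frames: Iterable[Frame]) -> Iterator[bytes]:
    """Receiver side: yield verified chunks; raise on forgery or truncation."""
    prev, finished = INITIAL, False
    for chunk, is_last, tag in frames:
        if finished:
            raise ValueError("data after the final chunk")
        expected = _tag(key, prev, is_last, chunk)
        if not hmac.compare_digest(expected, tag):
            raise ValueError("chunk failed integrity check")
        prev, finished = expected, is_last
        yield chunk
    if not finished:
        raise ValueError("stream ended before the final chunk")
```

In this design a chunk cannot be verified in isolation, so re-requesting
a single broken chunk by range would also need the preceding tag; that
is the kind of design interaction mentioned above.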

2012/7/11 Ludin, Stephen <sludin@akamai.com>:
> I really like the idea of placing the Digest in the chunk trailers.  Being
> able to calculate these digests on the fly and not buffer the entire
> message is critical in my opinion.
>
> Another concept that I have been playing with is providing digests on
> individual chunks using chunk-extension.  The rationale for this is for
> very large objects.  With per-chunk digests the client would have the
> ability to re-request a specific corrupted section of an object using a
> range request rather than the entire object.  This can have enormous
> perceived performance and reliability benefits for consumers of things
> such as software download and large media files.
>
> I was working on a draft to propose this, but I didn't feel it was well
> baked enough to share.  If there is interest in this type of functionality
> I will polish it up and post it.
>
> One issue to point out is that for these types of "frame" based integrity
> checks I generally feel like we are reinventing the content integrity
> portion of SSL/TLS.  Though I see the value in being able to do this apart
> from SSL it forces the question at what point do you just switch over to
> SSL to get the desired functionality?
>
> -stephen
>
>
>
> On 7/9/12 4:00 PM, "Amos Jeffries" <squid3@treenet.co.nz> wrote:
>
>>On 10.07.2012 07:08, HAYASHI, Tatsuya wrote:
>>> +1
>>>
>>> I know that this is demanded.
>>> When I discuss http-authentication and phishing,
>>> it is requested by many people.
>>> It is a difficult problem.
>>> ex) proxy...
>>>
>>> I think that it is good to do this discussion now.
>>>
>>> ---
>>> Tatsuya
>>>
>>> On Sat, Jul 7, 2012 at 8:23 AM, James M Snell wrote:
>>>> In general, I'm +1 on the general idea albeit with a few caveats...
>>>>
>>>> 1. To minimize complexity, only a single Content-Integrity header
>>>> should be used. I don't want, as Roy points out, to have to iterate
>>>> through a bunch of unsupported header values looking for the one I
>>>> want. Just as it makes very little sense for an implementor to
>>>> provide
>>>> multiple Last-Modified, Etag and Content-Type headers in a single
>>>> message; there should be only a single Content-Integrity statement
>>>> and
>>>> I either understand it or I don't.
>>
>>Either the client advertises what it supports (opening itself to
>>middleware erasing options they can't modify). Or the server uses
>>multiple algorithms in hopes that the middleware cannot violate them
>>all.
>>It makes perfect sense to have several levels of integrity check. MD5,
>>SHA1, AES in one response and allow the client to validate the strongest
>>it can handle.
>>
>>There is also an arguable case for middleware wanting to add its own
>>hash to inform the client essentially "this is what I got given". So the
>>point of manipulation can be back-traced when the more secure end-to-end
>>checks fail.
>>
>>If you want end-to-end integrity, don't stop at half measures.
>>Particularly at half measures which can be corrupted.
>>
>>
>>>>
>>>> 2. The performance impact of calculating the digest needs to be
>>>> carefully considered. I'd rather not be required to buffer a full
>>>> representation in memory all the time just to calculate a header
>>>> value. I know it's largely unavoidable, but perhaps there's some
>>>> currently elusive solution that can be considered. For instance..
>>>> allowing Content-Integrity to appear as a trailer at the end of a
>>>> chunked response.
>>
>>As has been said Trailers happen here.
>>
>>>>
>>>> 3. Something needs to be said about what happens if the
>>>> Content-Integrity check fails. For instance, if a request containing
>>>> Content-Integrity is sent to the server and the server detects that
>>>> the signature is invalid, what should happen? what must happen?
>>>> Likewise, how are intermediaries expected to treat the
>>>> Content-Integrity header given that any intermediary is able to
>>>> modify
>>>> the payload at any time?
>>
>>This is going to be most useful on request/responses sent with
>>"no-transform" of course.
>>  If the integrity was only a MD5 or SHA1 hash which middleware can edit
>>easily there is no end-to-end integrity, just hop-by-hop integrity.
>>
>>
>>Also, there has to be a mutual secret between origin server and client.
>>Without that, when integrity is compromised the transforming hop will
>>simply erase or replace the Content-Integrity header value. A secret key
>>unknown to that middleware is required to make the integrity hash break
>>when it tries this.
>>
>>AYJ
>>
>
>



-- 
Yutaka OIWA, Ph.D.              Leader, Software Reliability Research Group
                             Research Institute for Secure Systems (RISEC)
   National Institute of Advanced Industrial Science and Technology (AIST)
                     Mail addresses: <y.oiwa@aist.go.jp>, <yutaka@oiwa.jp>
OpenPGP: id[440546B5] fp[7C9F 723A 7559 3246 229D  3139 8677 9BD2 4405 46B5]
Received on Wednesday, 11 July 2012 05:05:39 UTC