
Re: Content-Integrity header

From: Phillip Hallam-Baker <hallam@gmail.com>
Date: Wed, 11 Jul 2012 09:47:28 -0400
Message-ID: <CAMm+Lwh5q2UPX_WTzbB=me-R0PswvvPwZfG6rUfummMMkR09fA@mail.gmail.com>
To: Yutaka OIWA <y.oiwa@aist.go.jp>
Cc: "Ludin, Stephen" <sludin@akamai.com>, Amos Jeffries <squid3@treenet.co.nz>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
On Wed, Jul 11, 2012 at 1:04 AM, Yutaka OIWA <y.oiwa@aist.go.jp> wrote:
> Another reason to provide per-chunk integrity is to allow
> clients to progressively use the content while receiving.
> Progressive handling of large objects is common in
> current Web systems, especially for large resources
> like texts (HTMLs, PDFs), images, movies, and others.
> Without HTTP-level integrity protection, it works well.
> (TLS provides checking integrity of partial stream.)

+1. The only issue is that we would need to specify how the integrity
check is carried across from chunk to chunk, and there might be a
replay attack issue unless there is some way of introducing an IV or
equivalent.

Which I can add...
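
As a rough illustration of carrying the check from chunk to chunk, here
is a minimal sketch. It is not taken from any draft: the function names,
the HMAC-SHA-256 choice, the shared key, and the per-message nonce
(standing in for the "IV or equivalent") are all assumptions made for
the example.

```python
import hashlib
import hmac


def tag_chunks(key: bytes, nonce: bytes, chunks):
    """Yield (chunk, tag); each tag covers the previous tag, the index, and the chunk."""
    prev = nonce  # hypothetical per-message nonce seeds the chain
    for index, chunk in enumerate(chunks):
        msg = prev + index.to_bytes(8, "big") + chunk
        prev = hmac.new(key, msg, hashlib.sha256).digest()
        yield chunk, prev


def verify_chunks(key: bytes, nonce: bytes, tagged_chunks):
    """Yield each chunk as soon as its tag verifies; raise ValueError otherwise."""
    prev = nonce
    for index, (chunk, tag) in enumerate(tagged_chunks):
        msg = prev + index.to_bytes(8, "big") + chunk
        expected = hmac.new(key, msg, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, tag):
            raise ValueError(f"integrity failure at chunk {index}")
        prev = tag  # the verified tag becomes the input to the next link
        yield chunk
```

Because every tag depends on the nonce and on all earlier tags, chunks
replayed from another message or delivered out of order fail to verify,
and a client can use each chunk as soon as it checks out. A chain like
this does not by itself detect truncation at a chunk boundary; that
would need some end-of-message marker.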

We have missed the -00 cutoff date, so I can't write this up as a
separate draft, but I can add it to my omnibroker draft. I will move
the integrity header stuff out to an appendix.


> However, if integrity protection is provided by
> either a trailer or a header, clients must firstly store all
> received data without further processing, check integrity,
> and then decide whether to accept.

That is certainly a reasonable issue for the browser display use case.


> To this purpose, chunk-based integrity protection is
> even useful for static contents for which hash values can be
> pre-computed.
>
> "Re-inventing the wheel" issue is still alive :-)

I don't think so.

At the moment we have people busy re-inventing this particular wheel
as content-payload. We have XML Signature, and we have WS-*.

Those have all become big and heavy, and a big part of the reason is
that they re-invent features that properly belong in the presentation
layer (i.e. HTTP in this case).

This is not a signature header, it is an integrity check. There is a
very important distinction. The only reason to use a signature over a
MAC is to get non-repudiation. But that means nothing unless you save
the bits that were signed.
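
To make the MAC side of that distinction concrete, here is a small
sketch. The helper name `mac_tag` and the use of HMAC-SHA-256 are
assumptions for illustration only. Both ends hold the same key, so
either one can produce a valid tag: the receiver can confirm the body
was not altered in transit, but cannot prove to a third party who
created it.

```python
import hashlib
import hmac


def mac_tag(key: bytes, body: bytes) -> str:
    """Return a hex HMAC-SHA-256 tag over the body (hypothetical helper)."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


shared_key = b"example shared secret"   # assumed to be known to both ends
body = b"payload bytes"

sender_tag = mac_tag(shared_key, body)
receiver_tag = mac_tag(shared_key, body)

# The receiver can verify integrity...
integrity_ok = hmac.compare_digest(sender_tag, receiver_tag)

# ...but could equally have generated the tag itself, so the tag is no
# evidence of origin. A party without the key cannot produce a match.
outsider_tag = mac_tag(b"some other key", body)
outsider_matches = hmac.compare_digest(sender_tag, outsider_tag)
```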


> P.S.
> As a personal feeling, the value of integrity protection at
> this moment is more on protection against intentional content-forging
> attacks, rather than unintentional communication failure
> (except premature termination of streams).
> To this extent, re-requesting the broken chunk is personally out of
> my interest.  This may need discussion, because it may affect
> the design of inter-chunk chaining of integrity signatures.


I would see an integrity fault as being the same as a broken
connection. HTTP already has some consideration for that problem
(range requests). If any further mechanism is desired, people can add
it separately. But it is a totally different issue.
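
For illustration, recovery with a range request could be sketched as
below. The function names are hypothetical, and the sketch assumes that
the verified chunk sizes map directly onto byte offsets of the
representation being requested.

```python
def range_for_resume(chunk_sizes, verified_count):
    """Range header asking for everything after the last verified chunk."""
    start = sum(chunk_sizes[:verified_count])
    return {"Range": f"bytes={start}-"}


def range_for_chunk(chunk_sizes, bad_index):
    """Range header asking only for the bytes of one failed chunk."""
    start = sum(chunk_sizes[:bad_index])
    end = start + chunk_sizes[bad_index] - 1  # byte ranges are inclusive
    return {"Range": f"bytes={start}-{end}"}
```

The first form treats the fault exactly like a dropped connection; the
second corresponds to the per-chunk re-request idea quoted further down.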


> 2012/7/11 Ludin, Stephen <sludin@akamai.com>:
>> I really like the idea of placing the Digest in the chunk trailers.  Being
>> able to calculate these digests on the fly and not buffer the entire
>> message is critical in my opinion.
>>
>> Another concept that I have been playing with is providing digests on
>> individual chunks using chunk-extension.  The rationale for this is for
>> very large objects.  With per-chunk digests the client would have the
>> ability to re-request a specific corrupted section of an object using a
>> range request rather than the entire object.  This can have enormous
>> perceived performance and reliability benefits for consumers of things
>> such as software download and large media files.
>>
>> I was working on a draft to propose this, but I didn't feel it was well
>> baked enough to share.  If there is interest in this type of functionality
>> I will polish it up and post it.
>>
>> One issue to point out is that for these types of "frame" based integrity
>> checks I generally feel like we are reinventing the content integrity
>> portion of SSL/TLS.  Though I see the value in being able to do this apart
>> from SSL it forces the question at what point do you just switch over to
>> SSL to get the desired functionality?
>>
>> -stephen
>>
>>
>>
>> On 7/9/12 4:00 PM, "Amos Jeffries" <squid3@treenet.co.nz> wrote:
>>
>>>On 10.07.2012 07:08, HAYASHI, Tatsuya wrote:
>>>> +1
>>>>
>>>> I know that this is demanded.
>>>> When I discuss http-authentication and phishing,
>>>> it is requested by many people.
>>>> It is a difficult problem.
>>>> ex) proxy...
>>>>
>>>> I think that it is good to do this discussion now.
>>>>
>>>> ---
>>>> Tatsuya
>>>>
>>>> On Sat, Jul 7, 2012 at 8:23 AM, James M Snell wrote:
>>>>> In general, I'm +1 on the general idea albeit with a few caveats...
>>>>>
>>>>> 1. To minimize complexity, only a single Content-Integrity header
>>>>> should be used. I don't want, as Roy points out, to have to iterate
>>>>> through a bunch of unsupported header values looking for the one I
>>>>> want. Just as it makes very little sense for an implementor to
>>>>> provide
>>>>> multiple Last-Modified, Etag and Content-Type headers in a single
>>>>> message; there should be only a single Content-Integrity statement
>>>>> and
>>>>> I either understand it or I don't.
>>>
>>>Either the client advertises what it supports (opening itself to
>>>middleware erasing options they can't modify). Or the server uses
>>>multiple algorithms in hopes that the middleware cannot violate them
>>>all.
>>>It makes perfect sense to have several levels of integrity check. MD5,
>>>SHA1, AES in one response and allow the client to validate the strongest
>>>it can handle.
>>>
>>>There is also an arguable case for middleware wanting to add its own
>>>hash to inform the client essentially "this is what I got given". So the
>>>point of manipulation can be back-traced when the more secure end-to-end
>>>checks fail.
>>>
>>>If you want end-to-end integrity, don't stop at half measures.
>>>Particularly at half measures which can be corrupted.
>>>
>>>
>>>>>
>>>>> 2. The performance impact of calculating the digest needs to be
>>>>> carefully considered. I'd rather not be required to buffer a full
>>>>> representation in memory all the time just to calculate a header
>>>>> value. I know it's largely unavoidable, but perhaps there's some
>>>>> currently elusive solution that can be considered. For instance..
>>>>> allowing Content-Integrity to appear as a trailer at the end of a
>>>>> chunked response.
>>>
>>>As has been said Trailers happen here.
>>>
>>>>>
>>>>> 3. Something needs to be said about what happens if the
>>>>> Content-Integrity check fails. For instance, if a request containing
>>>>> Content-Integrity is sent to the server and the server detects that
>>>>> the signature is invalid, what should happen? what must happen?
>>>>> Likewise, how are intermediaries expected to treat the
>>>>> Content-Integrity header given that any intermediary is able to
>>>>> modify
>>>>> the payload at any time?
>>>
>>>This is going to be most useful on request/responses sent with
>>>"no-transform" of course.
>>>  If the integrity was only an MD5 or SHA1 hash which middleware can edit
>>>easily there is no end-to-end integrity, just hop-by-hop integrity.
>>>
>>>
>>>Also, there has to be a mutual secret between origin server and client.
>>>Without that, when integrity is compromised the transforming hop will
>>>simply erase or replace the Content-Integrity header value. A secret key
>>>unknown to that middleware is required to make the integrity hash break
>>>when it tries this.
>>>
>>>AYJ
>>>
>>
>>
>
>
>
> --
> Yutaka OIWA, Ph.D.              Leader, Software Reliability Research Group
>                              Research Institute for Secure Systems (RISEC)
>    National Institute of Advanced Industrial Science and Technology (AIST)
>                      Mail addresses: <y.oiwa@aist.go.jp>, <yutaka@oiwa.jp>
> OpenPGP: id[440546B5] fp[7C9F 723A 7559 3246 229D  3139 8677 9BD2 4405 46B5]
>



-- 
Website: http://hallambaker.com/
Received on Wednesday, 11 July 2012 13:48:02 GMT
