Re: Content-Integrity header

On Fri, Jul 6, 2012 at 4:02 PM, Roy T. Fielding <fielding@gbiv.com> wrote:
> On Jul 6, 2012, at 11:43 AM, Phillip Hallam-Baker wrote:
>
>> HTTP 1.0 specifies a header Content-MD5.
>
> Actually, HTTP/1.1 (draft-ietf-http-v11-spec-02.txt, April 1995).
>
>> This is a bad header for all sorts of reasons, not least that the
>> header was added despite the fact that I told people that MD5 had just
>> been severely compromised.
>
> Compromised in 1995?  I don't think so, and certainly not in a way
> that would have affected a MIC for *accidental* modification
> (which is all it was intended to be).

Yes. Dobbertin circulated his paper showing an initial collision result
in private before the header was put in the spec. I remember arguing
against the header at the time, knowing that MD5 was broken.

>  It was never intended for
> authentication because the field can be altered by an attacker
> just as easily as the content.

It is not intended for authentication, but the ability to create
collisions does have some very significant security consequences.

>> In particular there is only one digest
>> algorithm specified and no way to use others.
>
> Of course there is another way -- just define another field for
> that algorithm.  Defining a new content-sha256 (or whatever) field
> is just as effective (and efficient) as trying to put all integrity
> checks into a single field with an alg parameter.  Moreover, the
> new field definition could be used without worrying about all of
> the broken implementations of poorly defined integrity checks
> like content-md5.
>
>> A better approach would be:
>>
>> Content-Integrity: <base64-value> ;alg=<ID>
>
> That is another approach.  It means that a recipient that only
> understands alg A has to parse each content-integrity field-value
> to look for a matching A parameter, as opposed to simply checking
> for the presence of a content-A field.  It has benefit if we expect
> integrity checks to be performed by a multi-algo processor like
> openssl, but is a nuisance if we expect only one or two algorithms
> to be in common use at the same time.

If you are using something like IIS, where the header values are mapped
to class members or attributes, adding a new header is a big deal;
adding a parameter value is not.
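
To make the parsing cost concrete, here is a minimal Python sketch of a
recipient handling the proposed field. The alg IDs, the ALGORITHMS table
and the function names are assumptions made up for illustration; the
proposal does not define a registry or an API.

```python
import base64
import hashlib
import hmac

# Hypothetical alg IDs; no registry is defined in this thread.
ALGORITHMS = {"sha-256": hashlib.sha256, "sha-512": hashlib.sha512}

def parse_content_integrity(field_value):
    """Split '<base64-value> ;alg=<ID>' into (raw_bytes, params)."""
    value, *params = [p.strip() for p in field_value.split(";")]
    parsed = {}
    for p in params:
        name, _, val = p.partition("=")
        parsed[name.strip().lower()] = val.strip()
    return base64.b64decode(value), parsed

def verify(field_value, content):
    """Return True/False for a known alg, None if the alg is unknown."""
    digest, params = parse_content_integrity(field_value)
    make_hash = ALGORITHMS.get(params.get("alg", "").lower())
    if make_hash is None:
        return None  # unknown algorithm: the recipient cannot check
    return hmac.compare_digest(make_hash(content).digest(), digest)
```

A recipient that only understands one algorithm still runs the same few
lines; it just gets None back for the others.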

>> For many Web Services, what we would like to do is to specify a MAC
>> rather than a digest and to use a preshared key identified by some
>> form of kerberos-like ticket. We don't need to specify the internal
>> structure of the ticket, just accept that we will have some sort of
>> key exchange service interaction at the end of which the client ends
>> up with a shared secret, an algorithm and an identifier that can be
>> passed to the service where the interaction takes place:
>>
>> Content-Integrity: <base64-value> ;ticket=<ticket-data>
>>
>> Note that the method of establishing that ticket in the first place is
>> out of scope for HTTP. Just think of it as something akin to a cookie.
>
> Why implement a MAC using the same field as a MIC?
> Digital signatures are much harder to implement.

This is not a digital signature. In fact, non-repudiation is quite often
an undesirable property.

> Why don't the existing digital signature mechanisms suffice?
> I know they haven't been deployed well for HTTP, but I don't know
> why other than the expired patents and chicken/egg problems.

Because they are digital signature mechanisms, for a start. That means
they are tied to things like PKI, which really belong at the application
layer, not the platform layer.

> Why do you think a generic integrity check processor would want
> to implement anything that complex?

MACs are not complex; see Kerberos.
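
As a rough illustration, the whole of the ticket variant fits in a few
lines of Python using HMAC. The TICKETS table, the ticket string and the
function names are hypothetical; how the ticket and shared secret get
established is out of scope, as noted above, and HMAC-SHA-256 is just one
possible choice of MAC.

```python
import base64
import hashlib
import hmac

# Hypothetical table mapping opaque ticket data to (shared secret, hash).
# Populating it is the job of the key exchange service, not of HTTP.
TICKETS = {"t-1234": (b"shared-secret-from-key-exchange", hashlib.sha256)}

def make_field(content, ticket):
    """Build '<base64-value> ;ticket=<ticket-data>' for the given content."""
    key, digestmod = TICKETS[ticket]
    mac = hmac.new(key, content, digestmod).digest()
    return base64.b64encode(mac).decode("ascii") + " ;ticket=" + ticket

def check_field(field_value, content):
    """Recompute the MAC under the ticket's key and compare."""
    value, _, param = field_value.partition(";")
    name, _, ticket = param.strip().partition("=")
    if name.strip().lower() != "ticket" or ticket not in TICKETS:
        return False
    key, digestmod = TICKETS[ticket]
    expected = hmac.new(key, content, digestmod).digest()
    return hmac.compare_digest(expected, base64.b64decode(value.strip()))
```

The only difference from the digest case is the key lookup.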

> I agree that some common Dsig standard is needed, and that a good
> way to do that is to simply assume that the shared key is already
> known via other means, but I don't see any need to tie that to a
> generic integrity check field when two different fields would be
> easier to define, argue, and implement.

One of the crossover cases is where a MAC is used with a specified
key. This has some interesting properties as it allows a proof of
knowledge of a content item to be established.

For example, say we have a HEAD request that specifies a key to be
used to calculate a MAC value. That gives us a lightweight means of
checking whether the service actually stores the content it claims to.
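
A sketch of that exchange in Python, under stated assumptions: the key is
a fresh random value chosen by the requester, the MAC is HMAC-SHA-256,
and the checker holds its own copy of the content. How the key would be
carried in the HEAD request is not specified here, and the function names
are made up for illustration.

```python
import hashlib
import hmac
import os

def challenge():
    """Requester picks a fresh random key for each probe."""
    return os.urandom(32)

def respond(key, stored_content):
    """Service computes the MAC over what it actually stores."""
    return hmac.new(key, stored_content, hashlib.sha256).digest()

def holds_content(key, response, local_copy):
    """Requester recomputes the MAC over its own copy and compares."""
    return hmac.compare_digest(respond(key, local_copy), response)
```

Because the key changes on every probe, the service cannot answer from a
cached digest; it has to have the bytes.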


> In any case, the hard part is deciding what content is included
> in the hash.  Sometimes folks want to include the whole representation
> (as it exists at the source), others want to include only the payload
> of the current message, and still others want the payload and some
> odd set of header fields.  And then we start talking about canonical
> forms, and pretty soon we are just sick of arguing about it and ship
> some gunk that doesn't actually solve a real problem.

No, the Content-Integrity header would only cover the content and
nothing else. That is why it is 'Content-Integrity' and not
'Content-and-random-headers-Integrity'.

> Trying to do everything at once is a recipe for madness.  Just pick
> a very specific problem to solve, solve it the most efficient way
> for that problem, and use a field name specific to the solution.
> If it turns out to be useful beyond that, that's gravy.

Then we end up with the detritus of everyone's legacy one-off
solutions like Content-MD5.


-- 
Website: http://hallambaker.com/

Received on Friday, 6 July 2012 21:16:38 UTC