Re: [integrity]: latency tradeoffs from Mike West on 2014-01-15 (public-webappsec@w3.org from January 2014)

From: Mike West <mkwst@google.com>
Date: Wed, 15 Jan 2014 10:16:49 +0100
To: Joel Weinberger <jww@chromium.org>
Cc: Adam Langley <agl@google.com>, "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <CAKXHy=f4c1v6ccyVDUsp_CNTP2CQ62O_EytyXshsq20=tojE1A@mail.gmail.com>

We have no such syntax. We'd need to come up with something.

-mike

--
Mike West <mkwst@google.com>
Google+: https://mkw.st/+, Twitter: @mikewest, Cell: +49 162 10 255 91

Google Germany GmbH, Dienerstrasse 12, 80331 München, Germany
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Graham Law, Christine Elizabeth Flores
(Sorry; I'm legally required to add this exciting detail to emails. Bleh.)


On Tue, Jan 14, 2014 at 11:51 PM, Joel Weinberger <jww@chromium.org> wrote:

> This is a great point. Are there any cases in CSP currently where we have
> a syntax for describing Merkle trees, or would we need to come up with one?
> I think it definitely makes sense to support such structures.
> --Joel
>
>
> On Tue, Jan 14, 2014 at 12:08 PM, Adam Langley <agl@google.com> wrote:
>
>> Current examples seem to be using a single hash to authenticate a
>> whole resource. However, that requires that the whole resource be
>> buffered before any of it can be used. This extra latency might well
>> outweigh any performance benefits that one might wish to gain by using
>> integrity.
>>
>> This problem isn't fundamental. If we split the resource into chunks
>> then we have long known how to authenticate a number of objects with a
>> single hash using a Merkle tree. Consider a binary tree where the
>> leaves are hashes of chunks and the interior nodes are hashes of their
>> children. (For details, see
>> http://tools.ietf.org/html/rfc6962#section-2.1 or many other sources.)
>>
>> Given a Merkle tree, each chunk in the resource can be proceeded by a
>> logarithmic number of hashes which give all the missing nodes up the
>> tree to the root and, by second-preimage resistance, a single hash in
>> the source document can authenticate any chunk-aligned subset of the
>> resource.
>>
>> The chunk size is the amount of data that must be buffered before
>> being processed. A smaller chunk means lower latency, but more
>> overhead.
>>
>> For resources that aren't accessed with Range requests, we can
>> simplify things a little by assuming that it'll always be received in
>> order.
>>
>> Rather than a balanced binary tree for the Merkle tree, the tree can
>> be made completely unbalanced. Specifically, the left child of every
>> node is a data chunk and the right node (if any) is an interior node.
>> This means that the root hash covers the first chunk and the next
>> interior node. If the second chunk is preceded by this interior node,
>> and so on, then the whole file is secured with only a single hash per
>> chunk.
>>
>> Both of the above require that the resource data itself be altered to
>> add extra data. This means that a resource suitable for integrity
>> cannot be used without it and vice versa. If this is unacceptable in
>> some cases then it's very easy to put a number of hashes straight into
>> the HTML: all the interior nodes of a Merkle tree could be given. The
>> downside is that a large amount of hash data might delay loading of
>> the remainder of the HTML.
>>
>> In short, this means that the attribute should be something like:
>>
>> integrity="8192;unbalanced;progressive;sha-256;6b4j122..."
>>
>> I.e. it contains the chunk size, the tree shape, whether the hashes
>> are inline in the HTML or loaded progressively in the resource data,
>> the hash function and the data itself. (I'm not trying to propose a
>> concrete syntax.)
>>
>>
>> Cheers
>>
>> AGL
>>
>>
>

Received on Wednesday, 15 January 2014 09:17:38 UTC