- From: Joel Weinberger <jww@chromium.org>
- Date: Tue, 14 Jan 2014 14:51:51 -0800
- To: Adam Langley <agl@google.com>
- Cc: "public-webappsec@w3.org" <public-webappsec@w3.org>
- Message-ID: <CAHQV2KmUZOgFr9MP7-bwYPVMStNiJi_K9b=0_vAUmjWiF3u+Hw@mail.gmail.com>
This is a great point. Are there any cases in CSP currently where we have a syntax for describing Merkle trees, or would we need to come up with one? I think it definitely makes sense to support such structures. --Joel On Tue, Jan 14, 2014 at 12:08 PM, Adam Langley <agl@google.com> wrote: > Current examples seem to be using a single hash to authenticate a > whole resource. However, that requires that the whole resource be > buffered before any of it can be used. This extra latency might well > outweigh any performance benefits that one might wish to gain by using > integrity. > > This problem isn't fundamental. If we split the resource into chunks > then we have long known how to authenticate a number of objects with a > single hash using a Merkle tree. Consider a binary tree where the > leaves are hashes of chunks and the interior nodes are hashes of their > children. (For details, see > http://tools.ietf.org/html/rfc6962#section-2.1 or many other sources.) > > Given a Merkle tree, each chunk in the resource can be proceeded by a > logarithmic number of hashes which give all the missing nodes up the > tree to the root and, by second-preimage resistance, a single hash in > the source document can authenticate any chunk-aligned subset of the > resource. > > The chunk size is the amount of data that must be buffered before > being processed. A smaller chunk means lower latency, but more > overhead. > > For resources that aren't accessed with Range requests, we can > simplify things a little by assuming that it'll always be received in > order. > > Rather than a balanced binary tree for the Merkle tree, the tree can > be made completely unbalanced. Specifically, the left child of every > node is a data chunk and the right node (if any) is an interior node. > This means that the root hash covers the first chunk and the next > interior node. If the second chunk is preceded by this interior node, > and so on, then the whole file is secured with only a single hash per > chunk. > > Both of the above require that the resource data itself be altered to > add extra data. This means that a resource suitable for integrity > cannot be used without it and vice versa. If this is unacceptable in > some cases then it's very easy to put a number of hashes straight into > the HTML: all the interior nodes of a Merkle tree could be given. The > downside is that a large amount of hash data might delay loading of > the remainder of the HTML. > > In short, this means that the attribute should be something like: > > integrity="8192;unbalanced;progressive;sha-256;6b4j122..." > > I.e. it contains the chunk size, the tree shape, whether the hashes > are inline in the HTML or loaded progressively in the resource data, > the hash function and the data itself. (I'm not trying to propose a > concrete syntax.) > > > Cheers > > AGL > >
Received on Tuesday, 14 January 2014 22:52:19 UTC