- From: Mike West <mkwst@google.com>
- Date: Wed, 15 Jan 2014 10:16:28 +0100
- To: Adam Langley <agl@google.com>
- Cc: "public-webappsec@w3.org" <public-webappsec@w3.org>
- Message-ID: <CAKXHy=f0QUBzgfiQwjVBAeZQ1ty9NxhX9uF0a8yp9cCAz0_Ftg@mail.gmail.com>
Adam, you hurt my brain. I need to go read up on Merkle trees. :)

On Tue, Jan 14, 2014 at 9:08 PM, Adam Langley <agl@google.com> wrote:

> Current examples seem to be using a single hash to authenticate a
> whole resource. However, that requires that the whole resource be
> buffered before any of it can be used. This extra latency might well
> outweigh any performance benefits that one might wish to gain by using
> integrity.

1. Performance isn't the goal. Integrity is the goal.

2. I think the performance benefits of integrity would be focused on cache. That is, the second load of a resource, regardless of its URL, could avoid hitting the network entirely if we already have a matching resource locally. For this case, we have the whole resource already, by definition.

That said, it would be wonderful to avoid some of the obvious performance hits that result from verifying a resource only when the entire resource has been downloaded. This approach could, for instance, allow us to kick out of a download early if we can detect a hash mismatch in the middle of a file rather than at the end, or to start parsing an HTML document in an IFrame.

> Both of the above require that the resource data itself be altered to
> add extra data. This means that a resource suitable for integrity
> cannot be used without it and vice versa.

I think this is problematic in most (all?) cases, given the nature of the threat we're attempting to address. Trusting the resource to authenticate itself doesn't provide much benefit if we're not sure we can trust the resource in the first place.

That said, if I've understood you correctly, we could put only the initial hash into the HTML document, and subsequent hashes into the resource? I'm not sure what that would look like on disk, or how we would best be able to communicate the hashes alongside the resource stream, but it's well worth considering as an alternative to one-hash-one-resource.

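To make the "initial hash in the HTML, subsequent hashes in the resource" idea concrete, here is a minimal sketch of one way it could work. It is not a proposal for the actual wire format: the packet layout, the all-zero end marker, the choice of SHA-256, and the names `encode` and `verify_stream` are all assumptions made for illustration. Each packet carries one block plus the digest of the next packet, so only the digest of the first packet needs to live in the HTML, and a mismatch is caught at the first bad packet rather than at the end of the download.

```python
import hashlib
from typing import Iterable, Iterator, List, Tuple

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes
END_MARKER = bytes(DIGEST_SIZE)  # hypothetical "no next packet" sentinel


def encode(resource: bytes, block_size: int) -> Tuple[bytes, List[bytes]]:
    """Return (root_digest, packets).

    Each packet is a block followed by the digest of the next packet;
    the final packet is followed by END_MARKER instead.
    """
    blocks = [resource[i:i + block_size]
              for i in range(0, len(resource), block_size)] or [b""]
    packets: List[bytes] = []
    next_digest = END_MARKER
    for block in reversed(blocks):  # the chain is built back to front
        packet = block + next_digest
        packets.append(packet)
        next_digest = hashlib.sha256(packet).digest()
    packets.reverse()
    return next_digest, packets  # next_digest is now the first packet's digest


def verify_stream(root_digest: bytes, packets: Iterable[bytes]) -> Iterator[bytes]:
    """Yield each block as soon as its packet verifies; raise on the first mismatch."""
    expected = root_digest
    for packet in packets:
        if expected == END_MARKER:
            raise ValueError("data after final packet")
        if hashlib.sha256(packet).digest() != expected:
            raise ValueError("hash mismatch; abort download")
        yield packet[:-DIGEST_SIZE]       # verified payload, usable immediately
        expected = packet[-DIGEST_SIZE:]  # digest the next packet must match
    if expected != END_MARKER:
        raise ValueError("stream truncated")
```

Note that this sketch illustrates exactly the objection quoted above: the packets are no longer the plain resource, so a consumer that does not understand the framing cannot use them as-is.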
> If this is unacceptable in
> some cases then it's very easy to put a number of hashes straight into
> the HTML: all the interior nodes of a Merkle tree could be given. The
> downside is that a large amount of hash data might delay loading of
> the remainder of the HTML.

It would be interesting to evaluate how much overhead this would produce in the worst case. It sounds significant (a few percent, depending on block size and digest size).

-mike
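For a rough feel of the "few percent" figure, the arithmetic can be sketched as below. This assumes the simplest layout, one raw digest per fixed-size block, rather than a full set of Merkle interior nodes, and it ignores any text encoding of the digests in the HTML (hex would double the size). The function name `hash_overhead` and the example sizes are illustrative only.

```python
import math


def hash_overhead(resource_size: int, block_size: int, digest_size: int) -> float:
    """Fraction of extra bytes when one raw digest is stored per block (assumed layout)."""
    num_blocks = math.ceil(resource_size / block_size)
    return num_blocks * digest_size / resource_size


# 1 MiB resource with 32-byte (SHA-256-sized) digests
one_mib = 1024 * 1024
small_blocks = hash_overhead(one_mib, 1024, 32)  # 1 KiB blocks
large_blocks = hash_overhead(one_mib, 4096, 32)  # 4 KiB blocks
```

Under these assumptions the overhead is simply digest size divided by block size: about 3.1% with 1 KiB blocks and about 0.8% with 4 KiB blocks.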
Received on Wednesday, 15 January 2014 09:17:16 UTC