W3C home > Mailing lists > Public > public-webappsec@w3.org > October 2014

Re: [integrity] Different ways to associate integrity information

From: Mark Nottingham <mnot@mnot.net>
Date: Mon, 27 Oct 2014 13:43:34 -0700
Cc: Mike West <mkwst@google.com>, WebAppSec WG <public-webappsec@w3.org>
Message-Id: <FAF4A63A-1E9E-4E0A-AA3B-D2F5A611475F@mnot.net>
To: Brad Hill <hillbrad@gmail.com>

> On 24 Oct 2014, at 10:40 am, Brad Hill <hillbrad@gmail.com> wrote:
> Mike: I think Mark is imagining a header delivered by the parent resource.

Yes. I’m pointing out, in a roundabout way, that the use cases for a hash of the contents of a link are likely to be diverse, and that the constraints we put on the first (I won’t call it “primary”) use — integrity of third-party resources — may not make sense for other uses.

> Mark: At that point, why not use a manifest
> (https://w3c.github.io/manifest/) and find a way to add hashes there?
> It seems that what you'd have to deliver over a header is pretty close
> to that information set.

Yep, something like that would work...

> I've been trying to constrain the group's ambitions here (and Mike,
> Joel, Dev and Freddy have been good at further constraining my own) on
> this sort of thing.  I want to see if this approach is at all
> interesting to a meaningful set of web publishers, and if it is
> manageable by them.  Certainly SRI does introduce fragility.
> I wonder if anyone really has the operational capacity to manage a
> manifest such as Mark proposes and still meaningfully vet that the
> hashes are for authentic content?  A publisher could scrape their
> subresources and update automatically, but then the protection
> devolves to individually targeted attacks on the network, which HTTPS
> should address - any malicious changes at the origin server, e.g. if
> it was compromised, would be automatically propagated into the
> manifest by such tooling.
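(For concreteness: the automated digest tooling Brad describes is only a few lines of code, which is precisely why it can't vet authenticity on its own. A sketch in Python, with a hypothetical resource standing in for a fetched subresource; SRI-style integrity metadata is the base64-encoded hash of the raw resource bytes, prefixed with the algorithm name:)

```python
import base64
import hashlib

def sri_digest(data: bytes, alg: str = "sha256") -> str:
    # SRI-style integrity metadata: "<alg>-<base64(hash(bytes))>"
    h = hashlib.new(alg, data).digest()
    return f"{alg}-{base64.b64encode(h).decode('ascii')}"

# A scraper that blindly regenerates a manifest propagates whatever the
# origin currently serves -- compromised or not. The resource body here
# is an in-memory stand-in; a real tool would fetch each subresource.
resources = {"js/foolib.1.2.3.js": b"window.foolib = {};"}  # hypothetical
manifest = {url: sri_digest(body) for url, body in resources.items()}
```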

Well, that’s the thing. Content-addressable caching is a very different use case; it’s not about assuring authentic content, it’s an optimisation. In the case where you don’t get a cache hit on CAN, you simply follow the link to the origin and probably *don’t* display an error if the hash still doesn’t match; that’s the most backwards-compatible, sane thing to do. However, that pretty fundamentally disagrees with the first use case for SRI.

E.g., an intermediary (e.g., a CDN) could scrape content and create manifests to enable CAN for referenced content; this would be of significant value for performance, but of course would have zero value for end-to-end integrity.

This makes me wonder whether CAN is really something separate from SRI (as much as I’d like to see it get some traction).


> So I think as a first step, it's best to address the easiest and
> highest possible value cases first - large JS libraries, commonly
> downloaded over CDNs, for which most resource authors already want
> stable versioning independent of any security threats.  It's
> serendipitous that this also happens to be a use case where
> content-addressable-caching could also provide a large benefit.
> If we can make that work without horrible security and privacy
> side-effects, and people use it and like it and it doesn't make the
> web horribly brittle, then we can take the next baby steps.
> The directions of those baby steps also can be guided by at least
> three major motivations, which we should probably discuss as part of
> our rechartering effort:
> 1) Decrease the performance and other costs associated with delivering
> an all-secure web.  TLS is very cheap, but caching is still a big
> deal, especially for people in remote locations living on very modest
> means.  There are over 5 billion people with no internet connectivity
> at all today, and these costs are meaningful to them.
> 2) Allow specification of applications built with web technologies and
> possibly delivered over the web that are more concrete and verifiable,
> perhaps with the intent of being able to grant more sensitive
> permissions to such applications.  I think that the SysApps group and
> work on app manifests that I pointed to above is important to consider
> for any such efforts, and perhaps we should cultivate more formal
> coordination on this front.
> 3) Reduce single points of failure for security on the web.  This has
> always been my main motivation.  How do we make it so that compromise
> of a single web server providing script libraries, analytics, sign-in,
> social widgets, or the like doesn't automatically transitively
> compromise the web applications of millions of sites that include
> script from those servers?   Again, next steps here don't necessarily
> entail adding more to SRI, but maybe providing better and less fragile
> privilege separation mechanisms for script.  (maybe better secure
> modularization in JS itself, or maybe pulling two scripts - an
> SRI-tagged interface layer that goes directly in your environment, and
> an implementation that gets forced into something like a cross-origin
> sandboxed worker.)
> -Brad
> On Fri, Oct 24, 2014 at 3:00 AM, Mike West <mkwst@google.com> wrote:
>> The security improvement we get from integrity checks comes from the fact
>> that the digest is delivered out-of-band with the resource. If jQuery's
>> server is compromised, it's only the sloppiest of attackers who would update
>> the resource, but not the headers.
>> It's not clear to me what benefit we'd obtain from a response header that
>> contained information that could be easily calculated from the resource
>> itself. Could you explain the use-case a little bit?
>> -mike
>> --
>> Mike West <mkwst@google.com>
>> Google+: https://mkw.st/+, Twitter: @mikewest, Cell: +49 162 10 255 91
>> Google Germany GmbH, Dienerstrasse 12, 80331 München, Germany
>> Registergericht und -nummer: Hamburg, HRB 86891
>> Sitz der Gesellschaft: Hamburg
>> Geschäftsführer: Graham Law, Christine Elizabeth Flores
>> (Sorry; I'm legally required to add this exciting detail to emails. Bleh.)
>> On Fri, Oct 24, 2014 at 5:47 AM, Mark Nottingham <mnot@mnot.net> wrote:
>>> Has there been any discussion of how the integrity information is
>>> associated with a resource?
>>> I think using the integrity attribute on the link makes sense for the most
>>> current use case -- assuring that off-site content (e.g., on a CDN) is what
>>> you think it's going to be. That's because in these cases, the URL is most
>>> likely to be a version-specific one (e.g.,
>>> <https://cdn.com/foolib.1.2.3.js>), so if the author wants to update the
>>> library version used, they'll need to update the link, and the integrity
>>> information is right next to it.
>>> However, in the cache reuse case -- which seems to be getting *some*
>>> traction (or at least consideration) -- next to the link is about the worst
>>> place the integrity information can go; if the author updates the library,
>>> they'll need to update each and every instance of a link to it, which can be
>>> quite onerous.
>>> In that use case, it makes more sense to put integrity information into
>>> HTTP headers or even a separate resource, so that it can more easily be
>>> updated (e.g., by a separate process, or automatically by the server at
>>> response time).
>>> So, I'm wondering if the WG would consider allowing integrity information
>>> to be carried in HTTP response headers (e.g., Link), at least for the cache
>>> reuse case.
>>> Cheers,
>>> --
>>> Mark Nottingham   https://www.mnot.net/
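(For illustration, integrity metadata carried in a Link response header — as proposed above — might be serialized along these lines. The syntax is entirely hypothetical; no integrity parameter on Link is specified, and the point is only that a header can be rewritten by a separate process without touching the markup:)

```python
def link_header(url: str, integrity: str, rel: str = "prefetch") -> str:
    # Hypothetical serialization: a Link header carrying the digest out
    # of band, so server-side tooling can regenerate it at response time.
    return f'<{url}>; rel={rel}; integrity="{integrity}"'

hdr = link_header("/js/foolib.1.2.3.js", "sha256-xyz")
```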

Mark Nottingham   http://www.mnot.net/
Received on Monday, 27 October 2014 20:43:59 UTC
