Re: "Subresource Integrity" spec up for review.

First, I'd also like to thank Ryan for his review and comments.

Here's a shot at defending the "mixed-content with integrity" use-case. (#6
in Ryan's objections)

First, it's just not true to say that there are no meaningful performance
issues with HTTPS.  Caching is a pretty huge thing in terms of performance
and efficiency.  HTTPS is getting cheaper all the time, but it's still a
really big deal vs. HTTP when you're trying to reach a global audience.
RTTs and latency matter.  And even if the hardware cost of encryption is
cheap, the cost of building out low-latency endpoints all over the globe
that are physically secure and contractually/legally trustworthy enough to
keep a publicly-trusted certificate with your name on it isn't.

A few theses on the tradeoffs involved:

A) HTTPS does not provide any sort of by-design protections against traffic
analysis.  A user navigating around a website with a structure known to an
adversary is actually pretty distinguishable in their activities simply
from things like IP address, navigation timing, resource sizes, etc.
(e.g.
http://blog.ioactive.com/2012/02/ssl-traffic-analysis-on-google-maps.html)
 Some authors will go out of their way to try to make this harder, like
Twitter's padding the size of their avatar images, but this is a goal that
requires very special attention and such authors would surely know better
than to use this proposed mixed-content integrity mechanism.  I think
there's a fair case to be made that loading things that aren't actually
private (the Google doodle of the day, the standard pictures on the PayPal
home page) over HTTP with integrity is not significantly worse than the
status quo.  (Yes, HTTP/2 makes traffic analysis harder to some degree, but
that is still a darn long way from a formal guarantee of
indistinguishability of known plaintext.)

B) I think many content authors would say that their explicit goal in
loading resources like images, JS libraries, etc. over HTTPS is integrity
and not privacy.  Some of us on this list may have our own opinions and
goals about trying to raise the cost of pervasive surveillance, but that
isn't everyone's goal for their particular application, or at least not
their most important goal. I think we, as web standards authors, should be
careful in how much we try to dictate these goals to our customers instead
of letting them choose the things they want and need for themselves.

C) Yes, distributed edge-caching over HTTPS is a real thing today, but that
usually involves delegating to some third party the right to impersonate
you (e.g. putting your name as a Subject Alt Name on a certificate they
control).  If we are genuinely worried about state-level attacks against
various parts of end-to-end web security, these third parties look like a
very attractive target for compulsion (especially as Certificate
Transparency gets going).  If a site can keep closer control over its
public authenticity credentials, in fewer jurisdictions and on many fewer
servers, by using caching services in a "trust-but-verify" manner, instead
of today's much more expansive grant of trust, perhaps we have achieved a
substantial improvement after all.
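
As a concrete sketch of the "trust-but-verify" shape I have in mind (the
host names and digest below are placeholders, and the attribute syntax is
only illustrative of the strawman's idea, not its exact grammar): the
origin publishes the digest of the library it reviewed, and the bytes
themselves can come from any cache:

  <!-- This markup is served from the origin over HTTPS.  The library -->
  <!-- bytes come from an untrusted edge cache, but the UA only runs  -->
  <!-- them if they match the digest the origin vouched for here.     -->
  <script src="http://cache.example-cdn.com/framework-1.2.3.js"
      integrity="sha256-jzwN0Ib2XBjeMLhA9hdCeKRkGYqxONbpCQ06BUlbvE0=">
  </script>

The cache can serve those bytes from any box in any jurisdiction, but it
can't change them without the UA noticing, and it never holds a key or
certificate that lets it speak as the origin.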

-Brad



On Sat, Jan 11, 2014 at 4:19 AM, Mike West <mkwst@google.com> wrote:

> -security-dev to BCC
> +public-webappsec, Brad Hill
>
> I'm moving this thread to public-webappsec so that folks there can comment
> directly.
>
> On Sat, Jan 11, 2014 at 5:03 AM, Ryan Sleevi <rsleevi@chromium.org> wrote:
>>
>> On Wed, Jan 8, 2014 at 3:29 AM, Mike West <mkwst@chromium.org> wrote:
>>
>>> Hello, lovely friends of Chromium security!
>>>
>>> Frederik, Devdatta, Joel, and I have been working with folks in the
>>> webappsec WG to put together a specification of the ages-old idea of
>>> jamming hashes into an HTML page in order to verify the integrity of
>>> resources that page requests. A strawman draft is up at
>>> http://w3c.github.io/webappsec/specs/subresourceintegrity/ for review.
>>>
>>> Given that some of the proposals are interesting from a security
>>> perspective (in particular, using hashes as cache identifiers, and
>>> potentially relaxing mixed-content checks if the hashes are delivered over
>>> HTTPS), it'd be brilliant to get early feedback so we can make sure the
>>> spec is sane.
>>>
>>
>> I... have a hard time with this proposal and its use cases beyond the
>> first and third.
>>
>
> I think you'll find that #2 is really #1 in disguise.
>
>
>> I apologize that I don't have the bandwidth to jump into the fray and
>> really engage in the W3C group right now, but I hope you can convey the
>> message.
>>
>
> No worries. I appreciate the feedback, and we'll pick up the conversation
> on the W3C list. Please don't feel obligated to keep up with this thread;
> feel free to mute it.
>
>
>> +1 to 1) Site wants to ensure third-party code doesn't change from what
>> they reviewed. Cool
>>
>
> Good! This is more or less the core of what I want to achieve. Everything
> else is nice to have.
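>
> To make that concrete (the URL and digest below are placeholders, and the
> attribute syntax here is illustrative rather than the exact grammar in
> the draft): the author reviews a copy of a library on a third-party host,
> records its digest, and the UA refuses to execute the script if that host
> later serves different bytes:
>
>   <!-- Digest recorded at review time; changed bytes fail the check. -->
>   <script src="https://cdn.example.com/library-2.1.0.js"
>       integrity="sha256-C9gqFbEvY0f3rTqRlN9b0tYeYLBcVQkVdMv0fdsGEXc=">
>   </script>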
>
>
>> +wtf to 2) Site wants to ensure... code review? How is that an HTML
>> problem? How is it reasonable to induce the CPU costs on millions of users
>> to enforce what is ultimately a procedural problem at the company? How is
>> it in the interest of the users?
>>
>
> The use-case was written unclearly. I've rewritten it in the hopes of
> making it something you'd agree with.
>
> In short: an advertising network like Doubleclick delegates the actual
> delivery of advertising content to third-party servers, and relies on
> contractual obligations (and probably automated checks, etc) to ensure that
> the advertisement delivered is the advertisement that was reviewed. Those
> third-parties sometimes accidentally (or maliciously) deliver altered
> content. By adding integrity metadata to the iframe that wraps an ad, and
> by requiring the ad HTML to contain integrity metadata for subresources, ad
> networks can mitigate this risk.
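>
> Roughly (the hosts, file names, and digests below are placeholders, and
> the attribute syntax is illustrative rather than the exact grammar in the
> draft), the ad network's page would pin the reviewed ad document, and
> that document would in turn pin its own subresources:
>
>   <!-- In the ad network's page: pin the ad document that was reviewed. -->
>   <iframe src="https://ads.example.net/creative-4821.html"
>       integrity="sha256-T1y4BZunJ1XLfuIAfrGuuKvVcZcL4SU9rCOd62RxV5M=">
>   </iframe>
>
>   <!-- Inside creative-4821.html: pin the creative's own assets. -->
>   <img src="https://cdn.ads.example.net/banner-4821.png"
>       integrity="sha256-fn3uW0Jl8u7xOMIjQ2b9pGeoFhOgcQVb3t8q0n5cUqI=">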
>
>
>> +0 to 3) I'd say this is where you use HTTPS, especially in light of
>> discussions to 'downgrade' HTTP downloads.
>>
>
> 1. HTTPS gives different integrity promises: it verifies that the server
> you're talking to is the one you're expecting, and gives some protection
> against MITM alterations.
>
> 2. For the same reason that use-case #1 is valuable, even over HTTPS,
> validating download integrity is valuable.
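>
> As a sketch of the download case (the URL and digest are placeholders,
> and whether this hangs off <a download> or some other hook is exactly the
> kind of detail the strawman still has to settle):
>
>   <!-- Even over HTTPS, this would catch a package swapped out or -->
>   <!-- corrupted on the server after the author recorded its hash. -->
>   <a href="https://example.com/downloads/app-3.4.1.tar.gz" download
>       integrity="sha256-Qx7wLhV2n0m8p3kTfWvYb1sHc5eJgRzK4uNdA6oPiCM=">
>     Download app 3.4.1
>   </a>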
>
>
>> +? to 4) "altered Javascript from the filesystem" is certainly an
>> unrealistic threat that a UA cannot and should not pretend it can defend
>> against, unless that UA is running itself in a higher privilege than the
>> files it's accessing - in which case, it should be storing those files
>> securely.
>>
>
> Freddy's original proposal was meant to cover browser UI that loads script
> off the net directly. This would cover, for example, Chrome's NTP.
>
> I've removed the "filesystem" language, as I agree with your criticism.
>
>
>>  +wtfbbqomg to 5) We have had this amazing way of expressing versions
>> for resources since the introduction of HTTP. It's called the URL. If the
>> author wants to express a dependency on a particular version, the amazing
>> power of the web allows them to put a version within a URL and depend on
>> that.
>>
>
> This is probably also poorly worded: I believe the intent was another spin
> on #1. If I load a resource from a server, I'd like to ensure that it
> hasn't been swapped out behind my back. If it has been swapped out, the
> reporting functionality will alert me to the fact; I'll go review the new
> code and either update the integrity metadata or rework the mashup to use
> some other resource if I don't like the changes.
>
>
>> +awwhellnaw to 6) "performance reasons" is not and has not been a
>> realistic problem for properly-configured SSL for some time. The example
>> already establishes that the user supports some degree of
>> properly-configured SSL at rockin-resources.com, so there's no reason
>> *not* to use it.
>>
>
> I'm still hopeful that Brad (Hi Brad!) will take some time to give more
> detail around the value proposition for mixed content relaxation and the
> fallback mechanism. I think several of the editors share your opinion here.
>
>
>> The only 'performance reasons' I'm aware of are those ever-insidious
>> transparent caching proxies. Yes, they're ubiquitous. But if you're trying
>> to solve that problem, you need to come out and say it. "An author wishes
>> to load a resource that a person in a privileged position on the network
>> would prefer to intercept and redirect."
>>
>
> I'll work that in somewhere. I do think creating transparency into that
> manipulation is part of the intent: in much the same way that CSP showed
> folks like Twitter how much code was being injected into their HTML pages,
> integrity verification can make it clear how many resources are changed
> in-flight.
>
>
>> Especially in the nature of our Post-Snowden World, integrity without
>> privacy seems to be setting the bar too low.
>>
>
> The current suggestion is that any HTTP->HTTPS fallback system would omit
> credentials from requests to the HTTP server. I agree that that still opens
> some windows into your browsing activity that might be better left closed.
>
>
>> And integrity with privacy is easily obtained - with HTTPS.
>>
>
> I don't think that's the case, at least, not in the sense of verifying
> that the resource you're loading hasn't been altered on the server. HTTPS
> mitigates the risk of middlemen between you and the server you're
> talking to. It does nothing to verify that the server itself hasn't been
> compromised.
>
> Thanks again for spending some time on this, Ryan. I appreciate it.
>
> -mike
>
