Re: "Subresource Integrity" spec up for review.

On Tue, Jan 14, 2014 at 10:27 AM, Joel Weinberger <jww@chromium.org> wrote:

>
>
>
> On Tue, Jan 14, 2014 at 7:55 AM, Brad Hill <hillbrad@gmail.com> wrote:
>
>>
>>
>> On Mon, Jan 13, 2014 at 12:46 PM, Ryan Sleevi <rsleevi@chromium.org> wrote:
>>>
>>>  ...
>>>>
>>> I think it's a bit of a stretch to suggest that, because traffic analysis
>>> exists as a possibility, HTTPS provides limited-to-no privacy,  ...
>>>
>>>
>> That's a much stronger claim than I'm making. I'm only suggesting that
>> some resource loads aren't very privacy-sensitive to begin with, and
>> probably can be observed or inferred anyway over HTTPS, so I think there is
>> limited or no harm in performing them with only integrity protections.
>>  Granted, of course, we can't enforce in spec language or code that only
>> such resources would be used with this technology, but privacy is always a
>> matter of trusting your counterparty to be responsible.
>>
>>
>>>
>>>> ...
>>>>
>>> As browser vendors, we have an obligation to our users to ensure that
>>> their security is preserved, and, whenever both possible and reasonable,
>>> that their *expectations* of security are preserved.
>>>
>>> Today, there is a simple duality to the web. Either you're browsing over
>>> HTTP - in which case there is absolutely no security whatsoever - or you're
>>> browsing with HTTPS, which provides a combination of assertions about
>>> identity (namely, domain ownership), privacy, and integrity.
>>>
>>> If a user visits https://site.example and it loads sub-resources over
>>> HTTP with integrity protection - which is, at its core, the crux of #6 -
>>> what would or should the UI indicate? Is it reasonable to show the famed
>>> 'lock' icon in this case - even when the traffic is visible to an
>>> attacker/observer? Does that align with users' expectations? I don't think
>>> it does.
>>>
>>
>> I think this is a very good question indeed.  I appreciate the effort to
>> make clear security statements to the user, and the lock, while mysterious
>> in its inner workings, is about all we have right now.  I agree that it is
>> a bad idea to devalue it; doing so could erode trust in the web overall.
>>
>> Along these lines, I wonder about the integrity cache idea.  What's the
>> effective difference between allowing an HTTPS resource with the lock to be
>> composed from pieces that might've been fetched as part of a different
>> (secure or not) resource or delivered with an app, versus doing an
>> immediate fetch-with-integrity over insecure channels?  What are the actual
>> essential properties we're trying to communicate to the user with the lock,
>> and what violates them?  Just something to think about and discuss further,
>> since I like the integrity cache idea even more than I like the
>> mixed-content with integrity idea.
>>
>>
>>> You can always refer to edge-cache-controlled names within your resource
>>> loading URLs. If that doesn't work for various reasons (eg: SOP, CORS,
>>> etc), then you can always delegate a sub-domain, as many organizations are
>>> already doing.
>>>
>>
>> Good point.
>>
>>>
>>> If your threat model is state level attackers and/or legal compulsion,
>>> you can *still* use the integrity protected sub-resources - but deliver
>>> those resources over HTTPS. HTTPS avoids the mixed content, and provides
>>> real and meaningful integrity protection (eg: without worrying about the
>>> hash-collision implications of largely unstructured data like JS), and then
>>> this use case just fits into the #1/#2.
>>>
>> I still think that the integrity attribute is useful here, even if we
>> assume HTTPS, because the distributed nature of a CDN puts so many more
>> entities in a position of privilege, and if you're loading script,
>> importing HTML or even loading images, it's still your origin at the end of
>> the day from the user's perspective.
>>
> +1 to Brad's point here. From my perspective, this is, in fact, probably
> the most important part of the integrity spec. Without integrity,
> regardless of HTTPS or not, many websites are instilling trust in CDNs that
> is simply unnecessary. It creates many more attack vectors, and there's no
> reason it should. The integrity check allows the origin server to be the
> authoritative source of content. CDNs are reduced to what they were
> originally meant to be: content distribution only, with no authority. This
> seems extraordinarily useful to me, with HTTP or HTTPS.
>
>
Right,

Just in case it was not clear, I'm 100% on board with the integrity spec as
a way of dealing with untrusted hosters who may serve different content
than intended. I think the decentralized way of serving resources really
does need a way for the embedder to specify policy - whether that be simple
integrity or a more complex security policy (as proposed by
http://www.secure-links.org/ ).
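
To make the untrusted-hoster case concrete, here's a rough sketch (Python,
purely illustrative) of how an embedder might derive the integrity value it
publishes. The "algorithm-base64(digest)" serialization and the file name
are assumptions on my part, not anything the spec mandates:

    # Illustrative sketch only: how an embedder might derive the integrity
    # value it publishes for a third-party-hosted resource. The exact
    # serialization of the integrity value is still being discussed.
    import base64
    import hashlib

    def integrity_digest(content, algorithm="sha256"):
        # Hash the exact bytes of the resource and base64 the raw digest.
        digest = hashlib.new(algorithm, content).digest()
        return "%s-%s" % (algorithm, base64.b64encode(digest).decode("ascii"))

    with open("framework.js", "rb") as f:  # hypothetical resource
        print(integrity_digest(f.read()))

The embedder then publishes that value alongside the link to the hoster,
and the UA refuses to use any bytes that don't match it.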

That said, my concern is with the use case that positions it as an
alternative/replacement for HTTPS, or that allows mixed content within an
HTTPS context provided it's integrity protected. I think that's a shakier
use case, with ramifications not only for UI and user expectations, but
also for the processing model.

Example: Consider if sub-resource integrity was handled via MD5 - an
algorithm with known, computable collisions. In the "untrusted hoster"
scenario, it's the embedder who can and should make the security decision
about the integrity of links, and the worst that would happen when a hash
breaks (as MD5 has) is that it falls back to "the normal web" of no
sub-resource integrity. There are no implications for browser processing.
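
A sketch of what I mean, again purely illustrative (the deny-list and token
format are my assumptions): in this scenario a UA that stops trusting an
algorithm can simply behave as though the integrity metadata were absent.

    # Illustrative sketch: when a hash algorithm breaks in the untrusted-
    # hoster scenario, the UA can just ignore the metadata - the resource is
    # processed exactly as it would be with no integrity attribute at all.
    import base64
    import hashlib

    BROKEN_ALGORITHMS = {"md5"}  # hypothetical UA deny-list

    def passes_integrity(content, token):
        algorithm, _, expected = token.partition("-")
        if algorithm in BROKEN_ALGORITHMS:
            return True  # fall back to "the normal web": no integrity check
        actual = base64.b64encode(
            hashlib.new(algorithm, content).digest()).decode("ascii")
        return actual == expected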

Now consider if sub-resources were allowed to be mixed. First and foremost,
you'd still need to support HTTPS (for all "downlevel" browsers that aren't
subresource-integrity aware). When the chosen hash (eg: MD5) is broken, a UA
will disable that hash from being acceptable, and all updated users of
those browsers will find themselves fetching over HTTPS again (because the
HTTP version is unacceptable). This effectively means that, as a site
operator, you're always required to handle the capacity of the thundering
herd hitting your HTTPS deployment, and HTTP is just a "nice to have".
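
Contrast that with a sketch of the mixed-content fetch decision
(illustrative; the function and deny-list are my assumptions): the moment a
hash is disabled, every updated browser lands back on the HTTPS origin.

    # Illustrative sketch: in the mixed-content scenario, disabling a broken
    # hash changes where the bytes come from. Every updated browser falls
    # back to the HTTPS origin, so HTTPS capacity must always cover the full
    # load; the HTTP path is only ever an optimization.
    DISABLED_ALGORITHMS = {"md5"}  # hypothetical UA deny-list

    def choose_source(https_url, http_url, token):
        if token is not None:
            algorithm = token.partition("-")[0]
            if algorithm not in DISABLED_ALGORITHMS:
                return http_url  # integrity-protected fetch over plain HTTP
        return https_url  # downlevel browsers and broken hashes land here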

However, the reality is that sites will no doubt come to rely on UAs
preferring HTTP over HTTPS, and fail to provision HTTPS capacity
accordingly. That in turn makes it harder for UAs to disable the hash:
if sites haven't planned capacity, disabling the hash may make the UA
appear slower for users, because the HTTPS side is under-provisioned. It
then becomes a weird and tricky game of working out what security
guarantees you can reasonably make about the connection, and how to
communicate them to the user. Slowing down the sites users visit is very
much a "regression" - yet allowing insecure content through is equally
untenable from a security perspective.

Having dealt with deprecating crypto on the SSL and PKI side - including
MD5 and RSA keys < 1024 bits - I'm painfully aware that it's incredibly
tricky to balance the usability/performance concerns (eg: 20% of the top
10,000 sites break) against the security concerns, and I see the
mixed-content use case as only amplifying this confusion and hand-wringing
from UAs. I don't think that some of the tricks we use for SSL/TLS
"weirdness" (eg: internal server names getting a red URL bar, but no
interstitial) will work for sub-resource integrity, and I think foisting
more security decisions onto users would be a bad thing. That's why I have
trouble seeing how it would/could work to the benefit of users (and their
security).

Received on Tuesday, 14 January 2014 18:52:53 UTC