Re: Verified Javascript: Proposal from Daniel Huigens on 2017-04-26 (public-webappsec@w3.org from April 2017)

From: Daniel Huigens <d.huigens@gmail.com>
Date: Wed, 26 Apr 2017 19:22:54 +0200
To: Brad Hill <hillbrad@gmail.com>
Cc: Jeffrey Yasskin <jyasskin@google.com>, public-webappsec <public-webappsec@w3.org>
Message-ID: <CAL14OeESLNEGhEOyWLB9Hs6Q2mX=5H3Hbua1SjCkVLEUwQEs-Q@mail.gmail.com>

2017-04-26 5:14 GMT+02:00 Brad Hill <hillbrad@gmail.com>:
> What security guarantees are you trying to make with this proposal?

I'm trying to make it possible to verify that everyone is running the
same code, and to make it possible to sign up to be notified if that
code changes.
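
To make the notification part concrete, here is a minimal sketch in
Python of what a monitor could do with two successive log entries. It
assumes each entry has already been decoded into a mapping from path to
the set of allowed hashes; that mapping, the function name and the
placeholder hash strings are illustrative assumptions, not something
defined by the HCS draft.

```python
from typing import Dict, Set


def changed_paths(old: Dict[str, Set[str]], new: Dict[str, Set[str]]) -> Dict[str, str]:
    """Compare two path -> allowed-hashes manifests and report what changed."""
    report = {}
    for path in sorted(old.keys() | new.keys()):
        if path not in new:
            report[path] = "removed"
        elif path not in old:
            report[path] = "added"
        elif old[path] != new[path]:
            report[path] = "changed"
    return report


# Hypothetical manifests; the hash values are placeholders, not real digests.
previous = {"/": {"aaa"}, "/index.js": {"bbb"}}
latest = {"/": {"aaa"}, "/index.js": {"ccc"}, "/index.css": {"ddd"}}

print(changed_paths(previous, latest))
# {'/index.css': 'added', '/index.js': 'changed'}
```

A subscriber would only need to be notified when the report is
non-empty.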

> While there are infinitely many ways to make an application secure, there
> are also infinitely many ways to make it insecure in your proposal, and
> trivially violate the basic premises of transparency, namely that I can
> discover the entire codebase which is covered by the proof, in order to
> audit it, compare it against a trusted audit, and detect if I am being
> partitioned from other users.

I agree.

> Without origin isolation, it is actually impossible to prove that your
> application instance hasn't been tampered with by non-transparent code in
> the unit of related browsing contexts.

I'm still not sure what origin isolation you think I'm missing. Yes,
with HCS you should also put your web app on a dedicated (sub-)domain.

> By creating an isolated origin and mandating that all code loading paths are
> transitively covered by the transparency statement, these tasks are made
> substantially easier.

I agree, sort of. I think in my example of mega.nz it's still fairly
simple to check that all resources are signed (since there's one file
that manages that) relative to the fairly large task of reading and
auditing all the code of mega.nz.
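
As a rough illustration of why that one file is easy to audit, the
pattern boils down to "check the hash, then use the resource". The
sketch below is in Python rather than JavaScript and is not mega.nz's
actual code; load_verified and the manifest layout are hypothetical
names chosen for the example.

```python
import hashlib
import hmac
from typing import Dict


def load_verified(path: str, body: bytes, trusted_hashes: Dict[str, str]) -> bytes:
    """Return body only if its SHA-256 digest matches the trusted manifest."""
    expected = trusted_hashes.get(path)
    if expected is None:
        raise ValueError(f"{path} is not listed in the trusted manifest")
    actual = hashlib.sha256(body).hexdigest()
    # Constant-time comparison; not strictly required for public hashes.
    if not hmac.compare_digest(actual, expected):
        raise ValueError(f"{path} does not match its trusted hash")
    return body


# Hypothetical resource and manifest built for the example.
script = b"console.log('hello');"
manifest = {"/index.js": hashlib.sha256(script).hexdigest()}

verified = load_verified("/index.js", script, manifest)  # passes
```

An auditor then only has to confirm that every code path that loads a
resource goes through a function like this one.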

One difference with your proposal is that I'm not trying to make any
UI, because one of the advantages of actually having binary
transparency is not having to let individuals worry about whether the
app is still secure. I'd rather leave it to the web app to make
claims about its security, backed up by an audit. Which is why, in a
sense, restricting the code seems silly to me: the audit is supposed
to check the whole web app, including whether or not it has any escape
hatches from binary transparency (which are just security holes). In
most cases, that will be an easy part of an audit (for example,
checking that the web app sets an appropriate CSP).
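
For example, the CSP part of such an audit could be partly automated.
The Python sketch below is only an assumption about what "appropriate"
might mean (no inline script, no eval, no plugins); an app that
deliberately evals hash-verified code would need a different rule set.
The function names are made up for the example.

```python
from typing import Dict, List


def parse_csp(policy: str) -> Dict[str, List[str]]:
    """Split a serialized CSP into directive name -> list of source expressions."""
    directives: Dict[str, List[str]] = {}
    for part in policy.split(";"):
        tokens = part.split()
        if tokens:
            # Per CSP, the first occurrence of a directive wins.
            directives.setdefault(tokens[0].lower(), [t.lower() for t in tokens[1:]])
    return directives


def csp_escape_hatches(policy: str) -> List[str]:
    """List obvious ways the policy still allows unverified code to run."""
    directives = parse_csp(policy)
    problems = []
    script_src = directives.get("script-src", directives.get("default-src"))
    if script_src is None:
        problems.append("no script-src or default-src: scripts are unrestricted")
    else:
        for keyword in ("'unsafe-inline'", "'unsafe-eval'"):
            if keyword in script_src:
                problems.append(f"script source list allows {keyword}")
    object_src = directives.get("object-src", directives.get("default-src"))
    if object_src != ["'none'"]:
        problems.append("object-src is not 'none'")
    return problems


print(csp_escape_hatches("default-src 'self'; object-src 'none'"))
# []
```

This is a simplified check (it ignores nonces, hashes and
'strict-dynamic', for instance), so it complements a manual review
rather than replacing it.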

For 99% of users, the subtleties of "the web app has updated, but we
(the browser) haven't actually checked that it's still secure, so good
luck with doing that yourself" are useless and confusing, and do not
warrant making a dedicated UI. For the 1% that understand and
care, they can sign up for notifications from the public log. And keep
in mind, to provide security to 100% of users, notifying 1% is enough:
if they detect a security hole, they can notify the public through
other means, just like they would have to do with the dedicated UI in
place.

> All an audit must do is demonstrate confidence that
> there exists no weird machine or malicious interpretation of data inputs
> that produces undesirable behavior, such as a key disclosure.  This is a far
> simpler task.

I disagree. I hear crypto code is rather easy to get wrong, and rather
expensive to audit.

> In a subset of JS that doesn't include eval it might be
> accomplished by using static analysis to prove the properties of a finite
> state machine, such that each release of a reasonably structured application
> might have a proof of security automatically produced for it by a trusted
> third-party tool, and published in a manner that is discoverable via the
> transparency record.

Such a tool doesn't exist, and it would be a giant undertaking to
create one, if it's even possible. I think we shouldn't mix attempting
to prove that the code we're running is the same as the code that was
audited, and auditing the code. They're just very different, although
complementary, goals, and there already exists an industry for the
latter.

> (origin isolation also naturally covers workers / service workers, btw - in
> fact one could even assume a malicious service worker and prove its
> impotence in many applications)
>
> I also still fail to see what is actually accomplished by binding the
> transparency proof to the TLS certificate, except a very awkward
> piggybacking on existing CT logs.

Well. I think we need public logs. Of course, we can try to set up new
logs, but there will be only hundreds or at most thousands of
applications using it, compared to the millions of applications using
CT. Who's gonna run those logs? Who's gonna write tools for it?

More importantly, how do you authenticate to push to those logs? If
the criterion is "do you have access to the web app's server", then we
haven't increased security (since that's also the criterion for
updating the code today). So we need some kind of key pair. And also
domain validation, for the initial binding of the key to the domain.
And what if you lose your key? Etc. Suddenly this starts to sound
really complicated, and awfully similar to the system we already have,
which is perfectly fit for this purpose. Even EV certificates add
value here: with those, you can only update your web app if you can
legally prove to be the owner, and the process can be done in hours.
Now, for some web apps that might be unacceptably long, but for others
it might be the level of guarantee they want. I think it's pretty cool
that it would be possible, and we should make use of the good parts of
what we have, and not duplicate everything.

> Almost regardless of what improvements
> you assume in CA issuance tooling, adding a necessity of exact
> synchronization of certificate and code deployment,

You can include multiple hashes for one resource, so this is not the case.
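
Here is a minimal sketch of that idea in Python, assuming the
certificate carries a set of SHA-256 hashes per path (the set
representation and the function name are assumptions made for the
example, not the encoding from the draft). A certificate listing both
the old and the new version can be deployed first, and the code can be
switched over afterwards.

```python
import hashlib
from typing import Set


def body_is_allowed(body: bytes, allowed_hashes: Set[str]) -> bool:
    """Accept a response body if its SHA-256 digest is any of the listed hashes."""
    return hashlib.sha256(body).hexdigest() in allowed_hashes


# Hypothetical old and new versions of the same resource.
old_body = b"console.log('v1');"
new_body = b"console.log('v2');"

# The certificate lists both hashes during the rollout window.
allowed = {
    hashlib.sha256(old_body).hexdigest(),
    hashlib.sha256(new_body).hexdigest(),
}

print(body_is_allowed(old_body, allowed))      # True
print(body_is_allowed(new_body, allowed))      # True
print(body_is_allowed(b"tampered", allowed))   # False
```

Once the rollout is complete, the next certificate can drop the old
hash again.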

> and putting a CA on the
> critical path for all code updates,

If necessary, a web app could request a certificate (with or without
HCS) from another CA. Not being able to update easily if a CA is down
is a similar risk to e.g. Heroku being down, and a small price to pay
for binary transparency.

> are just non-starters in many
> environments.

HCS is not meant for "many environments". Just the ones that say
something about privacy or client-side encryption or client-side
processing on the tin. Keep in mind, most of those web applications
don't actually exist yet, because client-side encryption on the web
doesn't add any security value today. So those new web applications
can set up their workflow with HCS in mind.

> It is a layering violation with enormous costs at scale.

Like I said in another message, Let's Encrypt is issuing some 4
million certificates per week. I think CT logs and CA's are more than
capable of handling a few thousand per day extra.
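
As a back-of-the-envelope check in Python: the 4 million per week is
the figure quoted above, while 3,000 per day is just an assumed value
for "a few thousand".

```python
LETS_ENCRYPT_PER_WEEK = 4_000_000  # figure cited in this thread
EXTRA_PER_DAY = 3_000              # assumed value for "a few thousand per day"

baseline_per_day = LETS_ENCRYPT_PER_WEEK / 7
relative_increase = EXTRA_PER_DAY / baseline_per_day

print(f"baseline: about {baseline_per_day:,.0f} certificates per day")
print(f"extra load from HCS: about {relative_increase:.1%} of that")
```

Under those assumptions the extra load is roughly half a percent of
what Let's Encrypt alone already issues.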

-- Daniel Huigens

> -Brad
>
>
>
> On Tue, Apr 25, 2017 at 6:39 PM Daniel Huigens <d.huigens@gmail.com> wrote:
>>
>> Hi Brad,
>>
>> > It does not address how to isolate an instance of a web application from
>> > other unsigned code
>>
>> This is intentional. If the web app decides to run unsigned scripts
>> (by including them as a <script>, or eval()ing them - XSS is another
>> matter and out of scope for this proposal), it is up to the web app to
>> make sure that that's secure, for example (in the case of eval) by
>> manually checking the hash with some trusted source. mega.nz already
>> does exactly that - see [1], even though without HCS, it provides no
>> real added security, because there is no trusted source. However, with
>> HCS, the whole web application suddenly becomes secure.
>>
>> The way you phrased it also makes it sound like you're worried about
>> some kind of external code (in another tab?) coming in and interfering
>> with the web app. Is that true? Subdomains are already isolated
>> origins, so if you host the "trusted" web app on a subdomain, you're
>> good to go, AFAICT.
>>
>> > A mandatory CSP policy is applied that prevents eval and inline script
>> > and sets an object and embed src of 'none'
>>
>> It sounds like you're trying to protect the user against the source
>> code. However, that's impossible. For example, you cite E2E messaging.
>> The #1 threat in E2E messaging is leaking the user's key. You
>> absolutely *have* to trust the source code not to do that, there's no
>> way around it. The only thing we can do is verify that the source code
>> is the same version everyone else is running, and (with some extra
>> work) the same version as is hosted on GitHub (which is what this
>> proposal is trying to do), so that you can find out *whether* you
>> trust the source code. In other words, the only thing binary
>> transparency is trying to do is trust the *server* less, not trust the
>> source code less.
>>
>> So, if you're going to trust the source code anyway, you might as well
>> leave it up to the source code to set a suitable CSP (in a meta tag).
>>
>> > all further HTML, script and CSS loaded must be statically SRI tagged
>>
>> Yes. I can see you're trying to answer the necessary second question
>> after binary transparency, namely: "how do I make sure that the whole
>> web app is included in the binary transparency". However, there are
>> many different ways to do that, and, given that we trust the source
>> code anyway, we might as well leave it up to the web app. I've already
>> cited one other way that one web app does it (eval after verifying
>> hashes), here's another: make one portion of a web app signed and
>> trusted, and run another portion (the front-facing UI, for example) in
>> a sandboxed iframe and communicate with postMessage. Here's another:
>> store frontend code in localStorage, and give users a manual update
>> mechanism. The point is, there are infinitely many ways to make a web
>> app like that secure, and while forcing everything in SRI is certainly
>> an easy way to verify that it is so (although - how are you going to
>> verify the integrity of Service Workers and Web Workers?), it's
>> probably a very laborious way to actually accomplish it, and likely to
>> discourage developers from trying. Easier is to say "make a list of
>> all resources in your web app, and run this script to get their hashes
>> and update your certificate". Then, if you're worried about developers
>> forgetting a resource, we could show a red warning in the console, but
>> it should not be forced upon them.
>>
>>
>> > putting the contents of resources into a very static spot like the HTTPS
>> > certificate doesn't scale and doesn't allow the appropriate agility
>> > necessary for security
>>
>> An HTTPS certificate is a very static resource now, but there's no
>> particular reason why it must be so. In fact, very short-lived
>> certificates have some advantages over long-lived ones, because they
>> obviate the need for revocation lists. Let's Encrypt is automating the
>> process of requesting a certificate. Gandi has a certificate API. I
>> don't see a reason why a non-EV certificate should take any
>> significant amount of time to update. certsimple.com is even issuing EV
>> certificates in three hours on average. Now, I admit that for most
>> CA's, the tooling is not there yet, but it's moving in the right
>> direction, and requesting a new certificate manually should also not
>> take that much time if you do it regularly. Then you also need an API
>> and tooling to upload your certificate to the server or reverse-proxy
>> CDN. AWS, CloudFlare, Heroku and KeyCDN have such an API. Now, again,
>> in many cases the tooling is not there yet, but it's possible and
>> people are working on those API's regardless of what I'm doing.
>>
>> > Further, requiring transparency proofs in the certificate is a nearly
>> > impossible model to develop and test under.
>>
>> While developing something unrelated to HCS, you can just use
>> localhost, or a test certificate without HCS on a test domain, or no
>> certificate at all (with --unsafely-treat-insecure-origin-as-secure in
>> Chrome if you need it), as usual.
>>
>> [1]: https://github.com/meganz/webclient#secure-boot
>>
>> -- Daniel Huigens
>>
>> P.S. I have applied for I.E.
>>
>>
>> 2017-04-25 23:34 GMT+02:00 Brad Hill <hillbrad@gmail.com>:
>> > Daniel,
>> >
>> > I would also like to ask you to please apply for a W3C account
>> > (https://www.w3.org/accounts/request) and apply to be an Invited Expert
>> > in the group.  Process here:
>> > https://www.w3.org/Consortium/Legal/2007/06-invited-expert
>> >
>> > We can't adopt any ideas you propose, and really shouldn't be discussing
>> > them as possible standards, without a contributor IPR agreement from
>> > you.
>> >
>> > thanks,
>> >
>> > Brad Hill (as chair)
>> >
>> > On Tue, Apr 25, 2017 at 2:31 PM Brad Hill <hillbrad@gmail.com> wrote:
>> >>
>> >> I must say, I don't think the threat and deployment model of this is
>> >> very well thought out, with regards to how real web applications need
>> >> to work. It does not address how to isolate an instance of a web
>> >> application from other unsigned code, and putting the contents of
>> >> resources into a very static spot like the HTTPS certificate doesn't
>> >> scale and doesn't allow the appropriate agility necessary for security.
>> >> Further, requiring transparency proofs in the certificate is a nearly
>> >> impossible model to develop and test under.
>> >>
>> >> I've floated some strawman proposals around this idea previously, based
>> >> on Facebook's desire to build E2E encryption into a web version of
>> >> Messenger, similar to what we do with apps, where a primary goal is to
>> >> have transparency and proof of non-partition for the main application
>> >> code (if not all resources).
>> >>
>> >> My very rough proposal for provably transparent and non-partitioned
>> >> apps that still work like webapps and don't have huge holes according
>> >> to the web platform security model is roughly the following:
>> >>
>> >> Utilize suborigins as an isolation mechanism.
>> >>
>> >> Define a special suborigin label prefix for which a resource must
>> >> meet certain conditions to enter, and accept certain conditions upon
>> >> entering.
>> >>
>> >> To enter a labeled suborigin:  The suborigin is identified by a public
>> >> key as a special label prefix.  The primary HTML resource must supply a
>> >> signature over its body and relevant headers using that key.
>> >>
>> >> Upon entering that labeled suborigin: the effective suborigin becomes a
>> >> hash of the public key plus the hash of the bootstrap HTML resource, so
>> >> that it is not same-origin with anything else.
>> >>
>> >> A mandatory CSP policy is applied that prevents eval and inline script
>> >> and sets an object and embed src of 'none', and, upon entering that
>> >> labeled suborigin, all further HTML, script and CSS loaded must be
>> >> statically SRI tagged, recursively, such that the bootstrap resource
>> >> hash uniquely identifies the entire tree of reachable executable
>> >> content. This can be the basis of a binary transparency proof.
>> >>
>> >> In order to maintain continuity and the ability to upgrade the
>> >> application, certain pieces of state, namely local storage / indexed db
>> >> / cookies may be shared among all applications signed with the same
>> >> key, so that getting the latest version of the app that fixes a bug or
>> >> adds a feature in an E2E messaging product doesn't mean you lose your
>> >> identity and all previous messages.
>> >>
>> >> When encountering a new bootstrap HTML hash, the user would be given
>> >> the option to choose whether to trust and execute it if previous state
>> >> exists for that signing key.  User experience TBD, but this is the
>> >> point at which a transparency proof and gossip about partitioning could
>> >> be checked, if desired.
>> >>
>> >> -Brad
>> >>
>> >>
>> >>
>> >> On Tue, Apr 25, 2017 at 9:24 AM Daniel Huigens <d.huigens@gmail.com>
>> >> wrote:
>> >>>
>> >>> Hi Jeffrey,
>> >>>
>> >>> We're not trying to put the contents of web applications in the log.
>> >>> We're trying to put *hashes* of the contents of web applications in
>> >>> the log. Those are much smaller.
>> >>>
>> >>> Also, keep in mind that web applications themselves are also
>> >>> incentivized to keep certificates small, since large certificates mean
>> >>> longer load times. So if they have a web application with a thousand
>> >>> files, they might opt to use HCS for just 1 of them (the root html
>> >>> file) and SRI for everything else.
>> >>>
>> >>> Finally, here's a summary of all logged certificates last week [1].
>> >>> Let's Encrypt alone has issued over 4 million certificates this week.
>> >>> Even if a few hundred web applications start requesting a certificate
>> >>> every hour because of HCS (which Let's Encrypt does not allow, but
>> >>> some CA's do), that's a drop in the bucket.
>> >>>
>> >>> -- Daniel Huigens
>> >>>
>> >>> [1]: https://crt.sh/?cablint=1+week
>> >>>
>> >>> On 25 Apr 2017 16:53, "Jeffrey Yasskin" <jyasskin@google.com> wrote:
>> >>>
>> >>> The goal of binary transparency for web applications makes sense, but
>> >>> implementing it on top of the Certificate Transparency logs seems like
>> >>> it introduces too many problems to be workable.
>> >>>
>> >>> Have you looked into a dedicated transparency log for applications,
>> >>> using the system in https://github.com/google/trillian#readme? Then
>> >>> we'd need to establish that only files logged to a particular set of
>> >>> log servers could be loaded. A certificate extension might be the
>> >>> right way to do that, since the certificate would only need to be
>> >>> re-issued in order to add log servers, not to change the contents of
>> >>> the site.
>> >>>
>> >>> Putting every Javascript resource from a large application into the
>> >>> log also might introduce too much overhead. We're working on a
>> >>> packaging format at https://github.com/dimich-g/webpackage/, which
>> >>> could reduce the number of files that need to be logged by a couple
>> >>> orders of magnitude.
>> >>>
>> >>> Jeffrey
>> >>>
>> >>>
>> >>> On Mon, Apr 24, 2017 at 3:25 AM, Daniel Huigens <d.huigens@gmail.com>
>> >>> wrote:
>> >>>>
>> >>>> Hi webappsec,
>> >>>>
>> >>>> A long while ago, there was some talk on public-webappsec and
>> >>>> public-web-security about verified javascript [2]. Basically, the
>> >>>> idea was to have a Certificate Transparency-like mechanism for
>> >>>> javascript code, to verify that everyone is running the same and
>> >>>> intended code, and to give the public a mechanism to monitor the
>> >>>> code that a web app is sending out.
>> >>>>
>> >>>> We (Airborn OS) had the same idea a while ago, and thought it would
>> >>>> be a good idea to piggy-back on CertTrans. Mozilla has recently also
>> >>>> done that for their Firefox builds, by generating a certificate for
>> >>>> a domain name with a hash in it [3]. For the web, where there
>> >>>> already is a certificate, it seems more straight-forward to include
>> >>>> a certificate extension with the needed hashes in the certificate.
>> >>>> Of course, we would need some cooperation of a Certificate Authority
>> >>>> for that (in some cases, that cooperation might be as simple,
>> >>>> technically speaking, as adding an extension ID to a whitelist, but
>> >>>> not always).
>> >>>>
>> >>>> So, I wrote a draft specification to include hashes of expected
>> >>>> response bodies to requests to specific paths in the certificate
>> >>>> (e.g. /, /index.js, /index.css), and a Firefox XUL extension to
>> >>>> support checking the hashes (and we also included some hardcoded
>> >>>> hashes to get us started). However, as you probably know, XUL
>> >>>> extensions are now being phased out, so I would like to finally get
>> >>>> something like this into a spec, and then start convincing browsers,
>> >>>> CA's, and web apps to support it. However, I'm not really sure what
>> >>>> the process for creating a specification is, and I'm also not
>> >>>> experienced at writing specs.
>> >>>>
>> >>>> Anyway, please have a look at the first draft [1]. There's also some
>> >>>> more information there about what/why/how. All feedback welcome. The
>> >>>> working name is "HTTPS Content Signing", but it may make more sense
>> >>>> to name it something analogous to Subresource Integrity... HTTPS
>> >>>> Resource Integrity? Although that could also cause confusion.
>> >>>>
>> >>>> -- Daniel Huigens
>> >>>>
>> >>>>
>> >>>> [1]: https://github.com/twiss/hcs
>> >>>> [2]: https://lists.w3.org/Archives/Public/public-web-security/2014Sep/0006.html
>> >>>> [3]: https://wiki.mozilla.org/Security/Binary_Transparency
>> >>>>
>> >>>
>> >
Received on Wednesday, 26 April 2017 17:23:51 UTC