Re: User Control was: SOP - was: Agenda: <keygen> being destroyed when we need it from Alex Russell on 2015-09-14 (www-tag@w3.org from September 2015)

From: Alex Russell <slightlyoff@google.com>
Date: Mon, 14 Sep 2015 02:07:16 -0700
To: Henry Story <henry.story@co-operating.systems>
Cc: Wendy Selzer <wseltzer@w3.org>, "www-tag@w3.org List" <www-tag@w3.org>, Carvalho Melvin <melvincarvalho@gmail.com>, Tim Berners-Lee <timbl@w3.org>
Message-ID: <CANr5HFWBgjY=W8fevKVuF8BCQiV7a65ztq3TKCpKGTB9+Q29jw@mail.gmail.com>
Going to try to be brief. Responses inline.

On Sun, Sep 13, 2015 at 4:35 AM, Henry Story <henry.story@co-operating.systems> wrote:

>
> > On 12 Sep 2015, at 18:54, Alex Russell <slightlyoff@google.com> wrote:
> >
> > On 12 Sep 2015 3:19 am, "Henry Story"  wrote:
> > >
> > > Thanks Alex for your very good summary.  I don't want to go over the
> > > good points made by Melvin and Anders, but to highlight this part
> > > of your mail:
> > >
> > >> On 12 Sep 2015, at 02:20, Alex Russell <slightlyoff@google.com> wrote:
> > >> [snip]
> > >> Developers who want to persist keys to the local system should
> > >> acknowledge that this is an operation that lives outside the
> > >> Same Origin Model. The  inability to scope the use of keys added
> > >> via <keygen> (via addition to the effective keychain) creates a major
> > >> hole in our one workable security primitive. It's true that this
> > >> isn't part of the <keygen> spec, but compatibility requirements
> > >> have caused this  to be true in practice. From an architectural
> > >> perspective, this alone should be enough to cause the TAG to
> > >> recommend removal of <keygen> and replacement with a better,
> > >> origin-scoped alternative.
> > >> [snip]
> > >
> > > I think this is the core of the discussion. It is this blind
> > > application of SOP here that I and others wish to question.
> >
> > I'm glad we're discussing this.
> >
> > You might know that the TAG recently wrote a Finding on
> > Unsanctioned Tracking:
> > http://www.w3.org/2001/tag/doc/unsanctioned-tracking/
>
> Thanks for the pointer to this finding. I wish I had referred to it before.
> It may be a place to summarise the findings from this discussion too.
>
> > It's relevant here because storage and identification mechanisms that
> > let sites persist data outside of effective user control -- and in
> > particular those that work across origins -- are ripe for use as
> > "supercookies".
> >
> > Consistent application of the SOP is what enables positive user control.
> > User action at key installation time can similarly be thought of as
> > consent. The status quo is suspect on both points.
>
> I note that the point of this paper is the distinction between sanctioned
> and unsanctioned tracking. So the important legal and ethical concept is
> not the  technical one of "Same Origin" but the notion  of
> sanctioned/unsanctioned tracking: that is tracking sanctioned by the
> user who is meant to be in control. This is clear from the title of part 2
>
>       "Unsanctioned Tracking: Tracking without User Control"
>
> The Same Origin Policy (SOP) is a technical means of avoiding information
> leakage further than the parties involved: the browser, the user agent
> and the web agent. But it is not the same as user control as the following
> two examples should make clear:
>
> SOP without user Control:
>
> • the EU laws on cookie setting mentioned by the paper would not
> make that much sense if SOP were identical to User Control.
> The user does not usually know that cookies are being set, and it
> is usually not that easy to unset them. ( Note, that I am not
> defending those laws)
>
> • CORS ( http://www.w3.org/TR/cors/ ) - another core application
>  of SOP - is not about user control, but about server control. Here the
> server is in control as to what information can be shared with user agents
> from given origins, when running in a web page by setting some headers
> on the published content.
>
> User Control without SOP:
>
> • In the case of client certificates, which we are discussing, the user
> is in control of the certificate to select (if any) when visiting a web
> site asking for authentication. A certificate selection box appears in
> the browser asking the user which one to select.
> ( We collected a  small  sample of these, and put them up
> https://www.w3.org/wiki/Foaf%2Bssl/Clients/CertSelection ).
>   The same is true of accepting certificates for installation in the
> keychain.
>   This is not to say that current browsers could not improve the UI putting
> the user in control, but they clearly have been applying this principle
> of user control when dealing with certificates that are then applied cross
> origin.
>   So this is a case of a non SOP feature enabling user control.
> • Hyperlinks in a web page giving the user control of which links to
>   follow, even links on the same origin.
>
> So given that one has a keychain which never signs anything without
> first asking the user, making it evident what this will be used for,
> then one cannot speak of "super cookie", which the unsanctioned tracking
> finding referred to above defines as
>
> > So-called SuperCookies use implementation bugs, browser fingerprinting
> > and other techniques to continue to identify you and correlate your
> > activity even after you clear your cookies
>
> If one switches Persona in Google Chrome to anonymous mode, that browser
> window no longer sends out the certificate, and should open a new TLS
> connection if needed. How would having used a client side certificate to
> log on, be any different than having used OpenID or OAuth, or just a
> username and password, or even Basic Auth?
>
>
> > > It is quite clear to me that if this principle were not thought to be
> > > untouchable then people would long ago have found an answer to all the
> > > other problems you and others have mentioned. The Browser vendor
> > > Engineers would have
> > >  - found a way to improve or replace spkac,
> > >  - a debate about how to extend the <keygen> tag so that it could
> > > enable a better UI, and many other features required would have led
> > > to fruitful results
> > >  - taken the opportunity to work with the IETF on better certificate
> > > formats such as JOSE or supported work on non syntactic bound
> > > certificate formats by reading up on research done on the semantic
> > > web side of things
> > >  - people would have even found ways of taking a leaf from FIDO and
> > > find a language to limit certificate usage to certain range of
> > > applications
> > >  - there would have been enthusiastic support for improving the user
> > > interface of browsers to integrate WebID and make the experience
> > > extremely user friendly and social network aware
> > >  - ....
> > >
> > > But of course if you believe that a certificate should only be used for
> > > the origin from which you got it, then it makes no sense to continue
> > > with <keygen> which allows you to generate a certificate (X509 at
> > > present) whose whole purpose is to safely allow you to use it cross
> > > origin ( which is why it is used for server authentication in TLS ).
> > >    So client-certificates-usable-across-origin is really what people
> > > think SOP argues against. And so from that perspective the anti SOP,
> > > anti-linkability commitments of FIDO [1] make perfect sense.
> > >
> > > But SOP is not a foundational principle of the web, which is primarily
> > > about linkability historically and conceptually. SOP is essentially a
> > > JavaScript (JS) limitation introduced because JS introduced agenthood
> > > into a declarative web.
> > > In addition to the agenthood of the User Agent and the User, JS
> > > introduced the agenthood of JS fetched from the web. This follows from
> > > JS being a procedural/functional language that can act in the browser
> > > environment by clicking links, downloading information, POSTing
> > > forms, ... SOP is one way of identifying JS agency, and then limiting it.
> >
> > This is less than half the story. All manner of buggery is possible via
> > purely declarative forms. The browser enforces SOP in those cases as well.
> > It may have seemed easier to reason about interactions between actors with
> > only declarative systems in play, but we continue to be astonished at some
> > of the things forms + images + CSS + iframes can do.
> >
> > JS made it clearer, faster, but browsers would still be separating
> > actors with SOP regardless.
> >
> > But that's all indulgent thinking. JavaScript is a core part of the web
> > stack today. We live in a world where it exists. We cannot pretend it
> > doesn't.
>
> Of course. I program in JS ( well actually my Scala code compiles to JS,
> http://www.scala-js.org/ ) and am a heavy  user of many of its advanced
> features.
>
> What I am trying to do is show that SOP is not blindly applicable to
> this problem. The underlying applicability criteria for using SOP
> have to be made clear.
> SOP is a way of identifying the parties in the communication, which
> consists of
>
>   - Web Server(s) ie, origins
>   - User Agent
>   - User
>   - JS Agents coming from different Web Servers
>

This description is powerfully confused.

Origins are actors in the system ("Web Server(s)" in your list). User
agents create and enforce the origin model on behalf of Users. Neither of
them is an actor inside the security model.

"JS Agents" aren't a thing. Cross-origin content often acts within the
capabilities and heap of an origin via <script src="...">, at which point
it is indistinguishable from first-party content. This content does not
have any independent identity or security model associated with it.
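
To make "origin" concrete: a minimal sketch, assuming the usual (scheme, host, port) definition of an origin. The `origin` and `same_origin` helpers and the example URLs are illustrative only, not any browser's API or implementation.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Default ports for the schemes this sketch cares about (an assumption;
# real user agents know about more schemes than these two).
DEFAULT_PORTS = {"http": 80, "https": 443}

Origin = Tuple[str, str, Optional[int]]


def origin(url: str) -> Origin:
    """Reduce a URL to its (scheme, host, port) origin tuple."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fall back to the scheme's default port when none is written out.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a: str, b: str) -> bool:
    """True when both URLs belong to the same actor in the origin model."""
    return origin(a) == origin(b)
```

Under this sketch https://example.com/a and https://example.com:443/b compare equal, while https://alice.example.com/ and https://bob.example.com/ do not. Note that the URL a script was fetched from never enters the comparison: content pulled in via <script src="..."> runs as the including document's origin.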


> SOP is a way of limiting the leakage of information coming from
> web servers, more than it is about putting the user in control.


This is incorrect. SOP is a policy that allows users to have reasonable
assurance that when they interact with example.com, they are really
talking to example.com and not some other actor.


> Rather the user ends up being in control only of what origins she
> communicates with, and so of information leakage knowing that the browser
> will limit information leakage between those origins ( as far as it can
> go, since one can hide identifying information in links ).
>

SOP doesn't primarily target information leakage; it targets capability
leakage, which can lead to subversion, which in turn can lead to
information leakage (among other things).


> On a particular connection to a particular origin the user is in control
> of a number of things:
>  - what links he clicks and when
>  - how to authenticate if at all, be it
>      + by selecting a particular username/password
>      + by selecting a particular OpenID
>      + by selecting a particular client certificate
>
> >
> > > Note: It is not a particularly good way of identifying JS agency, as
> > > it is way too broad. Signed JS attributing it to an author or
> > > organisation would be a lot better. ( requiring therefore a
> > > certificate ).
> >
> > That misreads the JS capability model in a very concerning way. Signing
> > might let you know who sent you the code, but once you execute code from
> > multiple actors inside the same heap, and in such a dynamic language, they
> > have unfettered access to any capability the container will grant any
> > participant. This is why SOP is important at runtime: unlike signing, it
> > gives us a workable actor partition (via Workers and iframes) that maps
> > cleanly onto the web's process & privilege model.
>
> We need more than just where the code came from. If we want to avoid
> pointing to code on other servers, and so leaking information about which
> applications we use, we need to be able to copy code to our servers without
> risking the code written by the least ethical company undermining that of
> the most ethical one. Currently there is no way to distinguish code written
> by two different agents that I place on my server, other than creating one
> domain name per code downloaded, which is a bit of a blunt tool for making
> such distinctions.  As a result a lot of web sites such as github have
> removed all ability to use JS at all.
>

It's hard to be kind to this paragraph.

Deciding what server you wish to fetch code from is an origin-by-origin
decision. If you wish to avoid it, you can. I don't see what that has to do
with <keygen>.

Github absolutely permits JavaScript out on gh-pages which, you'll note,
creates a separate origin per bit of content. Relying on SOP lets github
safely host untrustworthy, mutually suspicious content. Because
alternative models for isolation are much harder to implement and tend to
work against the grain of the runtime, it's reasonable for github to
disallow 3rd-party JS on github.com.


> There has also been research about how to have code from different origins
> interact intelligently in much more subtle ways, without losing security
> features, as described by the "COWL: A Confinement System for the Web"
> international research project in which Google and Mozilla were involved
> ( see http://cowl.ws/ )
>

I'm familiar with this work. I'd like to see better confinement primitives
inside the browser, but we should note that COWL is strictly finer-grained
than SOP and bootstraps on top of it. It effectively creates dynamic
sub-origins via labels. That's doubling-down on origins, not throwing them
away.
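
A toy model of that idea, to show why labels refine origins rather than replace them. This is a sketch under loud assumptions: it is not COWL's API; `Context`, `read` and `can_send_to` are made-up names, and a label is simplified to a plain set of the origins whose data a context has seen.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass
class Context:
    """A hypothetical execution context: an origin plus a taint label."""
    origin: str
    label: FrozenSet[str] = frozenset()

    def read(self, data_label: FrozenSet[str]) -> None:
        # Reading labeled data raises the context's label (taint tracking).
        self.label = self.label | data_label

    def can_send_to(self, dest_origin: str) -> bool:
        # Sending is allowed only if the destination is the sole origin
        # with a stake in everything this context has read so far.
        return self.label <= frozenset({dest_origin})


ctx = Context(origin="https://mashup.example")
before_read = ctx.can_send_to("https://evil.example")  # nothing sensitive read yet
ctx.read(frozenset({"https://bank.example"}))
to_bank = ctx.can_send_to("https://bank.example")
to_evil = ctx.can_send_to("https://evil.example")
```

In this toy, the context can talk to anyone until it reads bank.example's data; after that it can only send to bank.example. The labels themselves are built out of origins, which is the point: the finer-grained policy still needs SOP's notion of who the actors are.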


> > > SOP applies to cookies. But the user is never asked by the chrome
> > > whether or not a cookie should be set. Cookies are SOP by design.
> > >
> > > So Why is SOP important for certificate usage? It is clear that any
> > > privacy enabled browser should
> > >
> > > a. on being asked for a certificate by a web site first ask the user
> > > which certificate to choose
> >
> > At TLS connection time? Modern sites make tens and sometimes hundreds of
> > distinct connections per page.
>
> No of course not. That would be horrible and not leave the user
> in control. Or rather the user would have no way of knowing what
> site he was connecting to before authenticating. That is what TLS
> renegotiation is for.
>
> If we limit ourselves to TLS 1.2 and below first, this is a common
> misunderstanding of the capabilities of TLS, and also explains
> why client certificate authentication has not been as widely used
> as it could have been.
>
> Good site design requires that: The server should first present itself
> to the user in a user friendly manner ( public front page ) served on a
> secure TLS connection, and then require authentication, but only for
> resources that are not public.
>
> ( This is different if the user is a robot, as the robot may be able to
> make decisions about the site directly from the TLS server certificate
> in a way that a human being is unlikely to be able to. )
>
> Still, TLS renegotiation, as pointed out previously, has issues
> for HTTP/2 ( aka SPDY ) which are being looked at by the HTTP Working
> group in combination with improvements coming from TLS 1.3. See the
> thread "Client certificates in HTTP/2"
>
>  • starting:
>    https://lists.w3.org/Archives/Public/ietf-http-wg/2015AprJun/0558.html
>  • most recent:
>    https://lists.w3.org/Archives/Public/ietf-http-wg/2015JulSep/0310.html
>
> Being able to move public key cryptography authentication to the HTTP
> layer using key material generated from keygen stored in the keychain
> would be a major improvement over the current situation making it much
> easier for any server to deploy client certificate authentication. This
> seems eminently feasible, and Tim Berners-Lee's team at MIT have already
> experimented in this area.
>
> >
> > > b. show the user which certificate he is actually using during a
> > > session at a site
> >
> > With which actor? The primary document? An iframe?
>
> First and foremost to the site that is indicated in the URL bar, the
> prime origin.
>
> After that things are still open to User Interface Research.
>
> Contrary to appearances FIDO does not solve the problem either because
> 1) if it asks the user per origin the problem is the same and even
>   worse as the user may have to swipe his fingerprint for each origin.
> 2) if it automatically creates a public/private key per origin, then
>   all we have here are just cryptographic strength cookies
>
> The same problem would exist if each site in each iFrame asked for
> authentication using Basic Authentication, so this is not a problem
> limited to client certificates.
>
> It is basically a problem of what kind of policy one wishes to use when
> following links in cross origin application. Does one consider one's
> interaction across the web as one of a single identity, or as an identity
> per site?
>
> This problem comes to the fore in Linked Data UI research such as the
> one Tim Berners Lee is experimenting at MIT's Distributed Information
> Group (DIG), and as I have been also working on for the past years.
> If one wishes to have JS in the client follow linked data across
> origins then one is very quickly going to come across this problem, since
> it is quite likely that certain resources across the linked data web
> are in fact protected.
>

Won't it also be the case that you'll hit all manner of other sorts of
login/authentication systems that would present similar issues?


> The lack of support currently requires such applications IMHO to move
> this decision  to the server, which can then authenticate for the user
> to the various web sites following the required chosen authentication
> policy. If the user is to be in control the UI built up in JS in the client
> from this will have to be clear and understandable. But whatever happens,
> this problem needs to be considered much more closely than it yet has.
>
> If such a flexible policy were made available to JS in the browser, then
> the browser would be able to take over this feature from the server.
>
> >
> > > c. enable a mode where the user is not using that certificate ( Chrome
> > > has the Persona UI for example ) [2]
> >
> > This is key scoping. SOP is just another form of scoping.
>
> But it's not the only one :-)
>
>
> >
> > What I'd like to understand from you is why:
> >
> > - removing <keygen> as currently shipped hurts anything
>
> It removes the ability from the browser to create client certificates
> cheaply. Without that client certificate authentication requires
> installation of certificates by hand, which is difficult,


This seems to be the meat of it. You want certificate mimetype handling
from the browser to make installation smoother.


> or use of external hardware devices, which is badly supported. I can
> imagine though improving this with integration with features that are
> coming from the FIDO alliance, such as hardware key cryptography storage
> mechanisms, which would make the private key completely unreachable other
> than via a precise API. Note that in this case again the user is put in
> control, but via hardware integration, and via a number of device specific
> user interfaces such as fingerprint swiping, etc...
>
> Note that WebID authentication allows any web site to produce
> useable cross origin certificates at 0$ cost ( http://webid.info/ )
>
>
> > - why the Web Crypto solution isn't strictly better
>
> The Web Crypto solution has the following limitations
>
>  1) it can only store the key in the web local storage, meaning
>  that the private key is available to all JS from that origin if
>  the extractable=true attribute is set.
>

So why is that a problem?


>  Even if the extractable=false attribute is set, this does not help
>  putting the user in control: since there is no chrome for him
>  to specify this, it is for all intents and purposes something the
>  user has no knowledge of.
>
>  Compared to this the <keygen> solution places the key in the
>  keystore, and ties the certificate to the private key after asking
>  the user. The private key cannot be accessed by any application
>  other than the keychain. The private key is safe.  This allows the
>  browser to have an identity distinct from the origin. Otherwise there
>  is no way really for anyone  to know if  the origin signed something
>  or the browser.
>

What's the difference? The browser mediates all work for origins. It is, in
essence, the arbiter of the policy already...it seems odd you think it
either isn't or shouldn't be.


>  2) the key stored in local storage cannot be used across origin.
>
>  I'll note here that Jeffrey Yasskin responded to this point in the
> blink-dev thread
> https://groups.google.com/a/chromium.org/d/msg/blink-dev/pX5NbX0Xack/FSW2mol3BgAJ
> by writing
>
>  > If someone runs an identity provider, perhaps with a Service Worker
>  > to work offline, relying parties can iframe the identity provider,
>  > and the identity provider can store a key in WebCrypto and prove its
>  > presence to the relying parties. With fallback request interception
>  > (https://github.com/slightlyoff/ServiceWorker/issues/684), the relying
>  > parties can also ping the identity provider from their service workers.
>
> But this is still on the drawing board, and it is really not clear yet if
> this does actually give us the needed capability. In any case unless we
> can agree that cross origin authentication is a reasonable thing to do, I
> don't see how this feature could make it to a final release, since it would
> be blocked by exactly the same arguments we are having here. Finally it
> still would not address the following point 3)
>
>  3) it does not put the user in control - as there is no tie in
>    between the Web Crypto solution and the Chrome .
>   It requires all the UI to be built by the Origin.
>
>  In short with this feature the browser loses all hold on identity and
> ends up relegating all of it to the server. This seems to me to be a
> serious loss for browsers.
>
> > - what you imagine your ideal key provision solution to look like.
> >
> > These can be stylized versions, but need to include detail sufficient to
> > let us discriminate, e.g. main document from iframe.
>
> The actual keygen solution seems to be a good starting point, though there
> are clearly improvements that can be made as suggested by Microsoft, and
> for which Tim Berners Lee pointed out a number of possible solutions
> https://lists.w3.org/Archives/Public/www-tag/2015Sep/0034.html
>
> Potentially this could be complemented with JS APIs. But I do think that
> the declarative nature of keygen has some very good things going for it.
>
> >
> > > At all these stages the chrome is giving the user control of decisions.
> > > There is no JS agency that can take this over. Exactly for this reason
> > > SOP does not apply, and it is exactly for this reason that chrome
> > > integration of identity is so important.
> > >
> > > This is my analysis. Where am I wrong about the non-application of SOP?
> > > What Web Architectural Principles do you rely on to justify the
> > > application to this case of certificate generation and usage?
> > >
> > > Sincerely,
> > >
> > > Henry Story
> > >
> > >
> > > [1] see my previous mail "(un)linkability - Re: Agenda: <keygen> being
> > > destroyed when we need it" for references
> > >    https://lists.w3.org/Archives/Public/www-tag/2015Sep/0023.html
> > > [2] I realise now that logout does not actually make sense because
> > > once a user has authenticated a cookie can be set to track him, or
> > > information kept in URLs. This should be explained somewhere.
> > >
> > >
>
>
Received on Monday, 14 September 2015 09:08:18 UTC