Re: A Somewhat Critical View of SOP (Same Origin Policy) from Henry Story on 2015-09-16 (public-web-security@w3.org from September 2015)

From: Henry Story <henry.story@co-operating.systems>
Date: Wed, 16 Sep 2015 16:59:22 +0100
To: Brad Hill <hillbrad@gmail.com>
Cc: Tony Arcieri <bascule@gmail.com>, Rigo Wenning <rigo@w3.org>, "public-web-security@w3.org" <public-web-security@w3.org>, Mike O'Neill <michael.oneill@baycloud.com>, Anders Rundgren <anders.rundgren.net@gmail.com>, public-webappsec@w3.org
Message-Id: <B0043CE6-31B5-4A61-94A0-E291D54DDBFF@co-operating.systems>
> On 15 Sep 2015, at 23:42, Brad Hill <hillbrad@gmail.com> wrote:
> 
> FIDO is not "like a cookie".  Cookies are about session and state management.  FIDO replaces passwords or certificates to provide strong authentication, and it does so in a way that is consistent with the SOP architecture of the web, so that users can be in control of how and to whom they authenticate without having to make difficult, confusing and consequential decisions about their security and privacy.  It just works, and it's safe.

You are right I did not specify the dimensions of (dis)similarity. As Blake wrote 

    To see a world in a grain of sand and a heaven in a wild flower,
    Hold infinity in the palm of your hand and eternity in an hour.

Let me do a careful analysis of a number of technologies, to show how they fare
on a SOP/User Control matrix and the relation to linkability. This will also 
show the similarities and  dissimilarities between them, and should help us
advance our understanding. It's a bit long, but I hope it can be used as a basis
for a detailed report on the subject.

Cookies
=======

Cookies respect SOP by design, and help identify a session (which can last for up
to a year). Identifying a session is an indirect way to identify a particular
browser used with a particular persona, to a particular origin. At the end of this
the server has a little data structure which could be represented in Turtle:

 [] a foaf:Agent;
    controls [ a UserAgent, Chrome46.0.2490.22;
       session [ 
            origin <https://w3.org/>; 
            cookie "as342342sdaa" 
        ]
     ] .

The Agent is only indirectly and weakly identified here via the session.

A. SOP/User Control matrix for cookies

Here the Origin is in control of setting the cookie, which may be invisible
to the user. (By default User Control is limited to deletion of cookies.)
Some countries have imposed legal requirements to make sure the web site
makes the user aware of the setting of cookies, and gives him a choice
about accepting them. Such requirements would not make sense if there were
no difference between SOP and User Control.

B. Linkability with Cookies

The agent is unnamed, the browser is weakly/spoofably typed, but 
the session gives a probabilistically realistic way of identifying 
the  agent over time.  

The cookie with the origin together identify the session: they constitute
an OWL key [2]. In other words the same cookie on another origin is
simply a different session. But the two together form an identifier,
and can be mapped to a URI. It's just that at this point there is no way to
identify an agent on one site with that on another site.
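
A minimal Python sketch of that composite key, assuming a hypothetical
in-memory store (SessionKey, Agent and SessionStore are illustrative names,
not part of any cookie specification). The session is looked up by the
(origin, cookie) pair, so the same cookie value on another origin yields a
different session:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SessionKey:
    """The (origin, cookie) pair that together identifies one session."""
    origin: str
    cookie: str


@dataclass
class Agent:
    """Weakly identified agent: all the server knows is the UA string."""
    user_agent: str
    sessions: set = field(default_factory=set)


class SessionStore:
    """Hypothetical server-side store keyed by origin AND cookie."""

    def __init__(self):
        self._by_key = {}

    def agent_for(self, origin, cookie, user_agent):
        key = SessionKey(origin, cookie)
        if key not in self._by_key:
            agent = Agent(user_agent)
            agent.sessions.add(key)
            self._by_key[key] = agent
        return self._by_key[key]


store = SessionStore()
a = store.agent_for("https://w3.org/", "as342342sdaa", "Chrome46.0.2490.22")
b = store.agent_for("https://w3.org/", "as342342sdaa", "Chrome46.0.2490.22")
c = store.agent_for("https://shopping.com/", "as342342sdaa", "Chrome46.0.2490.22")
print(a is b)  # same origin and cookie: same session
print(a is c)  # same cookie, other origin: a different session
```

Nothing in the store relates `a` to `c`, which is the sense in which a cookie
on its own gives no cross-site identification.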

a. Same Origin Linkability

If the user wishes to make an identification between two sessions
(say he wants to continue work started on one device on another
device) then he will need to link two sessions on the same origin,
perhaps by using a one time password (more likely a persistent one).

This would allow the server to keep a structure such as this:

 [] a foaf:Agent;
    controls [ a UserAgent, Chrome46.0.2490.22;
       session [ 
            origin <https://w3.org/>; 
            cookie "as342342sdaa" 
        ]
     ];
    controls [ a UserAgent, Fennec9.0;
       session [ 
            origin <https://w3.org/>; 
            cookie "zoomdiboom-drlittle" 
        ]
     ] .

This shows the importance of linkability even on the same origin, and
that these two concepts (Linkability and SOP) are distinct. The distinction
shows up in an important way in FIDO later.
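
As a rough illustration of same-origin linking, here is a Python sketch
assuming a hypothetical server class (LinkingServer and its methods are
invented for this example). A single-use code issued to one session is typed
into the other session, after which both cookies map to the same agent:

```python
import secrets


class LinkingServer:
    """Hypothetical single-origin server that links sessions on request."""

    def __init__(self, origin):
        self.origin = origin
        self.agent_of = {}   # cookie -> agent id
        self.pending = {}    # one-time code -> agent id
        self._next_id = 0

    def new_session(self, cookie):
        self._next_id += 1
        self.agent_of[cookie] = self._next_id
        return self._next_id

    def issue_code(self, cookie):
        """Called from the session the user is already working in."""
        code = secrets.token_urlsafe(8)
        self.pending[code] = self.agent_of[cookie]
        return code

    def redeem_code(self, cookie, code):
        """Called from the second session; each code works only once."""
        target = self.pending.pop(code, None)
        if target is None:
            return False
        self.agent_of[cookie] = target
        return True


server = LinkingServer("https://w3.org/")
server.new_session("as342342sdaa")          # desktop browser
server.new_session("zoomdiboom-drlittle")   # mobile browser
code = server.issue_code("as342342sdaa")
linked = server.redeem_code("zoomdiboom-drlittle", code)
replayed = server.redeem_code("zoomdiboom-drlittle", code)
```

The linking is entirely within one origin and SOP is never involved, yet the
server ends up knowing that two user agents belong to one agent.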

b. X-Origin Linkability

The data structure can actually be used by the server to communicate
about the user to other origins, specifying which links
the user clicked, how much time the agent spent on a particular
page, etc. The advertising industry is built on this.

The user could actually use a one time password mechanism as
specified above to link two sessions identified by URIs on two
origins.

But more realistically the user may allow this agent linked to the session
to be tied to a global identifier such as a mailbox or an OpenID, and the
data structure on the server will then look like this:

 [] a foaf:Agent;
    foaf:mbox <mailto:joe@university.edu>;
    foaf:openid <https://university.edu/~joe/blog/>;
    controls [ a UserAgent, Chrome46.0.2490.22;
       session [ 
            origin <https://shopping.com/>; 
            cookie "as342342sdaa" 
        ]
     ];
    controls [ a UserAgent, Fennec9.0;
       session [ 
            origin <https://shopping.com/>; 
            cookie "zoomdiboom-drlittle" 
        ]
     ] .

So one can build up from a session to a globally verifiable identity
- in this case ownership of a mailbox, or control of a web page.

This then enables linkability, which allows trust to be built up
across domains.

The user is in control of revealing the global identifiers that
allow this linkability. 
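
A small Python sketch of what revealing a global identifier enables, assuming
two hypothetical origins that compare their records (the record layout and
the cookie values other than "as342342sdaa" are invented for illustration).
Sessions can be paired only where the user disclosed the same mailbox to both:

```python
def link_by_mbox(records_a, records_b):
    """Pair up sessions from two origins that revealed the same mailbox."""
    by_mbox = {r["mbox"]: r for r in records_a if r["mbox"] is not None}
    return [
        (by_mbox[r["mbox"]]["cookie"], r["cookie"])
        for r in records_b
        if r["mbox"] in by_mbox
    ]


# Records held by two different origins (hypothetical data).
shopping = [
    {"cookie": "as342342sdaa", "mbox": "mailto:joe@university.edu"},
    {"cookie": "anon-1", "mbox": None},
]
forum = [
    {"cookie": "zz91", "mbox": "mailto:joe@university.edu"},
    {"cookie": "anon-2", "mbox": None},
]

links = link_by_mbox(shopping, forum)
print(links)
```

The sessions that never disclosed a mailbox stay unlinked, which is the sense
in which the user controls this form of linkability.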

FIDO UAF
========

With UAF (after provisioning) the server's data structure could semantically
have the following shape, as I understand from reading the online
documentation:

 [] a foaf:Person; # <- clicked on a button, or swiped finger
    controls [ a SonyZ4Experia; # <- cryptographically verified Authenticator
       :haskey [ 
            origin <https://w3.org/>; 
            pubKey "aedf23....."
        ]
     ] .

FIDO UAF limits each public key to use with one origin. The same public key
found on two origins would be surprising and worrisome (someone else may
be in possession of the same private key) but would not identify the
same user. The device type is much more strongly verified, and so is the
Personhood of the user, who may be asked to click a button or swipe his
thumb. This removes the need for CAPTCHAs.

As David Longley pointed out [0] the FIDO protocol 
authenticates the public key statement in that graph and
its link to that origin.
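
To make the per-origin scoping concrete, here is a toy Python sketch. It is
not the UAF protocol: ToyAuthenticator is a hypothetical class, and the
"public key" is only a hash-derived placeholder standing in for a real key
pair. It shows just the bookkeeping, one key identifier per origin:

```python
import hashlib
import secrets


class ToyAuthenticator:
    """Stand-in for an authenticator that keeps one key per origin.

    Assumption: real devices generate an asymmetric key pair; here a random
    secret and its SHA-256 hash are placeholders for private and public key.
    """

    def __init__(self, model):
        self.model = model
        self._keys = {}  # origin -> (private placeholder, public placeholder)

    def register(self, origin):
        if origin not in self._keys:
            private = secrets.token_bytes(32)
            public = hashlib.sha256(private).hexdigest()
            self._keys[origin] = (private, public)
        return self._keys[origin][1]


phone = ToyAuthenticator("SonyZ4Experia")
key_w3 = phone.register("https://w3.org/")
key_shop = phone.register("https://shopping.com/")
key_w3_again = phone.register("https://w3.org/")
```

Each origin sees a stable identifier for the device, but two origins
comparing identifiers find nothing in common.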

A. SOP/User Control matrix for FIDO UAF

As opposed to cookies the user is technically put in control
of whether he wishes to create a public key identifier for
his machine for that origin. I have not seen a device with a
deregistration request, but presumably it has a comprehensible
User Interface that puts the User in control too.

B. Linkability with FIDO UAF

As can be seen, the data structure of the key is very similar to that
of cookies, but it is nearly impossible to spoof due to the
cryptographic properties of the verification that goes with it.
(That is what I meant by saying it is like a powerful cookie.
The differences are that the user is in control, and the extra
strong cryptographic properties.)

The public key together with the origin form an OWL key [2].
The tie in to the device is also much stronger, and so the
identification of the Person is cryptographically certain, rather
than being probabilistic as with cookies.

a. Same origin Linkability

As with cookies a user may want to link two keys generated by
different devices under his control in order to be able to
share information across devices. I asked this on twitter
last week https://twitter.com/bblfish/status/642009177741742080
and you ( @hillbrad ) agreed that one way to do this would
be a one time password. After this the server's data structure
would be:

 [] a foaf:Person; # <- clicked on a button, or swiped finger
    controls [ a SonyZ4Experia; # <- cryptographically verified Authenticator
       :haskey [ 
            origin <https://shopping.com/>; 
            pubKey "aedf23....."
        ]
     ];
    controls [ a GoTrustMicroSD;
        :haskey [
            origin <https://shopping.com/>; 
            pubKey "788aff....."
        ]
     ] .

Again we see how linkability is important even for the same origin. 

The user is in control here, but it may not be that difficult to fool
him. It may be as easy as getting him to send a link to his other
device: since the server knows that link could only have been seen by
that user, opening it lets the server link the two devices.

b. X-Origin Linkability

As with cookies the key+origin data structure can actually be used
by the Origin to communicate across origins about the behavior of
the user, which can be used for advertising. The more information
the origin gains from the interactions with the user the more
identifying information it will be able to place in its data structure
for that user. As various studies have shown, it does not take many
bits of information to narrow down on an individual.
The addition of cryptographically precise device information is one
such important bit.
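
The arithmetic behind "not many bits" can be sketched in Python. The
population figure is a rough 2015 estimate and the independence of attributes
is an assumption of this sketch, not a result from the studies mentioned:

```python
import math

WORLD_POPULATION = 7_300_000_000  # rough 2015 estimate (assumption)


def bits_to_single_out(population):
    """Bits of information needed to pick one individual out of a population."""
    return math.log2(population)


def bits_revealed(fraction_sharing_attribute):
    """Bits revealed by an attribute shared by the given fraction of users."""
    return -math.log2(fraction_sharing_attribute)


def remaining_candidates(population, bits_known):
    """Expected anonymity set size, assuming the known bits are independent."""
    return population / 2 ** bits_known


bits_needed = bits_to_single_out(WORLD_POPULATION)
print(round(bits_needed, 1))
print(bits_revealed(1 / 1024))
print(remaining_candidates(WORLD_POPULATION, 33) < 1)
```

Under these assumptions roughly 33 bits suffice, and an attribute shared by
one user in 1024 already contributes 10 of them.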

If the site has any stickiness properties, and manages to engage
the user, it will not be long before some other information
creating linkability is revealed. I leave it as an exercise for
the reader to show how a foaf:mbox and foaf:openid can be added to the
data structure on the server. The possibility of this is made explicit by
the section on other technologies of the FIDO UAF Architectural
Overview [1] that I mentioned. That type of verification requires the Identity
Provider (named Federated Relying Party by FIDO) to have a server
certificate that is open to cross origin use, so that the attributes
passed along by that Identity Provider have independent value. This
type of attribute verification occurs using public key cryptography
authentication across origins.

So we have a SOP based architecture that allows the user to decide if 
he wants x-origin linkability. 

JS Crypto API
=============

The JS Crypto API (WebCrypto) allows a server to run JS on the client which can
then create a public/private key pair stored locally in the browser (in
practice in IndexedDB). Only JS from that Origin can use that private key,
and if the key is created as non-extractable the script cannot read the raw
key material either (otherwise the script, and so the server, can also access
the private key). The server could then put together a graph with
the following information:

 [] a foaf:Agent;
    controls [ a UserAgent, Chrome46.0.2490.22;
       :haskey [ 
            origin <https://shopping.com/>; 
            pubKey "aedf23....."
        ]
     ] .

Notice how this is half way between a cookie and what FIDO
offers. The browser is still weakly identified by the 
User Agent String, but there is a strong cryptographic 
identity for JS from that origin running in that browser.

A. SOP/User Control matrix of WebCrypto

The user is in control here of which origin he runs his JS on,
but is not in control of that origin's setting the
cryptographic key material. In effect the user's control of the
cryptographic key is the same as the user's control of a cookie.

B. Linkability of WebCrypto

a. same origin linkability

This is the same as with FIDO and cookies. The user can
make the connection through a one time password between
two browsers or two devices.

b. x-origin linkability

Here things diverge somewhat from Cookies or FIDO.
Cookies can only be set with one User Agent for that origin,
and FIDO only allows signing to one origin.

With WebCrypto on the other hand the origin's JS can
connect to multiple origins and use the private key
to tie a connection over XMLHttpRequest to that
public key and an identifier.

For example the Origin could publish the public key 
material on the web server giving the agent and the 
key a global identity with a document such as this 

<123#> a foaf:Agent;
    controls [ a UserAgent, Chrome46.0.2490.22;
       cert:key <#key1> ;
    ] .

<#key1> cert:modulus "aedf23.....";
        cert:exponent 65537 .

This can then be used as described previously by either
  • Andrei Sambra's first sketch of an authentication protocol
      https://github.com/solid/solid-spec#webid-rsa 
  • Manu Sporny's more fully fleshed out HTTP Message signature
      https://tools.ietf.org/html/draft-cavage-http-signatures-04 

to authenticate that application to different origins. Is this a big
problem? Well, it gives the user less control than the current built-in
X509 Certificate authentication, since the chrome there asks the user
which certificate to use. On the other hand the server could make the
connection itself via a CORS proxy to the other origins and get the
same result. So there is not much that can be done.
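
For readers who have not looked at the HTTP signature draft, here is a rough
Python sketch of the idea. The signing string follows the layout the draft
describes (lowercased header names, one per line, with "(request-target)"
carrying method and path). Everything else is an assumption made for
illustration: the key is a toy textbook RSA key far too small to be secure,
"toy-rsa" is not an algorithm name from the draft, the signature is left as
a decimal number rather than base64, and the keyId URL is hypothetical.

```python
import hashlib

# Toy textbook RSA key (p=61, q=53): insecure, for illustration only.
N, E, D = 3233, 17, 2753


def signing_string(method, path, headers, names):
    """Build the string to sign from the selected header names, in order."""
    lines = []
    for name in names:
        if name == "(request-target)":
            lines.append(f"(request-target): {method.lower()} {path}")
        else:
            lines.append(f"{name.lower()}: {headers[name]}")
    return "\n".join(lines)


def _digest(text):
    """Hash the text and reduce it so the toy modulus can handle it."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest(), "big") % N


def sign(text):
    """Sign with the private exponent (done by the Single Page Application)."""
    return pow(_digest(text), D, N)


def verify(text, signature):
    """Verify with the published public key (done by any other origin)."""
    return pow(signature, E, N) == _digest(text)


def authorization_header(key_id, names, signature):
    joined = " ".join(names)
    return (f'Signature keyId="{key_id}",algorithm="toy-rsa",'
            f'headers="{joined}",signature="{signature}"')


headers = {"host": "shopping.com", "date": "Wed, 16 Sep 2015 16:00:09 GMT"}
names = ["(request-target)", "host", "date"]
to_sign = signing_string("GET", "/orders", headers, names)
signature = sign(to_sign)
header = authorization_header("https://w3.org/123#key1", names, signature)
print(header)
print(verify(to_sign, signature))
```

The receiving origin only needs the public key published at the keyId URL,
which is why nothing here is bound to the origin that created the key.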

The Single Page Application here fills the role that the Browser
Chrome previously played with X509+TLS. The problem for the web if client
certificates are no longer present in the browser (or some equivalent
cross origin authentication as proposed by Manu or Andrei) is that
if the user wishes to inspect the remote access controlled resources
independently of the Single Page Application he will have difficulty
doing so, because the Browser no longer has an identity it can use
apart from that of the origin.

Client Certificate + TLS
========================

With <keygen> ( or an improved version of it we hope for ) the browser
can without JS receive a certificate from an origin that the browser 
can use across origins.

A. SOP/User Control matrix of Client Certificates

As shown in the video presented at the Identity Workshop in April
2011 [4] the browser puts the user in control
 • of creating the public/private key pair: it's a form submission, a
  technology on which e-commerce is based
 • on receiving the certificate, of knowing that it has been received, and in
  some browsers of whether it should be installed
 • on connecting to a remote site that asks for a certificate,
  of which certificate, if any, should be used

As with the JS Crypto API the certificate can be used across origins, but
the limitation to being used by only one application is lifted. (This
is ok because the user is in control.)

This comes with a lot of the cryptographic advantages of FIDO
except that the linkability is built in. So this provides more efficiently
what FIDO+OpenID or FIDO+OAuth provide.

Furthermore this technology is not limited to the current use of TLS. One could
use an HTTP mechanism that would fit neatly with HTTP/2.0 ( SPDY ) as described
in the links I placed in the JS Crypto API section regarding Andrei's and Manu's
protocols. Or it could also wait for improvements worked upon in the TLS 1.3 group
and discussed on the HTTP mailing list:

• starting: https://lists.w3.org/Archives/Public/ietf-http-wg/2015AprJun/0558.html
• most recent: https://lists.w3.org/Archives/Public/ietf-http-wg/2015JulSep/0310.html
 
It is clear that this is not in competition with FIDO or WebCrypto, but complementary
to those. I can even imagine them being integrated at some point.

What is the advantage of this?

Well, we can imagine starting with FIDO or WebCrypto on a web site to create an account
which we later fill in with a lot of information (some of it public, a lot of it
protected [5]), information that we later want to avoid duplicating across web
sites. Perhaps we actually don't want to create a new account on one web site but want
to identify with an existing account from a different origin, so that
when we change information there, we don't need to change it everywhere else.
We also need this to create distributed social networks.

Ok, so I hope this shows why an analytic philosophy degree combined with 30 years
of computing experience is useful :-)

> 
> The diagrams you point to show how FIDO can be a secure initial-authentication step to an entity that can provide identity assertions or claims in a federated manner through existing protocols that also build on top of and work within the web and the SOP.  These kinds of systems are already successfully deployed in many contexts, for many years, with users in the billions.  Decoupling authentication from identity, and building both over and in concert with the core security and privacy architecture of the web, supports user choice and permissionless innovation.

Does FIDO have users in the billions? You mean OpenID, OAuth and cookies I suppose.

OpenID & OAuth enable x-origin authentication. FIDO by design does not but requires
those other protocols to do what billions of people need to do.
 
> 
> Meanwhile, in the <keygen> + x.509 + TLS client cert world, the privacy-oblivious, inextricable conflation of identity and authentication makes intrusive user experiences mandatory.  These experiences are inconsistent, have barely evolved in 20 years, and there is no evidence that users actually understand the consequences of the decisions they are being asked to make.

X509 + TLS is not privacy oblivious. It would be if the user interface in the
browser did not give the user control of when and where he uses a certificate. But
it does. So at that level it is no more privacy oblivious than OpenID or OAuth.
It's a tool for a particular job.

I have shown how you can build up from WebCrypto to a global identity. This 
is not dissimilar to the way FIDO works with OpenID and OAuth.  You can 
start with a crypto identifier, and then tie that to a global one, and 
use the one to prove ownership of the other.

What matters to us is not X509 and TLS, but the ability to do cross origin
authentication, which is here to stay with FIDO and will continue
to spread, as OpenID and OAuth show, and as will become clear with deployment
of WebID and WebKey technologies.

>  User Choice as a principle doesn't mean that we should ask the user to make choices they don't understand or which can harm them.  ActiveX popped up plenty of "opportunities for choice" and gave me the "choice" to run arbitrary code in the course of ordinary web activities, but it was fundamentally user-hostile because it enabled designs that forced the user to accept risk and harm in order to access services on the web.  

We agree here.

ActiveX gave developer choice not user choice. Those are very different.
You are making the case for <keygen>: it offers simplicity of use over the 
complexity-but-power of the WebCrypto API. <keygen> gives users the ability 
to control its use. WebCrypto gives developers the ability to build things, 
but not being tied into the chrome, it does not give the user control.

> The SOP is exactly about enabling user choice to freely browse the web and interact with services without having to be concerned about certain classes of harm.  It's not perfect, but it's the best we've come up with, and the more consistent we are about it, the better it works.

I think I have shown that SOP is not the key criterion to judge these technologies
on. What is interesting in FIDO is user control. If SOP were the full answer
then WebCrypto would have all the good properties of FIDO too. Things are
just more complicated than the simple SOP/not-SOP dichotomy may lead one to believe.
We have at least three dimensions to consider: SOP, User Control and linkability.

> 
> <keygen> entangles being identified with being authenticated,

Here I'll say I am in complete agreement with David Longley's recent
reply [0].

Not more than OAuth or OpenID do. They both use cryptography to prove statements.
Manu Sporny's HTTP Signatures and Andrei Sambra's WebID-RSA prototype show how you can
start with public keys and use those to prove an identity without using certificates,
or X509, or TLS. This is no different from what is actually happening with
OpenID or OAuth.

FIDO does not do it, but that does not mean that it is bad. The proof is that you cite
these other technologies in your architectural document.

> locks the experiences and evolution of the direct relationships between users and services to the inconsistent and slow moving world of browser UI, 

So you are saying hardware players can move faster than browsers, that they can do
a better UI than browser vendors? It's a bit insulting to browser vendors, but you
seem to be getting away with it :-)

> violates the SOP and forces the cost of that damage onto the user, and it puts authentication at a layer (the TLS handshake) where it is fundamentally problematic to the commonplace scalability and performance architecture of anything but hobbyist-level applications.  

As mentioned above none of this requires TLS client certificate authentication. We have
used it because it actually works in most browsers. With the appearance of HTTP 2.0
the same concepts can be moved to the HTTP layer.

What we are arguing here for is not <keygen> but the capability of the browsers to
keep their own cross origin, user controlled identity. All that they need to 
do is evolve it. 

> 
> That's why it's being deprecated.

A bit too fast IMHO. In any case you need not take sides on this. The existence
of <keygen> should not be problematic for FIDO.

Henry

[0] https://lists.w3.org/Archives/Public/public-webappsec/2015Sep/0100.html
[1] https://fidoalliance.org/wp-content/uploads/html/fido-uaf-overview-v1.0-ps-20141208.html
[2] http://www.w3.org/TR/2012/REC-owl2-primer-20121211/#Keys
[3] https://lists.w3.org/Archives/Public/public-webappsec/2015Sep/0088.html
[4] http://bblfish.net/blog/2011/05/25/
[5] see the Web ACL work on the http://webid.info/spec/ page

> 
> On Tue, Sep 15, 2015 at 2:29 PM Henry Story <henry.story@co-operating.systems> wrote:
>> On 15 Sep 2015, at 21:14, Tony Arcieri <bascule@gmail.com> wrote:
>> 
>> On Mon, Sep 14, 2015 at 10:08 AM, Rigo Wenning <rigo@w3.org> wrote:
>> The same argumentation has already be used during the rechartering of the WebCrypto Group. The privacy argument used by people from one of the largest origins is funny at best. If I use my token with A and I use my token with B, A and B have to communicate to find out that I used them both.
>> 
>> Speaking as someone who attended WebCrypto Next Steps, the common theme to me was actually a fundamental incompatibility between PKCS#11 APIs and how web browsers operate. Many talks alluded to some sort of "bridge" or "gateway" or "missing puzzle piece" to connect the Web to PKCS#11 hardware tokens. Unfortunately there were no concrete proposals from either a technical or UX perspective. It was mostly a dream from all of the vendors, realized in slightly different vague handwavy visions, of how someone could swoop in and magically solve this problem for everyone. Clearly dreams without actual technical proposals didn't go anywhere.
>> 
>> The reality is the SOP is the foundational security principle of the web. Period. Introducing SOP violations is a great way to ensure browser vendors don't adopt proposals.
> 
> SOP is a technical Principle, which is trumped by the legal principle of User Control. I made this point in the thread to the TAG
> https://lists.w3.org/Archives/Public/www-tag/2015Sep/0038.html
> and it fits the TAG finding on Unsanctioned Tracking
> http://www.w3.org/2001/tag/doc/unsanctioned-tracking/
> 
> Current browsers respect user control with regard to certificates - some better than others. It can be improved, but that can best be done through legal and political pressure.
> 
> If the user is asked if she wants to authenticate with a global ID to a web site, then that is her prerogative.
> As long as she can select the identity she wishes to use, and change identity when she wants to, or become anonymous: she must be in control. 
> 
> Privacy is actually improved in distributed yet connected services. By having distributed co-operating organisation each in control of their information, each can retain their autonomy. As an example it should be quite obvious the the police cannot share their files with those of the major social networks, nor would those have time to build the tools for the health industry, nor would those work for universities, etc. etc... The world is a sea of independent agencies that need to look after their own data, and share it with others when needed. In order to share the data, they need to give other agents access, they need to *control* access, which means you need some form of global identities that can be as weak as temporary pseudonyms, or stronger. In fact they can evolve out of pseudonyms into strong identifiers on which a reputation has built itself.
> 
> Of course anonymous or one site identities should be the basis of that and FIDO provides that, but not more.
> On top of FIDO you can see OpenID, SAML and OAuth profiling themselves very clearly in the FIDO UAF Architectural Overview:
> 
> https://fidoalliance.org/wp-content/uploads/html/fido-uaf-overview-v1.0-ps-20141208.html#relationship-to-other-technologies
> 
> These global identities based on URIs won't go away in a Web that is a web of billions of organisations, individuals and things, connected and interconnected. 
> 
>> 
>> Going back to your original point though, what you're describing is of course extremely commonplace on the web. Web sites often leverage multiple advertising and analytics networks, so "A and B communicating" (perhaps vicariously via ad or analytic network C) is so exceedingly commonplace I'm not sure what you're even suggesting. This already happens practically everywhere all the time, to many commonly shared third parties who very much want stable identifiers to link users.
>> 
>> There are of course ample other signals by which ad and analytics networks can track people with (IP addresses and "supercookies" certainly come to mind), but by brushing it off, you're actually suggesting that cryptographic traceability should be built into the fundamental cryptographic architecture of the Web. This is a slippery slope argument, i.e. things are bad, so why not make them worse?
>> 
>> There's no going back from that, short of throwing it all away (as what is likely to happen to the <keygen> tag soon) and starting over from scratch (ala FIDO).
> 
> Here is the picture from the architectural UAF document mentioned above. (It fails to mention that in many cases after an OAuth or OpenID the Relying Party communicates with the Federation Party until recently called the identity provider.)  So really FIDO is just setting a super strong cookie with some cryptographic properties which then more and more often needs to be bolted onto actual Identity Providers. All of this relies on server side cryptographic keys tied to TLS, so that the major parties are in effect using certificates for global authentication.
> 
> <PastedGraphic-1.png>
> 
> ( They could also have added WebID to the mix http://webid.info/ but that is annoyingly simple, which is why I like it :-)
> 
> Now WebID is not tied logically to TLS [1]. We have some interesting prototypes that show how WebID could work with pure HTTP.
>   • Andrei Sambra's first sketch with 
>       https://github.com/solid/solid-spec#webid-rsa
>   • Manu Sporny's more fully fleshed out HTTP Message signature
>       https://tools.ietf.org/html/draft-cavage-http-signatures-04
> these two could be improved and merged.
> 
> This would allow a move to use strong identity in  HTTP/2.0 ( SPDY ) , which could then be supported by the browsers who could build User Interfaces that give the users control of their identity with user interfaces described by your Credentials Management document developed here
>   https://w3c.github.io/webappsec/specs/credentialmanagement/#user-mediated-selection
> 
> Funnily enough one should already be able to try Andrei or Manu's protocol in the Browser with JS, WebCrypto, and ServiceWorkers [2]. There are many things to explore here [3] and the technology is still very new. [4]
> 
> But this shows that WebCrypto actually allows one to do authentication across origins. ( And how could
> it not, since it is a set of  low level crypto primitives? ). A Single Page application from one site can publish
> a public key and authenticate to any other site with that key. 
> 
> So its a bit weird that SOP is invoked to remove functionality that puts the user in control in the browser, 
> and that WebCrypto is then touted simultaneously as the answer, when in fact WebCrypto allows Single Page Applications (SPA) to do authenticate across Origins, and do what the browser will no longer be able to do.
> Lets make sure the browser can also have an identity. It's an application after all!
> 
> Let's stop using SOP as a way to shut down intelligent conversation. Let's think about user control as the aim.
> 
> Henry
> 
> 
> [1] as it may seem from the TLS-spec which we developed first because it actually worked.
> http://www.w3.org/2005/Incubator/webid/spec/
> [2] http://www.html5rocks.com/en/tutorials/service-worker/introduction/
> [3] https://lists.w3.org/Archives/Public/www-tag/2015Sep/0051.html
> [4] which is why it is wrong to remove 15 year old <keygen> technology on which a lot of people depend 
> around the world as explained so well by Dirk Willem Van Gulik on the blink thread before a
> successor is actually verified to be working. 
>  See Dirk's messages here:
>     https://groups.google.com/a/chromium.org/d/msg/blink-dev/pX5NbX0Xack/GnxmmtxSAgAJ 
>     https://groups.google.com/a/chromium.org/d/msg/blink-dev/pX5NbX0Xack/4kKNMVCdAgAJ
>  Also note that Japan is moving to eID
>     http://www.securitydocumentworld.com/article-details/i/12298/
>  that sweden has a huge number of people there, and that it might not be too good to get noticed by
> states: Russia and the EU just  this month started anti trust investigations  against Google for example.
>    
>> 
>> -- 
>> Tony Arcieri
Received on Wednesday, 16 September 2015 16:00:09 UTC