[CORS] Security models and confusion about credentials from Austin William Wright on 2013-09-01 (public-webappsec@w3.org from September 2013)

From: Austin William Wright <aaa@bzfx.net>
Date: Sun, 1 Sep 2013 13:46:18 -0700
To: public-webappsec@w3.org
Message-ID: <CANkuk-XFXCg2RQGi9An9oPFwmNgym=ifj8nsid2KA5CufnVutw@mail.gmail.com>
I've found the Origin security model to be ill-suited for the Web, and I'm
trying to move my application away from the origin security model entirely,
and towards a bearer security model where the possession of credentials (a
"bearer token" or "reference") to a resource is what grants access to that
resource. Allow me to formulate my understanding of Web security models, so
I can discuss some concerns with the CORS specification. I'm unsure whether
there's a better name than "bearer security model"; I haven't seen one.

In the origin security model, access is policed by the user-agent and
granted to any resource within the same "origin". Origins are generically
defined, which isn't suitable for all (or even most) applications. It
wrongly assumes that all content within a domain/port/scheme tuple will
have mutual trust. As a consequence, if I operate a wiki or document
publishing service for users, I must allow only a subset of HTML; otherwise
users could craft malicious documents that would be insecure in many user
agents (specifically, Web browsers). This security model isn't even secure
per se (by itself): If you allow any sort of cross-domain
forms/embedding/linking, then we'll find that additional security measures
are necessary to prevent a "Confused Deputy" attack (the CSRF token I
describe below).

I believe a better security model for the Web is the bearer security model.
The bearer security model requires that if any resource within a user agent
wishes to operate on another resource, it needs a reference to the
resource it wishes to view or modify (in the form of credentials, or a
"bearer token"). The CSRF token is one example of an implementation of the
bearer security model: Third party websites do not have access to a
CSRF/bearer token, and are thus unable to modify the target resource.
Finally, the bearer security model also works for resources which have no
origin, like the local filesystem.
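
A minimal sketch of the CSRF-token-as-bearer-token idea described above, under
stated assumptions: the helper names (`issue_token`, `may_modify`), the
HMAC-based token format, and the `SERVER_KEY` secret are all hypothetical and
chosen only for illustration. The point is that the server checks possession
of the token, not the origin of the caller.

```python
import hashlib
import hmac
import secrets
from typing import Optional

# Hypothetical server-side secret; a real deployment would load it from config.
SERVER_KEY = secrets.token_bytes(32)


def issue_token(resource: str) -> str:
    """Return a token that grants access to `resource` (illustrative format)."""
    return hmac.new(SERVER_KEY, resource.encode("utf-8"), hashlib.sha256).hexdigest()


def may_modify(resource: str, presented_token: Optional[str]) -> bool:
    """Allow the operation only if the caller presents the matching token."""
    if presented_token is None:
        return False
    expected = issue_token(resource).encode("utf-8")
    # Constant-time comparison to avoid leaking the token via timing.
    return hmac.compare_digest(expected, presented_token.encode("utf-8"))
```

A third-party page that never received the token for a resource cannot produce
one, so `may_modify` rejects its request regardless of which origin sent it.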

A URI is not used as a reference/bearer token. Third parties, including
unauthorized servers and user agents, still need to talk _about_ a resource
without having access _to_ it. Instead of using the resource identifier as
the grant of permission (as a computer program might, where possessing a
pointer might imply an ability to read or write to that memory), the bearer
token is instead passed via request metadata, the "user credentials"
headers in CORS. Additionally, it is useful to encode metadata into the
token, like an associated user. I adopt the RFC 6750 OAuth Bearer Token
usage wherever possible: In XMLHttpRequest and wherever else possible, it
is passed as an `Authorization: Bearer` header; in forms, I use the
access_token field (this is functionally the CSRF token too); and for
identifying users on a website, a cookie (always sandboxed into read-only
access).
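
As a rough sketch of the token-passing convention above, the server might look
for the token in the `Authorization: Bearer` header first and fall back to the
`access_token` form field. The function name `extract_bearer_token` and the
plain-mapping representation of headers and form data are assumptions made for
the example; cookies are deliberately ignored, since they only identify the
user and never grant access.

```python
from typing import Mapping, Optional


def extract_bearer_token(
    headers: Mapping[str, str], form: Mapping[str, str]
) -> Optional[str]:
    """Return the explicitly supplied bearer token, or None (hypothetical helper).

    Assumes header names are already normalized to the spelling used here.
    """
    auth = headers.get("Authorization", "")
    scheme, _, value = auth.partition(" ")
    # The scheme name is compared case-insensitively in this sketch.
    if scheme.lower() == "bearer" and value.strip():
        return value.strip()
    # Fall back to the form field, which doubles as the CSRF token.
    token = form.get("access_token")
    return token if token else None
```

For example, `extract_bearer_token({"Cookie": "session=1"}, {})` returns
`None`: a request carrying only a stored cookie presents no grant of access.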

Of course, due to the abilities that the origin security model grants,
fully using a bearer security model isn't going to be possible (at least
right away). For instance, arbitrary user scripts would still have access
to cookies and read access to same-origin resources with "simple" requests.
(Of course, allowing arbitrary scripts would still be a bad idea for a
number of other reasons.)

What I would like to do, however, is enable features of my application to
work using XMLHttpRequest, both from within my application (user content
and trusted content) and from third-party websites (if so granted). In
order to do this, I need to make sure that whenever a third-party script
makes an HTTP request, it does not do so with any user-agent data like
stored cookies or stored Authorization sessions, only data as explicitly
set by the calling script. If a script were able to make a readable request
and it was sent with cookie data, it would be able to read Bearer
tokens/CSRF tokens off a form. That is, it would be unintentional sharing
of permission-granting references, which violates the basis of the bearer
security model.

This requirement, that requests not be made with credentials
(user-agent-stored cookies and such), applies even to same-origin requests.
I've been trying to read the CORS TR (now Proposed Recommendation) to see
how to do this, and I'm left somewhat confused.

I feel the CORS specification still needs significant work: not major
changes, but elaboration or accommodations for Web applications like the
one I've described.

In four places, the report says:

    Note: [CORS] The string "*" cannot be used for a resource that supports
credentials.

One, this note is only informative; where is this normatively specified? I
think the term "supports credentials" is very misleading. The question is
not whether the resource "supports" credentials, but whether it's willing
to accept default, user-agent-stored credentials (which almost certainly it
should not, if the resource is going to be readable by third parties). A
better term might be "accept credentials" or "send stored credentials".
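
To make the rule in the quoted note concrete, here is a minimal server-side
sketch, assuming a hypothetical helper `cors_response_headers` and a
hypothetical `trusted_origins` allow-list. It returns `*` only when stored
credentials are not accepted, and otherwise names a single origin explicitly
alongside `Access-Control-Allow-Credentials: true`.

```python
from typing import Dict, Optional, Set


def cors_response_headers(
    origin: Optional[str],
    allow_stored_credentials: bool,
    trusted_origins: Set[str],
) -> Dict[str, str]:
    """Choose CORS response headers for one request (illustrative only)."""
    if origin is None:
        # No Origin header was sent, so no CORS headers are added.
        return {}
    if not allow_stored_credentials:
        # Readable by anyone, but only for requests without stored credentials.
        return {"Access-Control-Allow-Origin": "*"}
    if origin in trusted_origins:
        # "*" may not be combined with credentials; name the origin instead.
        return {
            "Access-Control-Allow-Origin": origin,
            "Access-Control-Allow-Credentials": "true",
            "Vary": "Origin",
        }
    # Credentialed access from an unknown origin: grant nothing.
    return {}
```

Under the bearer model described here, most resources would take the
`allow_stored_credentials=False` branch and rely on the explicit token.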

Two, is there some guarantee that a CORS-compliant user-agent will not send
credentials with a request that will be readable?

Is this true even for same-origin requests, maybe using some work-around
(like not permitting the Origin header to appear with Cookie or
Authorization Basic)?

Three, I desire to expose all the HTTP headers to XMLHttpRequest and other
APIs (on the basis that if I didn't want the user-agent to know about it, I
wouldn't send the header). Is there some reason this might be dangerous?
And is there some method of specifying this? I cannot be sure of which
headers, exactly, will be sent with the actual request, when processing the
pre-flight request. It seems like the solution is to list every single
header I could possibly send in `Access-Control-Expose-Headers`, which
seems excessive.
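
One possible workaround, sketched under the assumption that
`Access-Control-Expose-Headers` is attached to the actual response (where the
full header set is known) rather than to the preflight response: derive the
list from whatever headers the response already carries. The helper name
`with_exposed_headers` is hypothetical, and the set of "simple response
headers" below is the one the CORS specification exposes by default.

```python
from typing import Dict, Mapping

# Response headers that CORS exposes to scripts without being listed.
SIMPLE_RESPONSE_HEADERS = {
    "cache-control",
    "content-language",
    "content-type",
    "expires",
    "last-modified",
    "pragma",
}


def with_exposed_headers(response_headers: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the headers with every non-simple header exposed."""
    exposed = [
        name
        for name in response_headers
        if name.lower() not in SIMPLE_RESPONSE_HEADERS
        and not name.lower().startswith("access-control-")
    ]
    result = dict(response_headers)
    if exposed:
        result["Access-Control-Expose-Headers"] = ", ".join(exposed)
    return result
```

This avoids maintaining a static list of every header the application might
ever send, at the cost of computing the value per response.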

I have some other, somewhat unrelated questions and comments:

One, why are they called "credentials"? The intent should be to not send
/any/ user-agent-stored information. Credentials may be the best term I can
think of, but I still consider it misleading. (Perhaps "stored information"
or "stored data", and thus "send stored credentials" becomes "send stored
data". Then again, perhaps this is too generic a term.)

And two, section 6.1 "Simple Cross-Origin Request, Actual Request, and
Redirects" has this note:

    Note: By not adding the appropriate headers resource can also clear the
preflight result cache of all entries where origin is a case-sensitive
match for the value of the Origin header and url is a case-sensitive match
for the URL of the resource.

I don't believe this makes grammatical sense (is there a missing "the"
before "resource"?)

Austin Wright.
Received on Sunday, 1 September 2013 20:46:46 UTC