Re: Comment on minutes ## With Credentials flag etc from Jonas Sicking on 2016-01-21 (www-tag@w3.org from January 2016)

From: Jonas Sicking <jonas@sicking.cc>
Date: Wed, 20 Jan 2016 17:24:14 -0800
To: Mark Nottingham <mnot@mnot.net>
Cc: Tim Berners-Lee <timbl@w3.org>, Public TAG List <www-tag@w3.org>
Message-ID: <CA+c2ei_OnDcQF+EhsRi0EKztu4UT+HXz4+ZyARkqOQ=sb0Xc9A@mail.gmail.com>

On Mon, Jan 18, 2016 at 3:27 PM, Mark Nottingham <mnot@mnot.net> wrote:
> ... or at least the motivations behind the decisions explained. It's pretty impenetrable now, and even security folks don't profess to know all of the details behind CORS any more.

I'm bummed to hear that aspects of CORS are still confusing even to the
TAG. This stuff likely needs to get documented someplace. I had hoped
that it'd get documented in the spec, but maybe there's a better
place?

I'm not actually sure what exact confusion is being discussed in this
thread, so I'll address some of the questions I most commonly get.

Q: What does the withCredentials flag do?
A: When it's set to false, requests are sent containing only the
information provided by the requesting website. I.e. the requesting
website's provided URL, headers and request body. The only information
that's added by the browser is information that's hardcoded into the
browser and does not depend on user information. So for example the
user-agent header. No cookies, authentication headers, or client-side
certificates are added by the browser to the request before it is sent
to the target website.

However setting withCredentials to false does not prevent the
requesting website from adding credentials through cookie headers,
authentication headers, URL parameters or any other way that's exposed
through the API which triggered the request.

Additionally, the response data that would normally affect the client
data storage is ignored. So for example set-cookie response headers
are not written to the browser's cookie storage. The returned response
is also not stored in the normal HTTP cache, though if appropriate
browsers may store it in a specific
"CORS-requests-with-withCredentials-set-to-false" cache.

When withCredentials is set to true, requests are handled the way
"normal" requests are in a browser. That means that cookies from the
user's cookie storage are added based on the target URL. Cached
authentication data is added through the authentication header.

The response is likewise processed like normal, so set-cookie headers
are processed and the response is cached, if appropriate, in the
normal browser HTTP cache.
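
To make the two modes above concrete, here is a minimal sketch in Python
that models what a browser adds to a request and what it stores from a
response. It is a toy model, not browser code: BrowserState,
outgoing_headers, store_response, the "ExampleBrowser/1.0" string and the
host names are all made up for illustration, and client-side certificates
and caching are left out.

```python
from dataclasses import dataclass, field


@dataclass
class BrowserState:
    """Hypothetical stand-in for the per-user data a browser keeps."""
    cookies: dict = field(default_factory=dict)    # host -> cookie string
    http_auth: dict = field(default_factory=dict)  # host -> Authorization value


def outgoing_headers(site_headers, host, state, with_credentials):
    """Return the headers the modeled browser sends to `host`.

    Simplified: conflicts between site-supplied and browser-added
    headers are not modeled.
    """
    # Everything the requesting website supplied is sent in both modes.
    headers = dict(site_headers)
    # Hardcoded browser information that does not depend on the user.
    headers["User-Agent"] = "ExampleBrowser/1.0"
    if with_credentials:
        # User-specific data is attached only in this mode.
        if host in state.cookies:
            headers["Cookie"] = state.cookies[host]
        if host in state.http_auth:
            headers["Authorization"] = state.http_auth[host]
    return headers


def store_response(response_headers, host, state, with_credentials):
    """Write Set-Cookie to storage only for withCredentials=true."""
    if with_credentials and "Set-Cookie" in response_headers:
        state.cookies[host] = response_headers["Set-Cookie"]
```

Note that in this model a header the website itself passes in (for
example a token it obtained some other way) is still sent when
with_credentials is False, which matches the point above that the flag
does not stop the requesting website from adding its own credentials.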

Q: Why are security checks performed when withCredentials is set to false?
A: Because the user, and the user's browser, might be behind a
firewall and so might be able to access servers which a website would
otherwise not be able to access.

Sadly there is no mechanism known to me for detecting whether a given
server is behind a firewall.

Q: Is it safe to always set "Access-control-allow-origin: *" on all
responses from a server?
A: As long as the server is connected to the public internet, yes it
is. It does not leak any information that couldn't be loaded using
curl or any other non-browser HTTP client.

If the server is behind a firewall and might contain sensitive
information, the header should not be added.

Q: Why does CORS not allow "Access-control-allow-origin: *" together
with withCredentials=true?
A: It was felt that this was too big of a foot gun.

CORS was designed not long after Adobe had added the crossdomain.xml
feature to Flash Player. The crossdomain.xml feature allows webserver
administrators to easily indicate that the server contains resources
that should be loadable from other origins. The feature only allowed
"normal" requests, i.e. requests similar to ones that CORS makes when
withCredentials=true.

When crossdomain.xml was released, many websites opted in to allowing
data to be read from other websites in order to share some public data
that was hosted on the server. Unfortunately they forgot that some
other URLs on the server served sensitive user data. The result was
that relatively quickly after the release of crossdomain.xml, multiple
websites leaked sensitive user data.

You could argue that the problem was that crossdomain.xml was
different since it is a per-server configuration file, whereas CORS
uses per-URL headers. Hence CORS would be less prone to server
administrators accidentally opting in to sharing on URLs that serve
sensitive user data.

However in practice many (most?) popular web servers allow adding
configuration files which add static HTTP headers to all URLs under a
given directory. So in practice on many servers it would have been
just as easy to make the same mistake with CORS.
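
The rule this answer is about can be sketched as the check a browser
applies before letting the requesting page read a response. This is a
simplified model in Python, not the algorithm text from the spec:
response_readable and the example origins are made-up names, header
names are matched exactly as written, and preflights are ignored. The
Access-Control-Allow-Credentials header is part of CORS but is not
discussed elsewhere in this mail.

```python
def response_readable(request_origin, with_credentials, response_headers):
    """Return True if the modeled browser exposes the response to the page."""
    allow_origin = response_headers.get("Access-Control-Allow-Origin")
    if allow_origin is None:
        # The server did not opt in at all.
        return False
    if not with_credentials:
        # Without credentials, the wildcard or an exact match is enough.
        return allow_origin == "*" or allow_origin == request_origin
    # With credentials, "*" is rejected: the server has to name the
    # requesting origin and also explicitly allow credentials.
    if allow_origin != request_origin:
        return False
    return response_headers.get("Access-Control-Allow-Credentials") == "true"
```

So a blanket "Access-control-allow-origin: *" added to every URL under a
directory only ever exposes what an anonymous client could fetch anyway;
exposing per-user responses takes a deliberate, per-origin opt-in.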

Q: Why does CORS not allow listing multiple origins, or allow pattern
matching, in the "Access-control-allow-origin" header?
A: It was felt that if the server uses dynamic server-side logic to
generate responses for a given URL, it could also then
dynamically generate the appropriate Access-control-allow-origin
header.
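
A minimal sketch of what "dynamically generate the header" might look
like on the server, in Python. The allow-list contents and the
cors_headers helper are hypothetical; a real server would wire this into
whatever framework it uses. The Vary: Origin header is included because
the response now differs per requesting origin, which caches need to
know about.

```python
# Hypothetical allow-list of origins that may read responses.
ALLOWED_ORIGINS = frozenset({"https://app.example", "https://admin.example"})


def cors_headers(request_origin, allowed=ALLOWED_ORIGINS):
    """Build CORS response headers for the Origin sent with the request.

    `request_origin` is the value of the request's Origin header, or
    None if the request did not carry one.
    """
    # The response varies by Origin, so tell caches about it.
    headers = {"Vary": "Origin"}
    if request_origin in allowed:
        # Echo back the single matching origin instead of a list or pattern.
        headers["Access-Control-Allow-Origin"] = request_origin
    return headers
```

For example, cors_headers("https://app.example") includes
Access-Control-Allow-Origin: https://app.example, while an origin that
is not in the list gets no such header and the browser blocks the read.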

For servers that generate static responses you can generally simply
use "Access-control-allow-origin: *". Keep in mind that static
responses can generally be read from non-browser HTTP clients like
curl anyway.

This doesn't account for static responses which are password protected
using either cookies or auth headers. So yeah, our solution here is
not perfect, but we decided to opt for simplicity.

My personal hope was also that generic server modules would be written
to handle CORS support and which would simplify situations like this.
I'm not sure if such modules exist yet or not.

Q: CORS preflights might double the number of requests needed to talk to
servers that use HTTP APIs that use lots of different URLs. Why not
allow preflight answers to apply to multiple URLs?
A: We originally had a solution to this problem. The solution allowed
a preflight to apply to all URLs under a given directory. However it
turned out that some popular contemporary servers handled URLs in
weird ways which made it possible to map any URL into a URL under any
other directory. I.e. a request to A/script.cgi was equivalent to
B/<stuff here>/A/script.cgi. This made it very easy for developers to
misconfigure servers in ways that attackers could take advantage of.
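
The hazard can be illustrated with a toy model in Python. Everything
here is hypothetical: the resolve function is only a stand-in for the
kind of lenient URL handling described above (the actual behavior of the
servers in question is not reproduced here), and
covered_by_directory_grant models a preflight answer that applies to
every URL under a directory.

```python
# Hypothetical script the administrator never meant to expose cross-origin.
KNOWN_SCRIPTS = {"/A/script.cgi"}


def resolve(path):
    """Toy lenient server: any path ending in a known script path runs it."""
    for script in KNOWN_SCRIPTS:
        if path.endswith(script):
            return script
    return None


def covered_by_directory_grant(path, granted_dir):
    """Toy directory-wide preflight: the grant covers every URL under it."""
    return path.startswith(granted_dir)


# The administrator opts in only for directory /B/ ...
attack_path = "/B/anything/A/script.cgi"
# ... but the grant also covers this path, which the toy server treats
# as a request for /A/script.cgi.
reaches_a = covered_by_directory_grant(attack_path, "/B/") and resolve(attack_path)
```

In this model reaches_a ends up as "/A/script.cgi": a policy written for
B/ lets a cross-origin request reach a script under A/, even though a
direct request to /A/script.cgi is not covered by the grant.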

See also the concern above about it being easy for server
administrators to write policies for one URL forgetting that there are
other URLs in the same space which have different requirements.

However there have been some discussions about addressing this problem
on a server-wide basis for requests that have withCredentials=false.
If we do it on a server-wide basis that removes the problems around
strange URL parsing with some servers. And by only doing it for
requests with withCredentials=false it lessens the risk of leaking
sensitive user data.

I'm not sure what the latest status of these discussions is.

If I'm not addressing the concern/questions from the TAG then please
let me know.

I'd really love it if this type of information could make it into the
spec in a way that is understandable to more people.

/ Jonas

Received on Thursday, 21 January 2016 01:25:13 UTC