Re: Comment on minutes ## With Credentials flag etc from Brad Hill on 2016-04-19 (www-tag@w3.org from April 2016)

From: Brad Hill <hillbrad@fb.com>
Date: Tue, 19 Apr 2016 16:52:38 +0000
To: Tim Berners-Lee <timbl@w3.org>, Jonas Sicking <jonas@sicking.cc>
CC: Mark Nottingham <mnot@mnot.net>, Public TAG List <www-tag@w3.org>
Message-ID: <AF8BDB1A-4026-4FBE-B29C-26752D18E22B@fb.com>

I've (at last) attempted to write a developer-friendly narrative explanation of all of this stuff here. It covers how CORS works, the permission model, and some of the historical reasoning behind the various choices.

https://docs.google.com/document/d/1AtxTDw-g9BSRW9n9kGTTqNkDTGcVfSKPAOjVGkPFu2k/edit?usp=sharing


Yes, it's a Google doc for now since that was fastest for me to compose, but if people think it is useful, I can incorporate comments and maybe we can turn it into a WebAppSec WG note or joint TAG finding.

-Brad



On 4/1/16, 11:26 AM, "Tim Berners-Lee" <timbl@w3.org> wrote:

>Thank you Jonas, for that clarification.
>Some responses inline.
>
>> On 2016-01-21, at 01:24, Jonas Sicking <jonas@sicking.cc> wrote:
>> 
>> On Mon, Jan 18, 2016 at 3:27 PM, Mark Nottingham <mnot@mnot.net> wrote:
>>> ... or at least the motivations behind the decisions explained. It's pretty impenetrable now, and even security folks don't profess to know all of the details behind CORS any more.
>> 
>> I'm bummed to hear that aspects of CORS are still confusing even to the
>> TAG. This stuff likely needs to get documented someplace. I had hoped
>> that it'd get documented in the spec, but maybe there's a better
>> place?
>
>Perhaps the problems are that it is complicated, rather arbitrary, and not derived from general principles.
>
>> 
>> I'm not actually sure what exact confusion is being discussed in this
>> thread, so I'll address some of the questions I most commonly get.
>> 
>> Q: What does the withCredentials flag do?
>> A: When it's set to false, requests are sent containing only the
>> information provided by the requesting website. I.e. the requesting
>> website's provided URL, headers and request body. The only information
>> that's added by the browser is information that's hardcoded into the
>> browser and does not depend on user information. So for example the
>> user-agent header. No cookies, authentication headers, or client-side
>> certificates are added by the browser to the request before it is sent
>> to the target website.
>> 
>> However setting withCredentials to false does not prevent the
>> requesting website from adding credentials through cookie headers,
>> authentication headers, URL parameters or any other way that's exposed
>> through the API which triggered the request.
>> 
>> Additionally, the response data that would normally affect the client
>> data storage is ignored. So for example set-cookie response headers
>> are not written to the browser's cookie storage. The returned response
>> is also not stored in the normal http cache, though if appropriate
>> browsers may store it in a specific
>> "CORS-requests-with-withCredentials-set-to-false" cache.
>> 
>> When withCredentials is set to true, requests are handled like
>> "normal" requests in a browser. That means that cookies from the
>> user's cookie storage are added based on the target URL. Cached
>> authentication data is added through the authentication header.
>> 
>> The response is likewise processed like normal, so set-cookie headers
>> are processed and the response is cached, if appropriate, in the
>> normal browser http cache.
>> 
>> Q: Why are security checks performed when withCredentials is set to false?
>> A: Because the user, and the user's browser, might be behind a
>> firewall and so might be able to access servers which a website would
>> otherwise not be able to access.
>> 
>> Sadly there is no mechanism known to me for detecting if a given
>> server is behind a firewall.
>
>That’s a long rathole but ...
>1) If your local IP address is the same as the one you get from a public IP reflector then you are not behind a firewall
>2) If your IP address starts with 192.168… then you are behind a firewall …  
>3) BUT that isn’t the point, you can be outside a firewall and still have privileged access by your IP address.
>4) And you could also be behind a carrier-grade NAT box but not have any privileged access as a result.
>
>One possible but hard route is to pursue something like the router telling your machine whether it has no privileged access, which would then enable a lot of stuff.  So public internet spaces would set the flag, which would then mean the browsers would do fewer preflights, wasted attempts to access stuff, etc., and so the browser would run more quickly and use less bandwidth.
>
>> Q: Is it safe to always set "Access-control-allow-origin: *" on all
>> responses from a server?
>> A: As long as the server is connected to the public internet, yes it
>> is. It does not leak any information that couldn't be loaded using
>> curl or any other non-browser HTTP client.
>
>> 
>> If the server is behind a firewall and might contain sensitive
>> information, the header should not be added.
>
>Well, the header should be added for any public resource.
>Some servers in fact handle the access control for the different resources on the site, and so in that case they ought to use that function to drive the headers automatically.  THAT is what should be coded up in the common servers.
>That would help the server manager do the right thing.
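
A minimal sketch of the idea above, in which the server's existing access-control data drives the CORS header so that only public resources get the wildcard. The `ACL` table, its paths, and the `cors_headers_for` helper are hypothetical names used for illustration; they are not taken from any particular server.

```python
from typing import Dict, Mapping

# Hypothetical per-resource access-control table the server already maintains.
ACL: Mapping[str, str] = {
    "/data/public.csv": "public",
    "/accounts/me": "private",
}


def cors_headers_for(path: str, acl: Mapping[str, str] = ACL) -> Dict[str, str]:
    """Return the CORS headers to attach to a response for `path`.

    Only resources explicitly marked public get the wildcard header.
    Unknown or private paths get no CORS header at all (fail closed).
    """
    if acl.get(path) == "public":
        return {"Access-Control-Allow-Origin": "*"}
    return {}
```

Under these assumptions the server manager never writes CORS configuration by hand: marking a resource public is what produces the header.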
>
>
>> 
>> Q: Why does CORS not allow "Access-control-allow-origin: *" together
>> with withCredentials=true?
>> A: It was felt that this was too big of a foot gun.
>> 
>> CORS was designed not long after Adobe had added the crossdomain.xml
>> feature to Flash Player. The crossdomain.xml feature allows webserver
>> administrators to easily indicate that the server contains resources
>> that should be loadable from other origins. The feature only allowed
>> "normal" requests, i.e. requests similar to ones that CORS makes when
>> withCredentials=true.
>> 
>> When crossdomain.xml was released many websites opted in allowing data
>> to be read from other websites in order to share some public data that
>> was hosted on the server. Unfortunately they forgot that some other
>> URLs on the server served sensitive user data. The result was that
>> relatively quickly after the release of the crossdomain.xml multiple
>> websites leaked sensitive user data.
>> 
>> You could argue that the problem was that crossdomain.xml was
>> different since it is a per-server configuration file, whereas CORS
>> uses per-URL headers. Hence CORS would be less prone to server
>> administrators accidentally opting in to sharing on URLs that serve
>> sensitive user data.
>> 
>> However in practice many (most?) popular web servers allow adding
>> configuration files which add static http headers to all URLs under a
>> given directory. So in practice on many servers it would have been
>> just as easy to make the same mistake with CORS.
>
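
The rule discussed in the question above (a wildcard is not honoured together with withCredentials=true) can be sketched as a simplified model of the check a browser applies to a cross-origin response. This is an illustration only, not browser code: the function name `cors_check` and its parameters are assumptions, and preflight handling is left out.

```python
from typing import Mapping


def cors_check(request_origin: str,
               with_credentials: bool,
               response_headers: Mapping[str, str]) -> bool:
    """Return True if the page may read the cross-origin response."""
    # Header names are case-insensitive, so normalise them first.
    headers = {name.lower(): value for name, value in response_headers.items()}
    allow_origin = headers.get("access-control-allow-origin")

    if allow_origin is None:
        return False  # the server did not opt in at all
    if not with_credentials and allow_origin == "*":
        return True   # the wildcard only counts for credential-less requests
    if allow_origin != request_origin:
        return False  # with credentials, "*" is just a non-matching string
    if not with_credentials:
        return True
    # Credentialed requests additionally need an explicit opt-in.
    return headers.get("access-control-allow-credentials") == "true"
```

For example, `cors_check("https://app.example", True, {"Access-Control-Allow-Origin": "*"})` is rejected, while the same response is readable when `with_credentials` is False.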
>Any arguments about making things easy or difficult for server admins to
>shoot themselves in the foot come from a non-optimal attitude.
>To first order, the system must implement a security protocol which allows
>people to do the right thing — to give the right access to the right resources
>by the right people and origins.  Yes, by all means make the server
>
>Q: Why was reflecting the incoming origin in the header the thing which was picked
>as the way of saying “yes this really is public”?  Why not “access-control-allow-origin **” or something?
>It is a pain to code, needs two or three lines of not-newbie-obvious .htaccess in Apache, etc.
>Result? the recipe is sent around
>and new server code does it by default for everything.
>
>Because CORS is such a pain for developers to deal with on the client side, with no error codes, etc.,
>servers that want stuff to just work slap in the strongest CORS medicine they find on the net.
>
>
>
>> Q: Why does CORS not allow listing multiple origins, or allow pattern
>> matching, in the "Access-control-allow-origin" header?
>> A: It was felt that if the server uses dynamic server-side logic to
>> generate responses for a given URL, that they could also then
>> dynamically generate the appropriate Access-control-allow-origin
>> header.
>> For servers that generate static responses you can generally simply
>> use "Access-control-allow-origin: *".
>
>Well no, not if they only want 7 specific domains to have access.
>
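
For the case of a fixed set of permitted origins, the dynamic generation described above might look like the following sketch. The allowlist contents and the helper name `allow_origin_headers` are assumptions for illustration; the point is that the header is reflected only for origins on the list, rather than for every incoming origin.

```python
from typing import Dict, FrozenSet, Optional

# Hypothetical allowlist; a real deployment would load this from configuration.
ALLOWED_ORIGINS: FrozenSet[str] = frozenset({
    "https://app.example",
    "https://partner.example",
})


def allow_origin_headers(request_origin: Optional[str],
                         allowed: FrozenSet[str] = ALLOWED_ORIGINS) -> Dict[str, str]:
    """Build CORS headers for a request carrying the given Origin header."""
    # The response depends on the Origin request header, so shared caches
    # must be told to key on it.
    headers = {"Vary": "Origin"}
    if request_origin is not None and request_origin in allowed:
        # Echo back only an origin that is explicitly on the list.
        headers["Access-Control-Allow-Origin"] = request_origin
    return headers
```

Origins that are not listed simply receive no Access-Control-Allow-Origin header, so the browser blocks the read.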
>> Keep in mind that static
>> responses can generally be read from non-browser HTTP clients like
>> curl anyway.
>> 
>> This doesn't account for static responses which are password protected
>> using either cookies or auth headers. So yeah, our solution here is
>> not perfect, but we decided to opt for simplicity.
>> 
>> My personal hope was also that generic server modules would be written
>> to handle CORS support and which would simplify situations like this.
>> I'm not sure if such modules exist yet or not.
>
>There are lots. They may be turned on by default.  A concern is they tend to just defeat CORS 
>and they don’t necessarily distinguish between public resources and others.
>
>Also people use CORS proxies to access the web, which are associated
>
>
>> [...]
>> 
>> 
>> If I'm not addressing the concern/questions from the TAG then please
>> let me know.
>
>I think the top two issues the TAG had are
>
>a) Having the withCredentials flag as a parameter to fetch() is broken.  In general the middleware which calls fetch() will not have magic application-level knowledge of which of the resources it is going to fetch are public and which are private.  So in general the fetch has to work without that hint, and do the right thing.
>
>b) For a webapp which needs to load stuff from the net, the lack of clear error conditions makes it hard to understand what is going on.    A good 
>
>c) Asking server writers to do the origin reflection thing is unreasonable
>
>
>> 
>> I'd really love it if this type of information could make it into the
>> spec in a way that is understandable to more people.
>> 
>> / Jonas
>> 
>
>
Received on Tuesday, 19 April 2016 16:53:07 UTC