Re: [AC] Helping server admins not making mistakes from Thomas Roessler on 2008-06-13 (public-webapps@w3.org from April to June 2008)

From: Thomas Roessler <tlr@w3.org>
Date: Fri, 13 Jun 2008 18:07:44 +0200
To: Jonas Sicking <jonas@sicking.cc>
Cc: "WAF WG (public)" <public-appformats@w3.org>, public-webapps@w3.org
Message-ID: <20080613160744.GJ306@iCoaster.does-not-exist.org>

I think we've both been arguing this all over the place, and the
thread might be getting a bit incoherent.

So let's try to start over...

The question here is whether it makes sense to add fine-grained
controls to the authorization mechanisms to control -- in addition
to whether or not cross-site requests are permitted at all --:

  (a) whether or not cookies are sent
  (b) what HTTP methods can be used in cross-site requests.

I have two basic points:

1. *If* we have to have fine-grained controls of that kind, let's
please do them coherently, and within the same framework.  The
argument here is simply consistency.

2. We shouldn't do (a) above, for several reasons:

 - it adds complexity
 - it adds confusion (witness this thread)
 - it's pointless

I don't think I articulated the thinking behind the third of these
reasons very clearly.  The whole point of the access-control model
(with pre-flight check and all that) is that requests that can be
caused to come from the user's browser are more dangerous than
requests that a third party can make itself.

Consider a.example.com and b.example.com.  Alice has an account with
a.example.com and can wreak some havoc there through requests that
have the right authentication headers.

The purpose of having the access-control mechanism is:

- to prevent b.example.com from reading information at a.example.com
  *using* *Alice's* *credentials* (because b.example.com can also
  just send HTTP requests from its own server), unless specifically
  authorized

- to prevent b.example.com from causing non-GET requests to occur at
  a.example.com *using* *Alice's* *credentials* (because
  b.example.com can also just send HTTP requests from its own
  server), unless specifically authorized

So, if there is an additional way to authorize third-party requests,
but without Alice's credentials, we're effectively introducing an
authorization regime for the same requests that our attacker could
send through the network anyway, by using their own server -- modulo
source IP address, that is.  Is that really worth the extra
complexity in spec, implementation, and deployment?  I don't
think so.

(Oh, and what does a "no cookies" primitive mean in the presence of
VPNs or TLS client certificates?)
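
To make the argument concrete, here is a minimal sketch in Python.  The
Request fields and the function name are hypothetical, made up for
illustration; none of them come from the draft.  The sketch treats a
request sent through Alice's browser as giving b.example.com something
new only when it carries ambient authority that b.example.com's own
server does not have.

```python
from dataclasses import dataclass


@dataclass
class Request:
    # All fields are illustrative assumptions, not terms from the draft.
    has_cookies: bool = False
    has_http_auth: bool = False
    has_tls_client_cert: bool = False
    from_restricted_network: bool = False  # e.g. sent from inside a VPN


def adds_attacker_capability(req: Request) -> bool:
    """Return True if sending the request through Alice's browser gives
    b.example.com something it could not do from its own server."""
    return (
        req.has_cookies
        or req.has_http_auth
        or req.has_tls_client_cert
        or req.from_restricted_network
    )


# A request without any credentials, from the public network.
print(adds_attacker_capability(Request()))
# Cookies stripped, but the browser sits inside a VPN.
print(adds_attacker_capability(Request(from_restricted_network=True)))
```

Under these assumptions the first call reports no added capability,
while the second still does, which is the VPN and client-certificate
concern raised above.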


About the methods point, my concern is that the same people who are
clueless about methods when writing web applications will be
clueless about the policies.

Hope this helps,
-- 
Thomas Roessler, W3C  <tlr@w3.org>





On 2008-06-11 15:30:22 -0700, Jonas Sicking wrote:
> From: Jonas Sicking <jonas@sicking.cc>
> To: Jonas Sicking <jonas@sicking.cc>,
> 	"WAF WG (public)" <public-appformats@w3.org>, public-webapps@w3.org
> Date: Wed, 11 Jun 2008 15:30:22 -0700
> Subject: Re: [AC] Helping server admins not making mistakes
> List-Id: <public-appformats.w3.org>
> Archived-At: <http://www.w3.org/mid/485051FE.1000508@sicking.cc>
> 
>
> Thomas Roessler wrote:
>> On 2008-06-10 16:41:41 -0700, Jonas Sicking wrote:
>>
>>>>> Getting access to a user's cookie information is no small task.
>>>> I disagree.  There are any number of real-world scenarios in which
>>>> cookies are regularly leaked - JavaScript that's loaded from
>>>> untrusted sources, and captive portals are just two examples which
>>>> make people bleed cookies.  Basing the design here on the premise
>>>> that cookie-based authentication should somehow be enough to protect
>>>> against cross-site request forgery strikes me as unwise, in
>>>> particular when the cost is in additional complexity (and therefore
>>>> risk).
>>> Well, if you can get access to a user's cookies and auth information then 
>>> nothing that we do here matters at all. Or at least matters to a much much 
>>> smaller extent. This whole spec is basically here precisely to protect the 
>>> information that is protected by cookies and auth headers (and for most 
>>> sites only cookies).
>>
>> Ooops, I let your remark about "getting access to a users cookie
>> information" above lead me on a bogus path of argument.  That's what
>> I get for doing e-mail on a train. :-)
>>
>> The fundamental point is that (a) GET requests and (b) some POST
>> requests can be made with the user's cookies.  As far as web
>> application development is concerned, it's probably a really good
>> idea to consider the cross-site request horse (as opposed to the
>> cross-site information access horse) to have left the barn (even if
>> it hasn't left the barn totally).
>>
>> Now, the working group evidently doesn't want to make that
>> assumption for generic POST requests (which is fine); that's why we
>> have the pre-flight check.  That's fine with me.
>>
>> However, we're now getting to a point at which we try to prevent the
>> occurrence of certain cross-site requests, specifically with cookies.
>> The cost here is complexity in the protocol design and
>> implementation (which is complexity to the deployer).
>
> Hmm.. I'm confused. All requests we are trying to prevent are ones with
> cookies. First you say that you are ok with the pre-flight
> request, but then you say that you don't want to prevent certain
> cross-site requests. Isn't that the whole point of the pre-flight?
>
> Or are you saying that you're ok with
>
>> The benefit is questionable at best, in particular if you consider
>> POST: There are some POST requests that, under the proposed model,
>> couldn't be made using XMLHttpRequest, but could be made using
>> form.submit().  Do you really expect anybody to still understand
>> that?
>
> It is certainly not ideal, but it's better than any alternative I think. 
> Also note that this is already the case with the spec as is, without my 
> newly proposed additions.
>
> With the spec as is it is already not possible to do cross-site POSTs with 
> content-type text/plain using XHR without opt-in, while it is possible using 
> <form>.
>
> The only solutions to this that I can think of are these:
>
> 1. Leave things as is and accept that Access-Control prevents certain
>    requests (without opt-in) that <form> already allows.
> 2. Modify Access-Control such that it allows exactly those requests that
>    are currently possible using <form> without opt-in. Anything that
>    is not possible using <form> would require opt-in.
> 3. Modify Access-Control such that it allows a superset of what is
>    possible using <form>.
>
> It seems to me like 2 would add a lot of complexity to the spec, we would 
> have to write rules saying that only when the content type is "text/plain" 
> or "multipart/form-data" then allow. This seems really messy to me and it 
> also makes it even harder to tighten up <form> in the future (something 
> that would *greatly* help CSRF prevention).
>
> Option 3 scares me a lot. For example SOAP servers have been brought up in 
> the past that currently might only be protected by the content type never 
> being "text/xml" from <form>. I have also heard of sites that implement CSRF 
> protection today by submitting their forms using XHR and setting a custom 
> header. They thereby rely on the fact that you can't cross-site POST today 
> while setting custom headers.
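
The rule discussed under option 2 could be sketched roughly as follows.
This is an illustration under stated assumptions, not text from the
draft: the function name is invented, "X-Requested-By" is a made-up
header name, and the set of content types is the three encodings an
HTML <form> can produce (the two named above plus
application/x-www-form-urlencoded).

```python
# Assumption: these are the request shapes an HTML <form> can already produce.
FORM_METHODS = {"GET", "POST"}
FORM_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}


def needs_opt_in(method, content_type=None, custom_headers=()):
    """Return True if a cross-site request goes beyond what <form> allows."""
    method = method.upper()
    if method not in FORM_METHODS:
        return True
    if custom_headers:
        # <form> cannot set arbitrary request headers.
        return True
    if method == "POST":
        # Ignore parameters such as "; charset=utf-8" or "; boundary=...".
        media_type = (content_type or "").split(";")[0].strip().lower()
        return media_type not in FORM_CONTENT_TYPES
    return False


print(needs_opt_in("POST", "text/plain"))
print(needs_opt_in("POST", "text/xml"))
print(needs_opt_in("POST", "text/plain", ["X-Requested-By"]))
```

Under this sketch the text/xml POST and the POST with a custom header
both require opt-in, which are the two cases that option 3 would put at
risk.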
>
> So while I agree 1 is non-ideal, it seems like the least bad option. I'm not 
> that worried that people will simply forget about the current security model 
> of the web, just because we publish the Access-Control spec.
>
>> When seeing questionable benefits on the one hand, and an
>> increasingly messy and incomprehensible design on the other, then
>> I'd rather see us not make this case.
>>
>>>> I was getting at the fact that the recent change proposals are
>>>> moving away from scenarios in which the web application author
>>>> writes a policy in a simple language, and get more and more into a
>>>> scenario in which this policy is broken into pieces that are spread
>>>> across multiple headers.
>>>>
>>>> I think the sane options are either an extension to the policy
>>>> language (so that there is one model to look up), e.g.:
>>>>
>>>> 	allow method post with cookies with http-auth oauth from w3.org \
>>>> 	      except people.w3.org
>>> I'm less concerned about what exact syntax we use than what features we 
>>> are providing. As long as it is easy to understand.
>>
>> I agree on that; the syntax that I put in up there is probably not
>> even consistent with what's in the draft.  It was meant as an
>> example only.
>>
>>> That said, I do want to avoid the situation we had before where
>>> you had to specify in far too great detail your policy, such as
>>> for each allowed site separately list the allowed methods for
>>> that site.
>>
>> Well, *that*, for one, isn't a syntactic question, as it implies
>> that certain policies couldn't be implementable without having to
>> use server-based techniques (enabled thanks to the origin header).
>
> Yes, I do think that people would be able to do server-side enforcement of 
> the headers they only want to allow for some 3rd party sites. My concern is 
> headers that they aren't even aware exist, or ones for which support is 
> added after they deploy Access-Control support.
>
> However I still don't see an important distinction between the header you 
> proposed above and the one I proposed. So if your proposal alleviates 
> people's concerns about complexity then I am fine with that.
>
>>> Other than that I don't think it matters much, it's just a syntactic 
>>> question.
>>>
>>>> ... or a model where we do away with the notion of a policy language
>>>> entirely, and rely on the Access-Control-Origin (or whatever it's
>>>> called this week) and the server making a decision as our main
>>>> mechanism.  This goes back to Mark Nottingham's ideas from January,
>>>> when he suggested that using HTTP Vary headers and Referer-Root was
>>>> indeed enough.
>>>>
>>>> For an unsafe method, the server indicates that it's prepared to
>>>> deal with cross-site requests during the pre-flight phase, and then
>>>> makes its real decision at the time of the request, based on that
>>>> request's Access-Control-Origin.
>>> This still relies on the server being able to deal with the unsafe 
>>> request. I.e. the server is still forced to opt in to nothing, or opt in 
>>> to all possible combinations of http methods and headers.
>>
>> Yes, that assumption is in there.
>
> Which is exactly what my proposal was trying to prevent. That was the whole 
> point of it. If we don't achieve that then my concerns have not been 
> alleviated.
>
>>> So it seems no different from what the spec says today in that regard.
>>
>> The difference is that we throw out the idea of having a complex
>> policy engine of any sort in the client.
>
> How is that different from what we have today in the spec?
>
>>>> For safe methods, the server can just use HTTP error codes.
>>
>>>> While this is indeed a different model, it seems to be the one that a
>>>> lot of the discussion here is edging toward -- e.g., the defenses
>>>> against decoupling the preflight check from the "real" request all
>>>> rely on server-side enforcement of the policy.
>>
>>>> What I'd really like to see us avoid is a scenario in which we're
>>>> creating a mess by mixing the various models to the extent that the
>>>> entire model becomes incomprehensible.
>>
>>> I don't think the model is especially complicated for the server 
>>> administrator, even with the separate headers. It seems to me that the 
>>> number of headers is a poor measurement of the complexity of the spec.
>>
>> I'm not even claiming that the number of headers is a measurement of
>> the complexity of the spec.  What I am, however, saying is that
>> having some things configurable per server, and others globally, is
>> a recipe for confusion (in particular to people who don't need to
>> touch their policies daily), and that a simpler model is preferable
>> over that.
>
> With my proposal you can still rely on the client to enforce different 
> policies per server if you so wish. You can simply send different header 
> values depending on what the (Access-Control-)Origin says. But you are not 
> forced to as with the previous spec.
>
> Or you can simply always opt in to all the headers that you expect to deal 
> with from any server, and then use server-side logic to enforce more complex 
> policies if you so wish.
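
A rough sketch of the two approaches described here, in Python.
Everything in it is an assumption for illustration: the policy table,
the origins, and the function names are invented, and no actual header
names from the draft are used.  The first function stands for what the
server would advertise to the client for a given origin; the second is
server-side enforcement on the real request.

```python
# Hypothetical per-origin policy; origins and values are made up.
POLICY = {
    "http://w3.org": {"methods": {"GET", "POST"}, "headers": {"content-type"}},
    "http://people.w3.org": {"methods": {"GET"}, "headers": set()},
}
NOTHING = {"methods": set(), "headers": set()}


def advertised_policy(origin):
    """What the server tells the client it accepts from this origin."""
    return POLICY.get(origin, NOTHING)


def allow_actual_request(origin, method, header_names):
    """Server-side check on the real request, independent of the client."""
    policy = POLICY.get(origin, NOTHING)
    sent = {name.lower() for name in header_names}
    return method.upper() in policy["methods"] and sent <= policy["headers"]


print(advertised_policy("http://people.w3.org")["methods"])
print(allow_actual_request("http://w3.org", "POST", ["Content-Type"]))
print(allow_actual_request("http://people.w3.org", "POST", []))
```

Either function alone corresponds to one of the options above; using
both means the server still refuses a request when the client-side check
is bypassed.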
>
>>>>> The smaller the portions a server opts in to, the smaller the risk
>>>>> that it accidentally opts in to something and shoots itself
>>>>> in the foot.
>>
>>>> While I sympathize with that notion, I think that the current
>>>> approach (mixing a policy language with headers that possibly need
>>>> to be set differently for different sites) is likely to mess up
>>>> things further and make analysis harder.
>>
>>>> I do think that we serve site authors best by making things simple,
>>>> easy and consistent.  That isn't the case with the model getting
>>>> ever more baroque.  Sorry.
>>
>>> I guess it depends on what you define as "simple". I think of it as the 
>>> fewer things you have to keep track of the simpler it is. I think having 
>>> to keep track of the full set of http features for the server is more than 
>>> making sure to get one or two extra headers right.
>>
>> If you look at information flows (and at your own arguments in
>> another branch of this very thread), then the current model indeed
>> requires you to deal with all kinds of weird behavior, since the
>> ability of a script to exercise possibly arcane and unspecified
>> levers that might control what an HTTP response looks like will
>> translate into the ability of that script to perform cross-site
>> requests.  The more different headers and different code paths are
>> involved with that, the higher the risk that things actually go wrong.
>
> Agreed. Though I assume that by "arcane and unspecified levers" you mean 
> things like random http headers and http methods. Including ones that are 
> totally unrelated to this spec.
>
>> A model that (a) simplifies the language, and (b) minimizes the code
>> paths that can be used to control the server's behavior would seem
>> the more sane one, and would seem to be the one that's easier to
>> control for server administrators, e.g. from filters that they can
>> put in front of their web application.
>
> Agreed. But to achieve (b) I do think we need my proposal since otherwise as 
> soon as someone opts in a 3rd party server can hit all the code paths 
> executed by all possible headers and methods.
>
> / Jonas
>
Received on Friday, 13 June 2008 16:08:23 UTC