Re: [AC] Helping server admins not making mistakes from Thomas Roessler on 2008-06-11 (public-webapps@w3.org from April to June 2008)

From: Thomas Roessler <tlr@w3.org>
Date: Wed, 11 Jun 2008 12:48:04 +0200
To: Jonas Sicking <jonas@sicking.cc>
Cc: "WAF WG (public)" <public-appformats@w3.org>, public-webapps@w3.org
Message-ID: <20080611104804.GI860@iCoaster.does-not-exist.org>

On 2008-06-10 16:41:41 -0700, Jonas Sicking wrote:

>>> Getting access to a user's cookie information is no small task.
>>
>> I disagree.  There are any number of real-world scenarios in which
>> cookies are regularly leaked - JavaScript that's loaded from
>> untrusted sources, and captive portals are just two examples which
>> make people bleed cookies.  Basing the design here on the premise
>> that cookie-based authentication should somehow be enough to protect
>> against cross-site request forgery strikes me as unwise, in
>> particular when the cost is in additional complexity (and therefore
>> risk).
>
> Well, if you can get access to a user's cookies and auth information then
> nothing that we do here matters at all. Or at least matters to a much much 
> smaller extent. This whole spec is basically here precisely to protect the 
> information that is protected by cookies and auth headers (and for most 
> sites only cookies).

Oops, I let your remark about "getting access to a user's cookie
information" above lead me on a bogus path of argument.  That's what
I get for doing e-mail on a train. :-)

The fundamental point is that (a) GET requests and (b) some POST
requests can be made with the user's cookies.  As far as web
application development is concerned, it's probably a really good
idea to consider the cross-site request horse (as opposed to the
cross-site information access horse) to have left the barn (even if
it hasn't left the barn totally).

Now, the working group evidently doesn't want to make that
assumption for generic POST requests; that's why we have the
pre-flight check.  That's fine with me.

However, we're now getting to a point at which we try to prevent the
occurrence of certain cross-site requests, specifically with cookies.
The cost here is complexity in the protocol design and
implementation (which translates into complexity for the deployer).

The benefit is questionable at best, in particular if you consider
POST: There are some POST requests that, under the proposed model,
couldn't be made using XMLHttpRequest, but could be made using
form.submit().  Do you really expect anybody to still understand
that?

When seeing questionable benefits on the one hand, and an
increasingly messy and incomprehensible design on the other, then
I'd rather see us not make this case.

>> I was getting at the fact that the recent change proposals are
>> moving away from scenarios in which the web application author
>> writes a policy in a simple language, and get more and more into a
>> scenario in which this policy is broken into pieces that are spread
>> across multiple headers.
>>
>> I think the sane options are either an extension to the policy
>> language (so that there is one model to look up), e.g.:
>>
>> 	allow method post with cookies with http-auth oauth from w3.org \
>> 	      except people.w3.org
>
> I'm less concerned about what exact syntax we use than what features we are 
> providing. As long as it is easy to understand.

I agree on that; the syntax that I put in up there is probably not
even consistent with what's in the draft.  It was meant as an
example only.
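
To make the intent of that example rule concrete, here is a minimal
sketch of how an "allow ... from w3.org except people.w3.org" rule
might be evaluated against a requesting host.  The function names and
the subdomain-matching semantics are assumptions made for
illustration; neither is taken from the draft.

```python
from typing import Iterable


def host_matches(host: str, pattern: str) -> bool:
    """True if host equals pattern or is a subdomain of it (assumed semantics)."""
    host, pattern = host.lower(), pattern.lower()
    return host == pattern or host.endswith("." + pattern)


def rule_allows(host: str, allow: Iterable[str], deny: Iterable[str]) -> bool:
    """Evaluate one hypothetical 'allow ... from A except B' rule for a host."""
    if any(host_matches(host, d) for d in deny):
        return False  # an "except" entry overrides the allow list
    return any(host_matches(host, a) for a in allow)
```

With allow set to ["w3.org"] and deny set to ["people.w3.org"], a
request from www.w3.org would pass and one from people.w3.org would
not.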

> That said, I do want to avoid the situation we had before where
> you had to specify in far too great detail your policy, such as
> for each allowed site separately list the allowed methods for
> that site.

Well, *that*, for one, isn't a syntactic question, as it implies
that certain policies couldn't be implemented without resorting to
server-based techniques (enabled thanks to the origin header).

> Other than that I don't think it matters much, it's just a syntactic 
> question.
>
>> ... or a model where we do away with the notion of a policy language
>> entirely, and rely on the Access-Control-Origin (or whatever it's
>> called this week) and the server making a decision as our main
>> mechanism.  This goes back to Mark Nottingham's ideas from January,
>> when he suggested that using HTTP Vary headers and Referer-Root was
>> indeed enough.
>>
>> For an unsafe method, the server indicates that it's prepared to
>> deal with cross-site requests during the pre-flight phase, and then
>> makes its real decision at the time of the request, based on that
>> request's Access-Control-Origin.
>
> This still relies on the server being able to deal with the unsafe
> request. I.e. the server is still forced to opt in to nothing, or opt in to
> all possible combinations of HTTP methods and headers.

Yes, that assumption is in there.

> So it seems no different from what the spec says today in that regard.

The difference is that we throw out the idea of having a complex
policy engine of any sort in the client.
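
A rough sketch of that division of labour, under stated assumptions:
the pre-flight phase only establishes that the server handles
cross-site requests at all, and the real decision is made on the
server from the origin value sent with the actual request.  The
names SAFE_METHODS, TRUSTED_ORIGINS, needs_preflight and
status_for_request are hypothetical, as is the choice to treat a
missing origin value as a same-site request.

```python
from typing import Optional

SAFE_METHODS = {"GET", "HEAD"}        # assumption: methods treated as safe
TRUSTED_ORIGINS = {"http://w3.org"}   # hypothetical server-side allow list


def needs_preflight(method: str) -> bool:
    """Only unsafe methods go through the pre-flight phase in this sketch."""
    return method.upper() not in SAFE_METHODS


def status_for_request(origin: Optional[str]) -> int:
    """Decide on the server, at request time, from the request's origin value.

    Returns an HTTP status code: 200 to proceed, 403 to refuse.
    """
    if origin is None:
        return 200  # assumption: no origin value means a same-site request
    return 200 if origin in TRUSTED_ORIGINS else 403
```

The client needs no policy engine here: it sends the origin value and
honours the status code it gets back.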

>> For safe methods, the server can just use HTTP error codes.

>> While this is indeed a different model, it seems to be the one that a
>> lot of the discussion here is edging toward -- e.g., the defenses
>> against decoupling the preflight check from the "real" request all
>> rely on server-side enforcement of the policy.

>> What I'd really like to see us avoid is a scenario in which we're
>> creating a mess by mixing the various models to the extent that the
>> entire model becomes incomprehensible.

> I don't think the model is especially complicated for the server 
> administrator, even with the separate headers. It seems to me that the 
> number of headers is a poor measurement of the complexity of the spec.

I'm not even claiming that the number of headers is a measurement of
the complexity of the spec.  What I am, however, saying is that
having some things configurable per server, and others globally, is
a recipe for confusion (in particular to people who don't need to
touch their policies daily), and that a simpler model is preferable
to that.

>>> The smaller the portions a server opts in to, the smaller the risk
>>> that it accidentally opts in to something with which it
>>> shoots itself in the foot.

>> While I sympathize with that notion, I think that the current
>> approach (mixing a policy language with headers that possibly need
>> to be set differently for different sites) is likely to mess up
>> things further and make analysis harder.

>> I do think that we serve site authors best by making things simple,
>> easy and consistent.  That isn't the case with the model getting
>> ever more baroque.  Sorry.

> I guess it depends on what you define as "simple". I think of it as the 
> fewer things you have to keep track of the simpler it is. I think having to 
> keep track of the full set of http features for the server is more than 
> making sure to get one or two extra headers right.

If you look at information flows (and at your own arguments in
another branch of this very thread), then the current model indeed
requires you to deal with all kinds of weird behavior, since the
ability of a script to exercise possibly arcane and unspecified
levers that might control what an HTTP response looks like will
translate into the ability of that script to perform cross-site
requests.  The more different headers and different code paths are
involved with that, the higher the risk that things actually go wrong.

A model that (a) simplifies the language, and (b) minimizes the code
paths that can be used to control the server's behavior would seem
the more sane one, and would seem to be the one that's easier to
control for server administrators, e.g. from filters that they can
put in front of their web application.
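
As an illustration of such a filter, here is a minimal sketch written
as WSGI middleware.  It is not part of any draft: the function name
origin_filter is hypothetical, and the environ key assumes the header
is called Access-Control-Origin as in this thread.

```python
def origin_filter(app, trusted_origins, header="HTTP_ACCESS_CONTROL_ORIGIN"):
    """Wrap a WSGI app so requests from untrusted origins never reach it."""

    def filtered(environ, start_response):
        origin = environ.get(header)
        # Single decision point: refuse any request that names an origin
        # outside the trusted set; everything else is passed through.
        if origin is not None and origin not in trusted_origins:
            body = b"cross-site request refused\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        return app(environ, start_response)

    return filtered
```

An administrator could wrap an existing application with
origin_filter(app, {"http://w3.org"}) without touching the
application's own code paths.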

Cheers,
-- 
Thomas Roessler, W3C  <tlr@w3.org>
Received on Wednesday, 11 June 2008 10:50:43 UTC