Re: Feedback on Access Control from Jonas Sicking on 2008-01-23 (public-appformats@w3.org from January 2008)

From: Jonas Sicking <jonas@sicking.cc>
Date: Tue, 22 Jan 2008 16:23:03 -0800
To: Mark Nottingham <mnot@yahoo-inc.com>
CC: "WAF WG (public)" <public-appformats@w3.org>
Message-ID: <479688E7.3030208@sicking.cc>

Hi Mark,

Thanks for your feedback! Replies inline below, somewhat short due to 
Anne having covered much of it already.

Mark Nottingham wrote:
> I'm going to concentrate on substantive feedback here, avoiding 
> editorial issues for the time being. I was looking at the November WD 
> when I compiled this, but AFAICT all of these issues still apply to the 
> latest ED.
> 
> 1) The Method-Check protocol has potential for bad interactions with the 
> HTTP caching model.

I think this has been covered since we've switched to using OPTIONS, right?

> 2) The Method-Check header allows the client to specify a method to 
> check for. What is the intent here? Is the server allowed (or 
> encouraged) to tailor the content of the Access-Control header based 
> upon its value? The use case for this header is not at all clear.

I believe the idea was to allow enabling only certain methods. Before we 
had this there was only an all-or-nothing approach which seemed bad. 
I.e. in order to allow POST, you were also required to be able to deal 
with DELETE, PUT, etc.
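
As a rough illustration of per-method enabling (not anything from the 
draft), here is a small Python sketch of the server-side decision. The 
ALLOWED_METHODS table, the "/feed" path and the method_allowed() helper 
are all made up for the example.

```python
# Hypothetical per-resource policy: the methods a server is willing to
# accept from other sites. Paths and contents are assumptions.
ALLOWED_METHODS = {
    "/feed": {"GET", "POST"},
}


def method_allowed(path, method):
    """Return True if `method` is enabled for cross-site use on `path`."""
    return method.upper() in ALLOWED_METHODS.get(path, set())
```

With this table, POST to /feed can be enabled without the server also 
having to deal with DELETE or PUT on the same resource.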

> 3) The Method-Check-Expires header creates a secondary expiration 
> mechanism, separate from the HTTP caching model. I'm not convinced of 
> its utility (are there convincing use cases where the access control 
> metadata has a significantly different lifetime from the GET response?), 
> doing so adds complexity to implementations, and the interactions with 
> HTTP caching aren't defined (e.g., what if the response expires before 
> the metadata does? Vice versa?).
> 
> Also, it seems to assume clock sync between the server and the client, 
> which has been proven to be a bad thing to do.
> 
> Overall, this mechanism doesn't seem very well thought out, and I'd 
> recommend its removal.

I really think this is needed in some form or another. Without it we end 
up making cross site POSTs and other methods very chatty as you point 
out further down.

However the syncing thing does seem like a problem. How does normal HTTP 
caching deal with this? Can we modify the format here to make it better? 
How about only allowing delta times to be expressed? Either by requiring 
the time to be expressed in seconds, so that

Method-Check-Expires: 3600

would expire in an hour. Or something like

Method-Check-Expires: 1 hour 10 second

Or some such? The former seems simple and safe enough?
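
A minimal Python sketch of the seconds-only variant, assuming the header 
value is a plain non-negative integer (similar in spirit to the delta 
seconds of Cache-Control: max-age). MethodCheckCache and 
parse_delta_seconds() are hypothetical names. The point is that the 
client only ever uses its own clock, so no sync with the server is needed.

```python
import re
import time


def parse_delta_seconds(value):
    """Parse a non-negative integer number of seconds, or return None."""
    value = value.strip()
    if re.fullmatch(r"[0-9]+", value) is None:
        return None
    return int(value)


class MethodCheckCache:
    """Remembers how long a method check result may be reused, per URL."""

    def __init__(self, clock=time.monotonic):
        # Only the client's local clock is consulted; the server sends a
        # duration, never an absolute date.
        self._clock = clock
        self._deadlines = {}

    def store(self, url, header_value):
        """Record an expiry for `url`; return False if the value is invalid."""
        delta = parse_delta_seconds(header_value)
        if delta is None:
            return False
        self._deadlines[url] = self._clock() + delta
        return True

    def is_fresh(self, url):
        """Return True while the stored result for `url` has not expired."""
        deadline = self._deadlines.get(url)
        return deadline is not None and self._clock() < deadline
```

A value such as "1 hour 10 second" is rejected by this sketch, which is 
part of why the seconds-only form looks simpler to get right.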

> 4) The Access-Control header's syntax uses an unescaped and unquoted 
> comma as an internal delimiter, which would lead to headers like this;
> 
> Access-Control: allow <example.com> method GET
> Access-Control: POST
> Access-Control: PUT, DELETE, deny <example.org> method POST
> Access-Control: GET
> 
> Will clients be able to parse this correctly? Please don't repeat the 
> mistakes of the Set-Cookie header; this is very bad practice. It would 
> be better to leverage existing syntax from other headers; e.g.,
> 
> Access-Control: allow="example.com"; method="GET POST PUT DELETE", 
> deny="example.org"; method="POST GET"

I haven't looked at this part recently, but Anne's suggestion seems good. 
Cookies were notoriously bad since the separator could actually appear 
in the values. I think as long as we provide clear and unambiguous 
parsing rules, browsers should have no problems with this.
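
To show what unambiguous rules could look like, here is a Python sketch 
that parses the quoted syntax from Mark's example above. It is only an 
assumption about how such a grammar might be read (name="value" pairs, 
";" between pairs of one rule, "," between rules), not the draft's 
actual rules, and parse_access_control() is a made-up name.

```python
import re

# One name="value" pair followed by ";" (more pairs for the same rule),
# "," (next rule), or the end of the header value.
_PARAM = re.compile(r'\s*([A-Za-z-]+)="([^"]*)"\s*([;,]|$)')


def parse_access_control(value):
    """Split a header value into a list of rules (dicts of token lists)."""
    rules, current, pos = [], {}, 0
    while pos < len(value):
        match = _PARAM.match(value, pos)
        if match is None:
            raise ValueError(f"unparseable input at offset {pos}")
        name, quoted, separator = match.groups()
        # Values are space-separated token lists inside the quotes.
        current[name.lower()] = quoted.split()
        pos = match.end()
        if separator != ";":
            rules.append(current)
            current = {}
    if current:
        raise ValueError("dangling ';' at end of header value")
    return rules
```

Because values are quoted, a comma inside a value cannot be mistaken for 
the rule separator, which is exactly the problem cookies had.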

> 5) Non-GET access control traffic is much too chatty. If I have an 
> application with a large number of resources, and cross-site non-GET 
> traffic from a client needs to access, say, three of them, that will 
> require an additional three HTTP requests just for access control. Some 
> implementations will likely use different connections for the access 
> control requests if the requests that follow are non-idempotent, further 
> introducing latency (especially for users with limited network access or 
> long hops).
> 
> This will cause sites to boxcar messages and do other tricks to avoid 
> extra roundtrips and the associated latency, and make this mechanism 
> less attractive to those who want to model their services to take full advantage 
> of HTTP ("REST APIs", as they're called in the new use cases section).
> 
> I've raised this concern long ago 
> <http://www.w3.org/mid/64FFD15E-3FE9-433A-9525-D596B4910451@yahoo-inc.com>, 
> and haven't seen any substantive response. Separate from the server-side 
> vs. client-side policy enforcement issue (which I'm not bringing up here 
> explicitly, since it's an open issue AFAICT, although the WG doesn't 
> link to its issues list from its home page), the Working Group needs to 
> motivate the decision to have access control policy only apply on a 
> per-resource basis, rather than per resource tree, or site-wide.
> 
> One additional consequence of this decision is that access control 
> policy for resources that accept non-GET requests will be effectively 
> uncacheable (e.g. in proxies, as well as in user agents); the POST, etc. 
> methods will invalidate any cached GET every time they come through.

Is there anything left here to address after having switched to OPTIONS, 
if we can figure out a safe way to express the Method-Check-Expires header?

> Overall, this approach doesn't seem well-integrated into the Web, or 
> even friendly to it; it's more of a hack, which is puzzling, since it 
> requires clients to change anyway.

Any change will require changes to the browser, right? One goal of the 
design for access control was to avoid having to change both browsers 
*and* servers. See requirement 3.

> 6) As far as I can tell, this mechanism only allows access control on 
> the granularity of an entire referring site; e.g., if I allow 
> example.com to access a particular resource, *any* reference from 
> example.com is allowed to access it.
> 
> If that's the case, this limitation should be explicitly mentioned, and 
> the spec should highlight the security implications of allowing 
> multi-user hosts (e.g., HTML mail sites, picture sharing sites, social 
> networking sites, "mashup" sites) to refer to your data.

This is simply an unfortunate consequence of the way the same-origin 
policy works today. There is no way to allow only example.com/good.html 
to interact with the third party server since it is trivial for 
example.com/evil.html to open good.html and inject content into it such 
that it looks like the requests are coming from good.html.
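
A small Python sketch of why the path can't be part of the check: the 
same-origin comparison only looks at scheme, host and port. The origin() 
and same_origin() helpers and the default port table are assumptions for 
the example, not browser code.

```python
from urllib.parse import urlsplit

# Default ports, so that "http://example.com" and "http://example.com:80"
# compare equal.
_DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple for `url`; the path is ignored."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = parts.hostname or ""
    port = parts.port or _DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a, b):
    """Return True if both URLs share scheme, host and port."""
    return origin(a) == origin(b)
```

Here good.html and evil.html on example.com come out as the same origin, 
so a policy that names only one of them gives no real protection.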

I agree that this could be called out in the spec. Feel free to suggest 
text that makes this clear.

Best Regards,
Jonas Sicking
Received on Wednesday, 23 January 2008 00:24:25 UTC