Feedback on Access Control

I'm going to concentrate on substantive feedback here, avoiding  
editorial issues for the time being. I was looking at the November WD  
when I compiled this, but AFAICT all of these issues still apply to  
the latest ED.

1) The Method-Check protocol has potential for bad interactions with  
the HTTP caching model.

Consider clients C and C', both using a caching intermediary I, and  
accessing resources on an origin server S. If C does not support this  
extension, it will send a normal GET request for a resource on S,  
whose response may be cached on I. S may choose to not send an Access- 
Control header for that response, since it wasn't "asked for." If C'  
does support this extension, it will retrieve the original response  
(intended for C) from I, even though it appended the Method-Check  
header to a request, and will be led to believe that the resource on S  
doesn't support cross-site requests.

Three different solutions come to mind immediately:
   a) require all responses to carry an Access-Control directive,  
whether or not the request contained a Method-Check header, or
   b) require all responses to carry a Vary: Method-Check header,  
whether or not the request contained a Method-Check header, or
   c) remove the Method-Check request header from the protocol, and  
require an Access-Control directive in all GET responses.
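
To make the scenario and option (b) concrete, here is a minimal sketch of a toy intermediary. It is not a real HTTP cache: the function and class names are hypothetical, header names are matched case-sensitively for brevity, and freshness is ignored. The cache selects a stored response by URL plus whichever request headers the stored response names in Vary.

```python
def origin_plain(req_headers):
    """Hypothetical S: adds Access-Control only when Method-Check is present."""
    resp = {"body": "data"}
    if "Method-Check" in req_headers:
        resp["Access-Control"] = "allow <example.com> method GET"
    return resp

def origin_vary(req_headers):
    """Same as origin_plain, but every response carries Vary (option b)."""
    resp = origin_plain(req_headers)
    resp["Vary"] = "Method-Check"
    return resp

class Cache:
    """Toy intermediary I: keys on URL plus the request headers named in Vary."""
    def __init__(self, origin):
        self.origin = origin
        self.store = {}  # url -> list of (selecting header values, response)

    def get(self, url, req_headers):
        for selected, resp in self.store.get(url, []):
            # With no Vary header, 'selected' is empty and any request matches.
            if all(req_headers.get(n) == v for n, v in selected.items()):
                return resp
        resp = self.origin(req_headers)
        names = [n.strip() for n in resp.get("Vary", "").split(",") if n.strip()]
        selected = {n: req_headers.get(n) for n in names}
        self.store.setdefault(url, []).append((selected, resp))
        return resp

def second_client_sees_policy(origin):
    cache = Cache(origin)
    cache.get("/resource", {})                              # C, no extension
    resp = cache.get("/resource", {"Method-Check": "GET"})  # C', with extension
    return "Access-Control" in resp

print(second_client_sees_policy(origin_plain))  # C' is served C's response
print(second_client_sees_policy(origin_vary))   # C' gets its own variant
```

Without Vary, C' is handed the response stored for C and never sees the policy; with Vary on every response, the two requests select different variants.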

My preference would be (c), because...

2) The Method-Check header allows the client to specify a method to  
check for. What is the intent here? Is the server allowed (or  
encouraged) to tailor the content of the Access-Control header based  
upon its value? The use case for this header is not at all clear.

3) The Method-Check-Expires header creates a secondary expiration  
mechanism, separate from the HTTP caching model. I'm not convinced of  
its utility (are there convincing use cases where the access control  
metadata has a significantly different lifetime from the GET  
response?). It also adds complexity to implementations, and its  
interactions with HTTP caching aren't defined (e.g., what if the  
response expires before the metadata does? Vice versa?).

Also, it seems to assume clock sync between the server and the client,  
which has been proven to be a bad thing to do.
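
The clock sync concern can be illustrated with a small sketch. It assumes that Method-Check-Expires carries an absolute date, and contrasts that with a relative lifetime counted from when the client received the response. The function names, the ten minute skew and the five minute lifetime are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

def is_fresh_absolute(expires_at, client_now):
    """Freshness from an absolute expiry date set by the server's clock."""
    return client_now < expires_at

def is_fresh_relative(max_age, received_at, client_now):
    """Freshness from a relative lifetime measured on the client's clock only."""
    return client_now - received_at < max_age

server_now = datetime(2008, 1, 22, 4, 0, 0, tzinfo=timezone.utc)
client_now = server_now + timedelta(minutes=10)  # client clock runs 10 minutes fast
lifetime = timedelta(minutes=5)

# The server stamps an expiry five minutes ahead by its own clock; the
# client, reading its own clock, thinks that moment has already passed.
fresh_abs = is_fresh_absolute(server_now + lifetime, client_now)

# A relative lifetime never compares the two clocks with each other.
fresh_rel = is_fresh_relative(lifetime, received_at=client_now, client_now=client_now)

print(fresh_abs, fresh_rel)
```

In this sketch the absolute form treats the metadata as stale the moment it arrives, while the relative form is unaffected by the skew.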

Overall, this mechanism doesn't seem very well thought out, and I'd  
recommend its removal.

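4) The Access-Control header's syntax uses an unescaped and unquoted  
comma as an internal delimiter. Since HTTP treats a comma-separated  
header value as equivalent to multiple header fields with the same  
name, this could lead to headers like this: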

Access-Control: allow <example.com> method GET
Access-Control: POST
Access-Control: PUT, DELETE, deny <example.org> method POST
Access-Control: GET

Will clients be able to parse this correctly? Please don't repeat the  
mistakes of the Set-Cookie header; this is very bad practice. It would  
be better to leverage existing syntax from other headers; e.g.,

Access-Control: allow="example.com"; method="GET POST PUT DELETE",  
deny="example.org"; method="POST GET"
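
As a rough sketch of the difference, the code below first recombines the four header lines shown earlier and splits them on commas, then parses the suggested alternative. The parser is hypothetical and only handles the exact shape of the example above (name="value" parameters, no commas or semicolons inside quotes); it is not a grammar for either syntax.

```python
import re

# Recombining the split fields and splitting on "," leaves bare method
# names that have lost the rule they belonged to.
lines = [
    "allow <example.com> method GET",
    "POST",
    "PUT, DELETE, deny <example.org> method POST",
    "GET",
]
combined = ", ".join(lines)
fragments = [f.strip() for f in combined.split(",")]
bare = [f for f in fragments if not f.startswith(("allow", "deny"))]

PARAM = re.compile(r'\s*([A-Za-z-]+)="([^"]*)"\s*')

def parse_access_control(value):
    """Parse the suggested syntax: rules split on ',', parameters on ';'."""
    rules = []
    for element in value.split(","):
        rule = {}
        for param in element.split(";"):
            match = PARAM.fullmatch(param)
            if match is None:
                raise ValueError("unparseable parameter: %r" % param)
            name, val = match.groups()
            rule[name] = val.split() if name == "method" else val
        rules.append(rule)
    return rules

one_field = ('allow="example.com"; method="GET POST PUT DELETE", '
             'deny="example.org"; method="POST GET"')
two_fields = ['allow="example.com"; method="GET POST PUT DELETE"',
              'deny="example.org"; method="POST GET"']

rules = parse_access_control(one_field)
rules_recombined = parse_access_control(", ".join(two_fields))
print(bare)
print(rules)
```

Because each comma-separated element in the suggested form is a complete rule, it parses the same whether it arrives as one field or as several.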

5) Non-GET access control traffic is much too chatty. If I have an  
application with a large number of resources, and cross-site non-GET  
traffic from a client needs to access, say, three of them, that will  
require an additional three HTTP requests just for access control.  
Some implementations will likely use different connections for the  
access control requests if the requests that follow are non- 
idempotent, further introducing latency (especially for users with  
limited network access or long hops).

This will cause sites to boxcar messages and do other tricks to avoid  
extra roundtrips and the associated latency, and make this mechanism  
less attractive to those who want to model their services to take full  
advantage of HTTP ("REST APIs", as they're called in the new use cases  
section).

I raised this concern long ago  
<http://www.w3.org/mid/64FFD15E-3FE9-433A-9525-D596B4910451@yahoo-inc.com>,  
and haven't seen any substantive response. Separate from the  
server-side vs. client-side policy enforcement issue (which I'm not  
bringing up here explicitly, since it's an open issue AFAICT, although  
the WG doesn't link to its issues list from its home page), the  
Working Group needs to motivate the decision to have access control  
policy only apply on a per-resource basis, rather than per resource  
tree, or site-wide.
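
The request counts behind this concern are simple arithmetic, sketched below. The site-wide figure assumes a single policy fetch would cover every resource on the site, which is an assumption about how such an alternative might work rather than anything in the draft; the function name is hypothetical.

```python
def requests_needed(resources, policy_scope):
    """Count HTTP requests for one non-GET request to each resource."""
    if policy_scope == "per-resource":
        checks = resources  # one access control request per resource
    elif policy_scope == "site-wide":
        checks = 1          # assumed: one policy fetch covers the whole site
    else:
        raise ValueError(policy_scope)
    return checks + resources

per_resource = requests_needed(3, "per-resource")
site_wide = requests_needed(3, "site-wide")
print(per_resource, site_wide)
```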

One additional consequence of this decision is that access control  
policy for resources that accept non-GET requests will be effectively  
uncacheable (e.g. in proxies, as well as in user agents); the POST,  
etc. methods will invalidate any cached GET every time they come  
through.
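
A minimal sketch of that invalidation effect, using a toy cache that stores GET responses by URL and drops the entry whenever a non-GET request for the same URL passes through. The class name and the "/orders" URL are hypothetical, and expiry is ignored.

```python
class InvalidatingCache:
    """Toy cache: GET responses are stored; other methods invalidate them."""
    def __init__(self):
        self.cached = set()
        self.origin_gets = 0  # GETs that had to go to the origin server

    def request(self, method, url):
        if method == "GET":
            if url not in self.cached:
                self.origin_gets += 1
                self.cached.add(url)
        else:
            # POST, PUT, DELETE etc. invalidate the stored GET response.
            self.cached.discard(url)

cache = InvalidatingCache()
for _ in range(3):
    cache.request("GET", "/orders")   # access control check
    cache.request("POST", "/orders")  # the actual cross-site request
print(cache.origin_gets)
```

Every check misses the cache in this sketch, because the POST that follows each check removes the entry the next check would have used.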

Overall, this approach doesn't seem well-integrated into the Web, or  
even friendly to it; it's more of a hack, which is puzzling, since it  
requires clients to change anyway.

6) As far as I can tell, this mechanism only allows access control on  
the granularity of an entire referring site; e.g., if I allow  
example.com to access a particular resource, *any* reference from  
example.com is allowed to access it.

If that's the case, this limitation should be explicitly mentioned,  
and the spec should highlight the security implications of allowing  
multi-user hosts (e.g., HTML mail sites, picture sharing sites, social  
networking sites, "mashup" sites) to refer to your data.
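
To illustrate the granularity point, here is a sketch of a check that, as I read the draft, considers only the referring host. The function name and the two user URLs are invented for the example.

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"example.com"}

def is_allowed(referring_url):
    """Host-only check: path and page author play no part in the decision."""
    return urlsplit(referring_url).hostname in ALLOWED_HOSTS

alice = is_allowed("http://example.com/~alice/trusted-app.html")
mallory = is_allowed("http://example.com/~mallory/evil.html")
print(alice, mallory)
```

Both pages pass, since nothing below the host name is part of the decision.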

Also, section 4.1 contains "http://example.org/example" as a sample  
access item; at best this is misleading, and it doesn't appear to be  
allowed by the syntax either.

That's all for now,

--
Mark Nottingham       mnot@yahoo-inc.com

Received on Tuesday, 22 January 2008 03:57:13 UTC