- From: Mark Nottingham <mnot@yahoo-inc.com>
- Date: Tue, 22 Jan 2008 14:56:52 +1100
- To: "WAF WG (public)" <public-appformats@w3.org>
I'm going to concentrate on substantive feedback here, avoiding editorial issues for the time being. I was looking at the November WD when I compiled this, but AFAICT all of these issues still apply to the latest ED.

1) The Method-Check protocol has potential for bad interactions with the HTTP caching model. Consider clients C and C', both using a caching intermediary I, and accessing resources on an origin server S.

If C does not support this extension, it will send a normal GET request for a resource on S, whose response may be cached on I. S may choose to not send an Access-Control header for that response, since it wasn't "asked for."

If C' does support this extension, it will retrieve the original response (intended for C) from I, even though it appended the Method-Check header to a request, and will be led to believe that the resource on S doesn't support cross-site requests.

Three different solutions come to mind immediately;
a) require all responses to carry an Access-Control directive, whether or not the request contained a Method-Check header, or
b) require all responses to carry a Vary: Method-Check header, whether or not the request contained a Method-Check header, or
c) remove the Method-Check request header from the protocol, and require an Access-Control directive in all GET responses.

My preference would be (c), because...

2) The Method-Check header allows the client to specify a method to check for. What is the intent here? Is the server allowed (or encouraged) to tailor the content of the Access-Control header based upon its value? The use case for this header is not at all clear.

3) The Method-Check-Expires header creates a secondary expiration mechanism, separate from the HTTP caching model.
I'm not convinced of its utility (are there convincing use cases where the access control metadata has a significantly different lifetime from the GET response?). It adds complexity to implementations, and the interactions with HTTP caching aren't defined (e.g., what if the response expires before the metadata does? Vice versa?). Also, it seems to assume clock sync between the server and the client, which has been proven to be a bad thing to do. Overall, this mechanism doesn't seem very well thought out, and I'd recommend its removal.

4) The Access-Control header's syntax uses an unescaped and unquoted comma as an internal delimiter, which would lead to headers like this;

  Access-Control: allow <example.com> method GET
  Access-Control: POST
  Access-Control: PUT, DELETE, deny <example.org> method POST
  Access-Control: GET

Will clients be able to parse this correctly? Please don't repeat the mistakes of the Set-Cookie header; this is very bad practice. It would be better to leverage existing syntax from other headers; e.g.,

  Access-Control: allow="example.com"; method="GET POST PUT DELETE", deny="example.org"; method="POST GET"

5) Non-GET access control traffic is much too chatty. If I have an application with a large number of resources, and cross-site non-GET traffic from a client needs to access, say, three of them, that will require an additional three HTTP requests just for access control. Some implementations will likely use different connections for the access control requests if the requests that follow are non-idempotent, further introducing latency (especially for users with limited network access or long hops).

This will cause sites to boxcar messages and do other tricks to avoid extra roundtrips and the associated latency, and make this mechanism less attractive to those who want to model their services to take full advantage of HTTP ("REST APIs", as they're called in the new use cases section).
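As a rough illustration of that cost, here is a minimal sketch of the request counts. It assumes one access-control request per distinct resource before its first non-GET request, as described above, and compares that with a hypothetical policy whose scope is wider than a single resource. The function name and resource paths are made up for the example and don't come from the draft.

```python
def total_requests(non_get_targets, per_resource_check=True):
    """Count HTTP requests needed to send a list of cross-site non-GET requests.

    Assumption: with per-resource policy, each distinct target needs one extra
    access-control request before its first non-GET request. With a wider
    (hypothetical) policy scope, a single check covers every target.
    """
    distinct = set(non_get_targets)
    checks = len(distinct) if per_resource_check else min(1, len(distinct))
    return checks + len(non_get_targets)


targets = ["/orders/1", "/orders/2", "/cart"]  # hypothetical resources

per_resource = total_requests(targets)                          # 3 checks + 3 requests
wider_scope = total_requests(targets, per_resource_check=False)  # 1 check + 3 requests
print(per_resource, wider_scope)
```

The gap grows linearly with the number of distinct resources touched, which is what pushes sites toward boxcarring.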
I've raised this concern long ago <http://www.w3.org/mid/64FFD15E-3FE9-433A-9525-D596B4910451@yahoo-inc.com>, and haven't seen any substantive response.

Separate from the server-side vs. client-side policy enforcement issue (which I'm not bringing up here explicitly, since it's an open issue AFAICT, although the WG doesn't link to its issues list from its home page), the Working Group needs to motivate the decision to have access control policy only apply on a per-resource basis, rather than per resource tree, or site-wide.

One additional consequence of this decision is that access control policy for resources that accept non-GET requests will be effectively uncacheable (e.g. in proxies, as well as in user agents); the POST, etc. methods will invalidate any cached GET every time they come through.

Overall, this approach doesn't seem well-integrated into the Web, or even friendly to it; it's more of a hack, which is puzzling, since it requires clients to change anyway.

6) As far as I can tell, this mechanism only allows access control on the granularity of an entire referring site; e.g., if I allow example.com to access a particular resource, *any* reference from example.com is allowed to access it.

If that's the case, this limitation should be explicitly mentioned, and the spec should highlight the security implications of allowing multi-user hosts (e.g., HTML mail sites, picture sharing sites, social networking sites, "mashup" sites) to refer to your data.

Also, section 4.1 contains "http://example.org/example" as a sample access item; at best this is misleading, and it doesn't appear to be allowed by the syntax either.

That's all for now,

--
Mark Nottingham mnot@yahoo-inc.com
Received on Tuesday, 22 January 2008 03:57:13 UTC