Re: Feedback on Access Control

On Tue, 22 Jan 2008 04:56:52 +0100, Mark Nottingham <mnot@yahoo-inc.com>  
wrote:
> 1) The Method-Check protocol has potential for bad interactions with the  
> HTTP caching model.
>
> Consider clients C and C', both using a caching intermediary I, and  
> accessing resources on an origin server S. If C does not support this  
> extension, it will send a normal GET request for a resource on S, whose  
> response may be cached on I. S may choose to not send an Access-Control  
> header for that response, since it wasn't "asked for." If C' does  
> support this extension, it will retrieve the original response (intended  
> for C) from I, even though it appended the Method-Check header to a  
> request, and will be led to believe that the resource on S doesn't  
> support cross-site requests.
>
> Three different solutions come to mind immediately;
>    a) require all responses to carry an Access-Control directive,  
> whether or not the request contained a Method-Check header, or
>    b) require all responses to carry a Vary: Method-Check header,  
> whether or not the request contained a Method-Check header, or
>    c) remove the Method-Check request header from the protocol, and  
> require an Access-Control directive in all GET responses.
>
> My preference would be (c), because...

I don't understand this comment. The Method-Check HTTP header is only
used on the authorization request, and that request uses the OPTIONS
HTTP method rather than GET, so it is never answered with a cached GET
response in the first place.
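
To make that concrete, here is a rough sketch of the exchange as I
understand the current draft (header values are illustrative, and the
exact Referer-Root value syntax is my assumption, not a quote from the
spec):

  OPTIONS /resource HTTP/1.1
  Host: example.com
  Referer-Root: http://example.org
  Method-Check: POST

  HTTP/1.1 200 OK
  Access-Control: allow <example.org> method POST

Because the check travels as OPTIONS, it cannot collide with a GET
response that an intermediary has cached for the same URI.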


> 2) The Method-Check header allows the client to specify a method to  
> check for. What is the intent here? Is the server allowed (or  
> encouraged) to tailor the content of the Access-Control header based  
> upon its value? The use case for this header is not at all clear.

It's additional information. It could in theory be dropped, but people
asked for it to be kept, since giving the server more information about
what is going to happen can't hurt.


> 3) The Method-Check-Expires header creates a secondary expiration  
> mechanism, separate from the HTTP caching model. I'm not convinced of  
> its utility (are there convincing use cases where the access control  
> metadata has a significantly different lifetime from the GET response?),  
> doing so adds complexity to implementations, and the interactions with  
> HTTP caching aren't defined (e.g., what if the response expires before  
> the metadata does? Vice versa?).
>
> Also, it seems to assume clock sync between the server and the client,  
> which has been proven to be a bad thing to do.
>
> Overall, this mechanism doesn't seem very well thought out, and I'd  
> recommend its removal.

This cache is there to ensure that you don't have to make the
authorization request over and over again. (Remember that authorization
requests use the OPTIONS HTTP method.) The cache is keyed on the
Referer-Root and the request URI.
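
To illustrate the keying, this is roughly what a user agent would
remember after a successful authorization request (the entry layout is
my own sketch, not something the draft prescribes):

  key:     Referer-Root = http://example.org
           request URI  = http://example.com/resource
  value:   the Access-Control policy from the OPTIONS response
  expires: as indicated by Method-Check-Expires, if present

A later cross-site non-GET request from the same Referer-Root to the
same URI can then reuse this entry instead of issuing another OPTIONS
request.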


> 4) The Access-Control header's syntax uses an unescaped and unquoted  
> comma as an internal delimiter, which would lead to headers like this;
>
> Access-Control: allow <example.com> method GET
> Access-Control: POST
> Access-Control: PUT, DELETE, deny <example.org> method POST
> Access-Control: GET
>
> Will clients be able to parse this correctly? Please don't repeat the  
> mistakes of the Set-Cookie header; this is very bad practice. It would  
> be better to leverage existing syntax from other headers; e.g.,
>
> Access-Control: allow="example.com"; method="GET POST PUT DELETE",  
> deny="example.org"; method="POST GET"

Good point. Is the rest of the WG ok with changing this? Jonas?


> 5) Non-GET access control traffic is much too chatty. If I have an  
> application with a large number of resources, and cross-site non-GET  
> traffic from a client needs to access, say, three of them, that will  
> require an additional three HTTP requests just for access control. Some  
> implementations will likely use different connections for the access  
> control requests if the requests that follow are non-idempotent, further  
> introducing latency (especially for users with limited network access or  
> long hops).

This is exactly why the result of the authorization request is cached...
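
For example, with that cache in place, repeated cross-site POSTs from
the same Referer-Root to the same resource only pay the extra round
trip once (a hypothetical trace, not taken from the draft):

  1. OPTIONS /resource  (Method-Check: POST) -> authorization granted
  2. POST /resource     -> no new OPTIONS; cached entry reused
  3. POST /resource     -> no new OPTIONS; cached entry reused

The remaining cost is one OPTIONS request per (Referer-Root, URI) pair
per cache lifetime.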


> [...] Separate from the server-side vs. client-side policy enforcement  
> issue (which I'm not bringing up here explicitly, since it's an open  
> issue AFAICT, although the WG doesn't link to its issues list from its  
> home page), the Working Group needs to motivate the decision to have  
> access control policy only apply on a per-resource basis, rather than  
> per resource tree, or site-wide.

It's not an open issue.


> One additional consequence of this decision is that access control  
> policy for resources that accept non-GET requests will be effectively  
> uncacheable (e.g. in proxies, as well as in user agents); the POST, etc.  
> methods will invalidate any cached GET every time they come through.

GET is not used here; for non-GET requests the authorization check is
done with OPTIONS, so there is no cached GET response to invalidate.


> Overall, this approach doesn't seem well-integrated into the Web, or  
> even friendly to it; it's more of a hack, which is puzzling, since it  
> requires clients to change anyway.

I don't really understand this. Changing clients is cheap compared to  
changing all the servers out there.


> 6) As far as I can tell, this mechanism only allows access control on  
> the granularity of an entire referring site; e.g., if I allow  
> example.com to access a particular resource, *any* reference from  
> example.com is allowed to access it.
>
> If that's the case, this limitation should be explicitly mentioned, and  
> the spec should highlight the security implications of allowing  
> multi-user hosts (e.g., HTML mail sites, picture sharing sites, social  
> networking sites, "mashup" sites) to refer to your data.
>
> Also, section 4.1 contains "http://example.org/example" as a sample  
> access item; at best this is misleading, and it doesn't appear to be  
> allowed by the syntax either.
>
> That's all for now,

Multi-user hosts already need filtering. Otherwise one of their users
could simply load a page from the same domain with a different path in
an <iframe> or something and make the request from there. The security
model of the Web is based around domains, however fortunate or
unfortunate that may be.


-- 
Anne van Kesteren
<http://annevankesteren.nl/>
<http://www.opera.com/>

Received on Tuesday, 22 January 2008 09:56:12 UTC