Feedback on Access Control from Mark Nottingham on 2008-01-22 (public-appformats@w3.org from January 2008)

From: Mark Nottingham <mnot@yahoo-inc.com>
Date: Tue, 22 Jan 2008 14:56:52 +1100
To: "WAF WG (public)" <public-appformats@w3.org>
Message-Id: <8CD9300B-9663-42AD-9E35-65CDF393780A@yahoo-inc.com>

I'm going to concentrate on substantive feedback here, avoiding
editorial issues for the time being. I was looking at the November WD
when I compiled this, but AFAICT all of these issues still apply to
the latest ED.

1) The Method-Check protocol has potential for bad interactions with
the HTTP caching model.

Consider clients C and C', both using a caching intermediary I, and
accessing resources on an origin server S. If C does not support this
extension, it will send a normal GET request for a resource on S,
whose response may be cached on I. S may choose not to send an
Access-Control header for that response, since it wasn't "asked for." If C'
does support this extension, it will retrieve the original response
(intended for C) from I, even though it appended the Method-Check
header to a request, and will be led to believe that the resource on S
doesn't support cross-site requests.
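
The scenario can be sketched as a toy simulation. This is a minimal model, not an implementation of the draft: the class names, the URL, and the header values are hypothetical, and the cache is assumed to key entries on the URL alone.

```python
# Toy model of issue (1): a shared cache that ignores request headers.
# All names and header values are illustrative assumptions.

class Origin:
    """Origin server S: adds Access-Control only when asked via Method-Check."""

    def handle(self, url, request_headers):
        response = {"body": "data for " + url, "headers": {}}
        if "Method-Check" in request_headers:
            response["headers"]["Access-Control"] = "allow <example.com>"
        return response


class NaiveCache:
    """Intermediary I: stores one response per URL, whatever the request headers."""

    def __init__(self, origin):
        self.origin = origin
        self.store = {}

    def get(self, url, request_headers):
        if url not in self.store:
            self.store[url] = self.origin.handle(url, request_headers)
        return self.store[url]


cache = NaiveCache(Origin())
url = "http://s.example/resource"

# Client C (no support for the extension) populates the cache.
first = cache.get(url, {})
# Client C' sends Method-Check but is served the response stored for C.
second = cache.get(url, {"Method-Check": "POST"})

print("Access-Control" in second["headers"])  # False
```

C' sees no Access-Control header and concludes that cross-site access is not supported, even though S would have granted it.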

Three different solutions come to mind immediately:
a) require all responses to carry an Access-Control directive,
whether or not the request contained a Method-Check header, or
b) require all responses to carry a Vary: Method-Check header,
whether or not the request contained a Method-Check header, or
c) remove the Method-Check request header from the protocol, and
require an Access-Control directive in all GET responses.
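
As a rough illustration of option (b), the sketch below shows a cache that builds a secondary key from the request headers named in Vary. The origin function, the URL, and the header values are hypothetical, and header names are compared case-sensitively for brevity, which a real cache would not do.

```python
# Sketch of option (b): every response carries "Vary: Method-Check",
# so the cache stores separate entries per Method-Check request value.

def origin(url, request_headers):
    """Hypothetical origin that always sends Vary: Method-Check."""
    headers = {"Vary": "Method-Check"}
    if "Method-Check" in request_headers:
        headers["Access-Control"] = "allow <example.com>"
    return {"headers": headers}


class VaryCache:
    def __init__(self, fetch):
        self.fetch = fetch   # callable(url, request_headers) -> response dict
        self.vary = {}       # url -> header names from the last Vary seen
        self.store = {}      # (url, selected request values) -> response

    def _key(self, url, request_headers):
        names = self.vary.get(url, ())
        return (url, tuple(request_headers.get(n) for n in names))

    def get(self, url, request_headers):
        key = self._key(url, request_headers)
        if key not in self.store:
            response = self.fetch(url, request_headers)
            names = response["headers"].get("Vary", "").split(",")
            self.vary[url] = tuple(n.strip() for n in names if n.strip())
            # Recompute the key now that the Vary header names are known.
            self.store[self._key(url, request_headers)] = response
        return self.store[self._key(url, request_headers)]


cache = VaryCache(origin)
url = "http://s.example/resource"

plain = cache.get(url, {})                          # client C
checked = cache.get(url, {"Method-Check": "POST"})  # client C'

print("Access-Control" in plain["headers"])    # False
print("Access-Control" in checked["headers"])  # True
```

The cost is that the cache now holds one entry per distinct Method-Check value, which is part of why option (c) looks simpler.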

My preference would be (c), because...

2) The Method-Check header allows the client to specify a method to
check for. What is the intent here? Is the server allowed (or
encouraged) to tailor the content of the Access-Control header based
upon its value? The use case for this header is not at all clear.

3) The Method-Check-Expires header creates a secondary expiration
mechanism, separate from the HTTP caching model. I'm not convinced of
its utility (are there convincing use cases where the access control
metadata has a significantly different lifetime from the GET
response?). It also adds complexity to implementations, and its
interactions with HTTP caching aren't defined (e.g., what if the
response expires before the metadata does? Vice versa?).

Also, it seems to assume clock sync between the server and the client,
which experience has shown to be an unsafe assumption.

Overall, this mechanism doesn't seem very well thought out, and I'd
recommend its removal.
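
The clock-sync concern can be illustrated with a small calculation. This sketch assumes that Method-Check-Expires carries an absolute date that the client compares against its own clock; the timestamps and the 15-minute skew are made up for the example.

```python
from datetime import datetime, timedelta, timezone


def remaining_lifetime(expires, client_now):
    """Lifetime the client computes from an absolute expiry date and its own clock."""
    return expires - client_now


# The server intends the access control metadata to live for 10 minutes.
server_now = datetime(2008, 1, 22, 4, 0, 0, tzinfo=timezone.utc)
intended = timedelta(minutes=10)
expires = server_now + intended  # the absolute date the server would send

# Hypothetical clients whose clocks are 15 minutes off in either direction.
fast_client = server_now + timedelta(minutes=15)
slow_client = server_now - timedelta(minutes=15)

print(remaining_lifetime(expires, fast_client) < timedelta(0))  # True
print(remaining_lifetime(expires, slow_client) > intended)      # True
```

With a fast clock the metadata is already stale on arrival; with a slow clock it is trusted for longer than the server intended. A relative lifetime in seconds would not depend on the client's wall clock in this way.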

4) The Access-Control header's syntax uses an unescaped and unquoted
comma as an internal delimiter, which would lead to headers like this:

Access-Control: allow <example.com> method GET
Access-Control: POST
Access-Control: PUT, DELETE, deny <example.org> method POST
Access-Control: GET

Will clients be able to parse this correctly? Please don't repeat the
mistakes of the Set-Cookie header; this is very bad practice. It would
be better to leverage existing syntax from other headers; e.g.,

Access-Control: allow="example.com"; method="GET POST PUT DELETE",
    deny="example.org"; method="POST GET"
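
To make the parsing concern concrete, here is a minimal sketch. HTTP allows multiple header fields with the same name to be combined into one comma-separated value, so a generic list parser sees the four lines above as a flat list of fragments. The second half parses the suggested alternative; the helper name `parse_rules` is hypothetical, and the plain `split` calls assume that quoted values never contain commas or semicolons.

```python
# The four Access-Control field values from the example above.
lines = [
    "allow <example.com> method GET",
    "POST",
    "PUT, DELETE, deny <example.org> method POST",
    "GET",
]

# Same-named header fields may be combined with commas, so a generic
# parser ends up splitting rule boundaries and method lists alike.
combined = ", ".join(lines)
fragments = [part.strip() for part in combined.split(",")]
orphans = [f for f in fragments if not f.startswith(("allow", "deny"))]
print(len(fragments), len(orphans))  # 6 4


def parse_rules(value):
    """Parse 'name="v"; name="v", name="v"; ...' into a list of dicts.

    Simplification: assumes no ',' or ';' inside quoted strings.
    """
    rules = []
    for element in value.split(","):
        rule = {}
        for param in element.split(";"):
            name, _, quoted = param.strip().partition("=")
            rule[name] = quoted.strip('"')
        rules.append(rule)
    return rules


proposed = ('allow="example.com"; method="GET POST PUT DELETE", '
            'deny="example.org"; method="POST GET"')
rules = parse_rules(proposed)
print(rules[0]["method"].split())  # ['GET', 'POST', 'PUT', 'DELETE']
```

In the second form the comma only ever separates rules, so splitting and recombining header lines cannot change the meaning.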

5) Non-GET access control traffic is much too chatty. If I have an
application with a large number of resources, and cross-site non-GET
traffic from a client needs to access, say, three of them, that will
require an additional three HTTP requests just for access control.
Some implementations will likely use different connections for the
access control requests if the requests that follow are
non-idempotent, introducing further latency (especially for users with
limited network access or long hops).

This will cause sites to boxcar messages and do other tricks to avoid
extra round trips and the associated latency, and will make this mechanism
less attractive to those who want to model their services to take full
advantage of HTTP ("REST APIs", as they're called in the new use cases section).

I've raised this concern long ago
<http://www.w3.org/mid/64FFD15E-3FE9-433A-9525-D596B4910451@yahoo-inc.com>,
and haven't seen any substantive response. Separate from the
server-side vs. client-side policy enforcement issue (which I'm not
bringing up here explicitly, since it's an open issue AFAICT, although
the WG doesn't link to its issues list from its home page), the
Working Group needs to motivate the decision to have access control
policy only apply on a per-resource basis, rather than per resource
tree, or site-wide.
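
A back-of-the-envelope sketch of the difference, under stated assumptions: each non-GET request is preceded by its own access control request in the per-resource model, a hypothetical site-wide policy needs one such request in total, nothing is cached, and every round trip costs a made-up 200 ms.

```python
def round_trips(resource_count, scope):
    """Count HTTP requests needed for one non-GET call to each resource."""
    if scope == "per-resource":
        checks = resource_count           # one access control request each
    elif scope == "site-wide":
        checks = 1 if resource_count else 0  # hypothetical single policy fetch
    else:
        raise ValueError("unknown scope: " + scope)
    return checks + resource_count


RTT_MS = 200  # assumed round-trip time, for illustration only

for scope in ("per-resource", "site-wide"):
    n = round_trips(3, scope)
    print(scope, n, n * RTT_MS)
# per-resource 6 1200
# site-wide 4 800
```

The gap widens linearly with the number of resources touched, which is the pressure toward boxcarring described above.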

One additional consequence of this decision is that access control
policy for resources that accept non-GET requests will be effectively
uncacheable (e.g. in proxies, as well as in user agents); the POST,
etc. methods will invalidate any cached GET every time they come
through.
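
The invalidation effect can be sketched with a toy cache. It assumes the usual HTTP rule that POST, PUT, and DELETE invalidate any cached entry for the same URL; the class and the URL are hypothetical.

```python
# Toy cache showing why the policy for a non-GET resource rarely stays cached.
UNSAFE_METHODS = {"POST", "PUT", "DELETE"}


class InvalidatingCache:
    def __init__(self):
        self.store = {}

    def request(self, method, url):
        if method in UNSAFE_METHODS:
            # An unsafe method passing through invalidates the cached GET.
            self.store.pop(url, None)
            return "forwarded"
        if url in self.store:
            return "hit"
        self.store[url] = "GET response carrying access control policy"
        return "miss"


cache = InvalidatingCache()
url = "http://s.example/orders"

# Each cross-site POST is preceded by an access control GET.
sequence = ["GET", "POST", "GET", "POST", "GET"]
results = [cache.request(method, url) for method in sequence]
print(results)
# ['miss', 'forwarded', 'miss', 'forwarded', 'miss']
```

Every access control GET misses, because the POST that followed the previous one already evicted the entry.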

Overall, this approach doesn't seem well-integrated into the Web, or
even friendly to it; it's more of a hack, which is puzzling, since it
requires clients to change anyway.

6) As far as I can tell, this mechanism only allows access control on
the granularity of an entire referring site; e.g., if I allow
example.com to access a particular resource, *any* reference from
example.com is allowed to access it.

If that's the case, this limitation should be explicitly mentioned,
and the spec should highlight the security implications of allowing
multi-user hosts (e.g., HTML mail sites, picture sharing sites, social
networking sites, "mashup" sites) to refer to your data.
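
A short sketch of the granularity problem, assuming the policy is matched against the referring host only. The function names, user paths, and URLs are hypothetical.

```python
from urllib.parse import urlsplit


def referring_host(url):
    """Reduce a referring URL to its host, discarding path and user area."""
    return urlsplit(url).hostname


def is_allowed(referrer_url, allowed_hosts):
    """Host-level check: every page on an allowed host passes."""
    return referring_host(referrer_url) in allowed_hosts


allowed = {"example.com"}

print(is_allowed("http://example.com/~alice/app.html", allowed))     # True
print(is_allowed("http://example.com/~mallory/evil.html", allowed))  # True
print(is_allowed("http://example.org/page.html", allowed))           # False
```

On a multi-user host, the page the resource owner meant to trust and a page uploaded by any other user are indistinguishable to this check.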

Also, section 4.1 contains "http://example.org/example" as a sample
access item; at best this is misleading, and it doesn't appear to be
allowed by the syntax either.

That's all for now,

--
Mark Nottingham mnot@yahoo-inc.com

Received on Tuesday, 22 January 2008 03:57:13 UTC