- From: Thomas Roessler <tlr@w3.org>
- Date: Wed, 23 Jan 2008 19:21:10 +0100
- To: public-appformats@w3.org
My notes about last night's editor's draft...

- The abstract is somewhat hard to understand. Also, it mis-characterizes the document somewhat, as this is not about "enabling client-side cross-site requests", but about a mechanism by which web applications can communicate policies about cross-site requests and data sharing to other web applications.

- Introduction: I find much of the introduction chapter somewhat disorganized. I'd like the document to start out by saying rather precisely what's going on, along these lines:

  Web application technologies commonly apply same-origin restrictions to network requests. These restrictions prevent a web application running from one origin from obtaining data retrieved from another origin, and also limit the unsafe HTTP requests that can be automatically launched toward destinations that differ from the running application's origin.

  In Web application technologies that follow this pattern, network requests typically use ambient authentication and session management information, including HTTP authentication and cookie information.

  This specification extends this model in several ways:

  * Web applications are enabled to annotate the data that are returned in response to an HTTP request with a set of origins that should be permitted to read that information by way of the user's web browser. The policy expressed through this set of origins is enforced on the client.

  * Web user agents are enabled to discover whether a target resource is prepared to accept cross-site HTTP requests using unsafe methods (non-GET) from a set of origins. The policy expressed through this set of origins is enforced on the client.

  * Server side applications are enabled to discover that an HTTP request was deemed a cross-site request by the client web user agent, through the Referer-Root HTTP header. This extension enables server side applications to enforce limitations on the cross-site requests that they are willing to service.

  This specification is a building block for other specifications, which will define the precise model by which this specification is used. Examples for user specifications include extensions to XMLHttpRequest, XBL2, @@...

- Going on in the current introduction text, the specification doesn't define an access control policy, but an access control policy framework.

  The use cases and requirements should move up into the introduction, or at least close to it.

  The Note about form.submit() belongs somewhere in the design FAQ or security considerations.

  The access control policy is *not* "defined in the resource" (except for XML documents).

  "The client is trusted" is an awfully broad statement.

  The text "The resource would look at follows" is followed by a snippet from an HTTP transaction.

- Conformance criteria: The document says awfully little in terms of "a specification that wants to use this framework needs to do the following things", even though section 2 claims so.

- "User agents MAY optimize..." is beside the point. Instead: "User agents MAY employ any algorithm to implement this specification that leads to the exact same results as the algorithms included in this specification."

- There's a lot of very detailed stuff about white space separated lists going on in section 2; I'd rather see this dropped, and grammars and useful language used closer to the parsing steps.

- The security considerations should ideally be a discussion of security effects, i.e., "can trigger GET, and here's why this is harmless"; "we care about POST, because". Instead, there's a lot of normative material clumped together in this section that would better go to the places where actual processing is described.

- "Authors sharing content with domains that are on shared hosting environments" rather misses the point by just talking about ports: Namely, that -- because we assume that the protocol / technology that hosts the access-control framework uses a same-origin policy -- authorizations can only be given with a granularity of origins. Anything below that is futile.

- "evil" applications don't really have a place here, maybe talk about an attacker. "Authors SHOULD ensure that GET ..." is re-stating HTTP; that should be rephrased as an admonishment to adhere to the HTTP spec's semantics.

- "Authors are encouraged to check the Referer-Root HTTP header" -- this should be somewhere in the processing model, not a side remark in the security considerations. It *is* an additional policy enforcement point, and should be called out clearly.

- The design seems a bit inconsistent about IDNs: The syntax permits them, but HTTP doesn't; the latter is called out in a note. I'd rather see that done consistently. When speaking about IDNs, it might be useful to adapt the A-label and U-label terminology from this I-D:

  http://tools.ietf.org/html/draft-klensin-idnabis-issues-05

- "If the scheme omitted it will match" is normative language, but looks as though it's formatted as a note. Or maybe I'm just confused about the formatting. Oh, and the grammar is wrong.

- The Access-Control production continues to use comma-separated method identifiers. Also, shouldn't there be at least one method given?

- "In case resources on a domain are not in control..." mixes a use case and processing rules into the middle of a syntax description, and is generally quite a mess. Please make a pass through the document to give it a useful structure.

- '"allow rules" can be used to allow read access ...' sounds like a remnant from the old voice browser spec.

  At this point, I believe that the syntax description should limit itself to describing a (multivalued) mapping from authorized origins to methods, with the specific exception that GET is used to generically determine access to the data returned, no matter what method was used to retrieve these data. (Incidentally, that's a point that is going to confuse policy authors without end. Maybe we need something different here.)

- '"method" rule' is oddly phrased.

- 4.4 says what the syntax of the Referer-Root header is. It would be useful to point out here when that header is transmitted. In particular, "in case the Referer header is not included" makes it sound as though user agents had a choice between these headers.

- 5.1, cross-site access request. The English grammar of the first paragraph needs improvement.

- The processing model confuses user agent behavior and input that is given to user agent behavior to be specified elsewhere. That doesn't make things particularly easy to read.

- "The referrer root URI ..." assumes an HTTP-like URI syntax. That's not necessarily present everywhere. Needs clean-up!

- Much of the processing model is phrased in terms of forward references to generic steps. I find this pseudo-code like configuration style extremely hard to read, and suspect that it'll make useful security review more difficult than necessary.

- Why is the authorization request cache mandatory?

- The authorization request cache isn't actually an authorization request cache, but an authorization decision cache. The current name is confusing at least.

- There is no discussion as to how Vary or Cache-control headers on HTTP responses that were received are handled. How do these interact with the separate caching model specified here?

- Why does the specification follow redirects upon OPTIONS? If I read RFC 2616 correctly, then redirects for HTTP methods other than GET and HEAD shouldn't happen without user intervention.

  The current specification material around redirects looks like it's pseudo-code ripped out of context; this needs more work to be comprehensible, and a clear explanation of what the expectations are for a hosting specification. Either the processing model or the security considerations should explain very clearly what tradeoffs a hosting specification faces in specifying any behavior concerning redirects.

- The access control check algorithm goes to an excruciating level of detail, while confusing the reader. It is probably much easier to write up how to parse the various headers into the mapping from origins to methods, and how to deal with that.

- Once more, we have forward references to generic material, undeclared variables used to pass around information between different sections, and a general lack of readability. For example, temp method list isn't temporary, is not introduced before its first appearance, and is only specified in the "allow list check" section.

- "parse ... using a streaming XML parser" -- I'm pretty sure you don't mean to prescribe use of a streaming XML parser, but rather want to allow use of one, right?

- In the allow list check, item 3 of the algorithm looks like it's wrong. This actually prunes the list of methods that are added to the temp method list depending on the current request's method. Also, this item has bad grammar.

- Having atomic steps like "set the allow access flag to true" (point 5) might be a useful technique in programming. In English text, it doesn't actually help understand the algorithm.

- Starting at item 10 of the access item check algorithm, we go into defining how domain names are parsed and compared. That can be said in much shorter terms by referring to terms from the relevant specs. Roughly: Origin and item are converted to ASCII. They are compared case-insensitively, with the additional property that the leftmost label of item might be "*", and can match an arbitrary number of labels. (Or something like this.)

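  To illustrate how short the comparison just described can be, here is a rough sketch in Python. It is not the draft's algorithm: the function names are made up for illustration, Python's built-in "idna" codec (IDNA 2003) stands in for "converted to ASCII", and "an arbitrary number of labels" is assumed to mean one or more.

```python
from typing import List


def to_ascii_labels(host: str) -> List[str]:
    """Split a host name into lower-cased ASCII labels (hypothetical helper)."""
    labels = host.rstrip(".").split(".")
    out = []
    for label in labels:
        if label == "*":
            # Keep the wildcard label as-is; it is only meaningful in an item.
            out.append(label)
        else:
            # The "idna" codec leaves pure-ASCII labels untouched, so
            # lower-case explicitly to get a case-insensitive comparison.
            out.append(label.encode("idna").decode("ascii").lower())
    return out


def host_matches(item: str, origin: str) -> bool:
    """Return True if the origin host is covered by the policy item.

    Assumption: a leading "*" label stands for one or more labels.
    """
    item_labels = to_ascii_labels(item)
    origin_labels = to_ascii_labels(origin)
    if item_labels and item_labels[0] == "*":
        suffix = item_labels[1:]
        if len(origin_labels) <= len(suffix):
            return False
        # Compare only the rightmost labels; the rest is absorbed by "*".
        return origin_labels[len(origin_labels) - len(suffix):] == suffix
    return item_labels == origin_labels
```

  With these assumptions, host_matches("*.example.org", "www.example.org") holds, while host_matches("*.example.org", "example.org") does not; whether the wildcard should also match zero labels is exactly the kind of detail the specification would need to state.
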
- The requirements tend to confuse authentication and authorization. E.g., under 1, you're really talking about deployments that base their authorization decisions exclusively on somebody being on the right side of a firewall.

- In the part that talks about cross-site POST, it might be useful to speak of UPnP as a possible target.

- "Should not fail to properly enforce security policy..." sounds like a copy of requirement 13 later on.

- I continue to disagree with requirement 3, "must be deployable to existing..."; this is highly dependent on the circumstances of a particular deployment. I suggest saying clearly what is really meant. E.g., what abilities should be sufficient in order to deploy the thing -- like, ability to write to XML files.

- requirement 4 (more "easily deploy" that I actually disagree with) only holds for XML content. Please qualify this requirement.

- req 6 could use some elaboration. The current text could be misread to say that the authorized party should be identified with resource-level granularity, which we know is a bad idea.

- req 9 is somewhere between an implementation requirement and a use case. Strikes me as somewhat weirdly phrased.

- req 10 effectively says "APIs for cross-site data access shouldn't differ from those for same-origin data access"; I'd suggest changing to that.

- req 12 is badly worded. I suspect it means "shouldn't break HTTP". If there's more to it, please express that more clearly.

Regards,
--
Thomas Roessler, W3C <tlr@w3.org>
Received on Wednesday, 23 January 2008 18:27:00 UTC