Comments on Access Control (http://www.w3.org/TR/access-control/) from Marc Silbey on 2007-03-21 (public-appformats@w3.org from March 2007)

From: Marc Silbey <marcsil@windows.microsoft.com>
Date: Wed, 21 Mar 2007 11:22:21 -0700
To: <public-appformats@w3.org>
Message-ID: <8F47C94301074A4DA5A88E364A1E673A049DA953@WIN-MSG-20.wingroup.windeploy.ntdev.mi>
Hi all,

As promised I've included our comments and questions below on Access
Control. The draft looks very good and the below comments are minor. I'm
looking forward to discussing these in further detail.

	1. Introduction
COMMENT 1) Maybe change "For security reasons, web browsers typically do
not permit a website..." to "For security reasons, web browsers
typically do not permit client scripting code running on one website..."

COMMENT 2) Maybe change "The access-control mechanism enables web
resources to permit websites to access their content" to "The
access-control mechanism enables web resources to permit scripts running
from other domains to access their content"

	1.1 Background
COMMENT 3) Correction for our agreed scope: change "The access-control
header allows an XML data document to..." to "The access-control header
allows a web resource to" and change "XML document" to "resource" in "By
specifying an access control header that "allows" example.com to read,
that particular XML document"

	1.1.1 Definition of Read Access to Web Resources
COMMENT 4) typo remove "s" in "resources"

COMMENT 5) Correction for scope: change "XML document" to "resource" for
"A request made by an application to load a web resource in a manner
that allows the application to inspect the contents of that resource"

	1.2. Conformance Criteria
COMMENT 6) typo remove "over" from "...written with more concern for
clarity than efficiency"

	1.3. Security Considerations
COMMENT 7) "User agents which implement this capability should take care
not to expose other trusted data (cookies, HTTP header data)
inappropriately" - we should probably provide some scenarios that we're
trying to protect against so readers can easily understand this.

COMMENT 8) It may be clearer to say "Authors should take care to
protect against exposing themselves to cross-site scripting attacks by
rendering or executing the retrieved content directly without
validation."

	2. Access Control Read Policy
COMMENT 9) We should define what extra safety measures are required for
HTTP methods besides HEAD and GET. We should think again about adding
POST because some folks will argue that it is as safe as GET and would
be a useful addition.

QUESTION: What happens in the case of trailing headers? Maybe we should
specify that the access control header must appear in the headers that
come before the body.

COMMENT 10) Proposed rewording "When access to a resource is not
permitted by this policy, the request is said to be in error and access
to that resource MUST be denied in such a way that the status or
existence of the blocked resource is not revealed to the caller (to
prevent enumeration/fingerprinting attacks)."

COMMENT 11) Proposed rewording "Resources to which the access control
read policy applies have an associated unordered list (which can be
empty) of access control rules. There are allow and block lists. An
access control rule consists of an allow ruleset and optionally a deny
ruleset to handle exceptions. Each of these rulesets is an unordered
list of access items. How each access control rule is matched against
the request URL to determine whether access to the resource is to be
granted is described in the next section."

COMMENT 12) Proposed change to EBNF:

An access item MUST match the following EBNF: 
access-item    	::= scheme-specifier "://" domain-pattern ( ":"
port-specifier )? | "*"
domain-pattern 	::= wildcard-label | wildcard-label "." domain
wildcard-label 	::= label | "*"
scheme-specifier	::= scheme | "*"
port-specifier	::= port | "*"

	We're concerned that allowing "example.*" wildcarding may be
unnecessarily flexible and could lead to mistakes by web developers.
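
To make the effect of the proposed grammar concrete, here is a minimal
Python sketch of a validator for it. The comment above does not define
label, domain, scheme or port, so the patterns used for them here (LDH
labels, an RFC 3986 style scheme, a numeric port) are assumptions, and
is_valid_access_item is a hypothetical helper name.

```python
import re

# Assumed definitions for the non-terminals the proposal leaves undefined.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"   # LDH label (assumption)
DOMAIN = rf"{LABEL}(?:\.{LABEL})*"                    # label *( "." label )
SCHEME = r"[A-Za-z][A-Za-z0-9+.-]*"                   # RFC 3986 style scheme
PORT = r"[0-9]+"                                      # numeric port

# Productions taken from the proposed EBNF.
WILDCARD_LABEL = rf"(?:{LABEL}|\*)"
DOMAIN_PATTERN = rf"{WILDCARD_LABEL}(?:\.{DOMAIN})?"
ACCESS_ITEM = re.compile(
    rf"(?:(?:{SCHEME}|\*)://{DOMAIN_PATTERN}(?::(?:{PORT}|\*))?|\*)"
)


def is_valid_access_item(item: str) -> bool:
    """Return True if item matches the proposed access-item production."""
    return ACCESS_ITEM.fullmatch(item) is not None


# A wildcard is only accepted as the left-most label.
print(is_valid_access_item("http://*.example.org"))  # accepted
print(is_valid_access_item("http://example.*"))      # rejected
```

Under these assumptions a trailing wildcard such as "example.*" is
rejected because only the first label may be "*".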

COMMENT 13) Proposed rewording "In addition to matching the above EBNF,
the ToASCII algorithm MUST apply successfully (without errors) to each
label component from the access item. If the access item doesn't match
the EBNF or the ToASCII algorithm fails, the request is denied."
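
As a rough illustration of the ToASCII check, the sketch below uses the
ToASCII function from Python's standard encodings.idna module (an
IDNA 2003 implementation). The helper names host_labels and
labels_pass_toascii are hypothetical, the host is extracted by simple
string splitting, and skipping "*" labels is an assumption.

```python
from encodings.idna import ToASCII


def host_labels(item: str) -> list[str]:
    """Split the host part of an access item into its labels."""
    _, _, rest = item.partition("://")   # drop the scheme
    host, _, _ = rest.partition(":")     # drop the optional port
    return host.split(".")


def labels_pass_toascii(item: str) -> bool:
    """Return True if ToASCII succeeds for every non-wildcard label."""
    if item == "*":
        return True
    for label in host_labels(item):
        if label == "*":                 # assumption: wildcards are skipped
            continue
        try:
            ToASCII(label)
        except UnicodeError:             # empty, too long, or invalid label
            return False
    return True


print(labels_pass_toascii("http://*.example.org:80"))
print(labels_pass_toascii("http://" + "a" * 64 + ".example"))  # label too long
```

A request would then be denied whenever either the grammar check or this
check fails.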

COMMENT 14) Propose removing the following examples, in line with the
above comments on wildcards:
	https://*.*:80
	*://example.org
	http://example.org:*

	2.1. Content-Access-Control header
QUESTION: Does this plus symbol mean that at least two rules must always
be defined? "ruleset ::= rule (LWS? "," LWS? rule)+"
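
The difference between "+" and "*" in that production can be checked
with a small Python sketch. RULE and LWS below are simplified stand-ins
(assumptions), not the draft's actual productions.

```python
import re

RULE = r"[a-z]+"   # hypothetical stand-in for the draft's rule production
LWS = r"[ \t]*"    # simplified optional linear whitespace

# rule (LWS? "," LWS? rule)+  : one rule followed by one or more repeats
PLUS = re.compile(rf"{RULE}(?:{LWS},{LWS}{RULE})+")
# rule (LWS? "," LWS? rule)*  : one rule followed by zero or more repeats
STAR = re.compile(rf"{RULE}(?:{LWS},{LWS}{RULE})*")

for text in ("allow", "allow, deny"):
    print(text, bool(PLUS.fullmatch(text)), bool(STAR.fullmatch(text)))
```

With "+", a ruleset containing a single rule does not match; with "*" it
does.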

COMMENT 15) Proposed rewording: "If the Content-Access-Control header
doesn't match the specified syntax, the request is denied." If we decide
to go with "deny" instead of "except" there are other replacements.
Similarly we should think about changing "resource is in error" to
"request is denied" 

	3. Matching Algorithm
COMMENT 16) Maybe add "It should be observed that the DENY rules take
precedence over any ALLOW rules." after the first algorithm. We should
think about joining the allow and deny rulesets so they operate on the
full list together.
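
One way to read "DENY takes precedence" is sketched below in Python. The
function access_granted and the toy matcher toy_match are hypothetical
names used only for illustration; a real matcher would implement the
item matching algorithm.

```python
from typing import Callable, Iterable


def toy_match(origin: str, item: str) -> bool:
    """Toy matcher (assumption): exact string match or the "*" item."""
    return item == "*" or item == origin


def access_granted(
    origin: str,
    allow: Iterable[str],
    deny: Iterable[str],
    matches: Callable[[str, str], bool] = toy_match,
) -> bool:
    """Deny items are checked first, so they override any allow item."""
    if any(matches(origin, item) for item in deny):
        return False
    return any(matches(origin, item) for item in allow)


# Allowed by "*" but explicitly denied, so access is refused.
print(access_granted("http://evil.example", ["*"], ["http://evil.example"]))
print(access_granted("http://good.example", ["*"], ["http://evil.example"]))
```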

COMMENT 17) Proposed changes to the second algorithm to help clarify

1.	Let request URL be origin and access item be item. 
2.	If item is a single U+002A (*) there is a match. Abort this
algorithm. 
3.	Drop the path, query, and fragment part in origin so that it
matches the access item production. 
4.	Count the U+002E (.) characters in both origin and item. If the
results are not equal, there is no match;  abort this algorithm. 
5.	Compare the scheme from origin and item. If there's a match,
drop the scheme from both including the :// sequence following it.
Otherwise, abort this algorithm. 
6.	Compare the port from origin and item. If either of them doesn't
have the port explicitly specified use the default port for the scheme.
If there's a match, drop the port from both including the U+003A (:)
preceding it. Otherwise, abort this algorithm. 
7.	Split origin and item on the U+002E (.) character and preserve
the order of each new set of LabelItems. In case there's no U+002E character,
each set will have exactly one LabelItem. Now for each set of LabelItems
(one from origin and one from item):
	1.	Let the LabelItem from origin be OriginLabel and the
LabelItem from item be RuleLabel. 
	2.	If RuleLabel is a single U+002A (*) character, then
there is a match. Perform this sub algorithm again for the next set of
LabelItems or abort this sub algorithm if there's no next set of
LabelItems. 
	3.	Apply the ToASCII algorithm to OriginLabel and
RuleLabel.
	4.	Compare OriginLabel and RuleLabel. If there's a
match, do this sub algorithm again for the next set of LabelItems or
abort this sub algorithm if there's no next set of LabelItems.
Otherwise, abort this algorithm. 
8.	There's a match. Abort this algorithm.
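
The steps above can be sketched in Python roughly as follows. This is
only an illustration under stated assumptions: the default port table,
treating "*" as matching any scheme or port, counting U+002E characters
in the host only, and case-insensitive label comparison are choices made
here, not something the steps spell out. The names matches and
split_item are hypothetical.

```python
from encodings.idna import ToASCII
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": "80", "https": "443"}  # assumption


def split_item(item: str) -> tuple[str, str, str]:
    """Split an access item into (scheme, host, port); port may be ""."""
    scheme, _, rest = item.partition("://")
    host, _, port = rest.partition(":")
    return scheme.lower(), host, port


def matches(request_url: str, item: str) -> bool:
    if item == "*":                                   # step 2
        return True
    parts = urlsplit(request_url)                     # step 3: ignore path etc.
    o_scheme = parts.scheme.lower()
    o_host = parts.hostname or ""
    default = DEFAULT_PORTS.get(o_scheme, "")
    o_port = str(parts.port) if parts.port is not None else default
    i_scheme, i_host, i_port = split_item(item)
    if o_host.count(".") != i_host.count("."):        # step 4 (host only)
        return False
    if i_scheme != "*" and i_scheme != o_scheme:      # step 5
        return False
    if i_port != "*" and (i_port or default) != o_port:  # step 6
        return False
    for o_label, i_label in zip(o_host.split("."), i_host.split(".")):  # step 7
        if i_label == "*":
            continue
        try:
            if ToASCII(o_label).lower() != ToASCII(i_label).lower():
                return False
        except UnicodeError:
            return False
    return True                                       # step 8


print(matches("http://www.example.org/path?q=1#f", "http://*.example.org"))
print(matches("http://a.b.example.org/", "http://*.example.org"))
```

In this sketch a "*" label matches exactly one label of the origin,
which follows from the dot-count check in step 4.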

Many thanks to the IE folks who reviewed the spec and a special thanks
to Eric Lawrence, one of our Networking gurus, for helping provide
detailed comments.
Received on Wednesday, 21 March 2007 18:24:18 UTC