- From: Ian Hickson <ian@hixie.ch>
- Date: Wed, 14 Nov 2007 19:14:03 +0000 (UTC)
- To: Anne van Kesteren <annevk@opera.com>
- Cc: "WAF WG (public)" <public-appformats@w3.org>
On Wed, 14 Nov 2007, Anne van Kesteren wrote:
>
> http://dev.w3.org/2006/waf/access-control/
1.1 has an example that reads:
Access-Control: <hello-world.invalid>
...which seems invalid.
"case-insensitive match" is defined poorly. If it is intended to _only_ be
about swapping a-z for A-Z, it should say so explicitly, not in
parentheses. If it is about full Unicode mapping, then it should be stated
appropriately and the a-z part should be removed. Also, generally it is
better to lowercase and compare than uppercase and compare, since in full
Unicode cases the lowercase versions are more canonical iirc.
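To make the two readings concrete, here is a rough Python sketch of each. The function names are made up for illustration and are not from the draft; the Unicode variant uses Python's str.casefold(), which is full case folding rather than plain lowercasing, so treat it as one possible interpretation only.

```python
def ascii_lower(s: str) -> str:
    # Map only A-Z to a-z; every other character is left untouched.
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def ascii_case_insensitive_match(a: str, b: str) -> bool:
    # Reading 1: "case-insensitive" means swapping A-Z for a-z and nothing else.
    return ascii_lower(a) == ascii_lower(b)


def unicode_case_insensitive_match(a: str, b: str) -> bool:
    # Reading 2: full Unicode mapping (here: Unicode case folding).
    return a.casefold() == b.casefold()
```

The two disagree on inputs such as "STRASSE" versus "stra\u00dfe", or U+212A KELVIN SIGN versus "k", which is why the spec has to pick one and say so.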
The algorithm to "obtain the values from a space-separated list" mixes its
tenses. It starts in the simple present ("must replace"), and then
switches to the present progressive ("dropping ... and then chopping").
The way it is phrased doesn't technically define how you obtain values; it
defines how you replace characters, which for some reason involves
chopping the string.
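For comparison, an algorithm that directly defines how to obtain the values could be as small as the following sketch. It assumes the only separator is U+0020 SPACE and that empty values are discarded; the draft may intend something different, and the function name is hypothetical.

```python
def values_from_space_separated(s: str) -> list[str]:
    # Assumption: values are separated by one or more U+0020 characters,
    # and leading or trailing spaces produce no values.
    return [value for value in s.split(" ") if value]
```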
2.1 Access Item: "When the access item is used as part of the
Access-Control HTTP header authors must specify the result of applying the
ToASCII algorithm to the internationalized domain name as HTTP does not
support Unicode." still doesn't make sense to me. The requirement is that
the author provide a purely ASCII domain name, not that they take an IDN
and apply ToASCII, IMHO.
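The difference between the two requirements can be sketched as follows. The helper names are hypothetical; to_ascii relies on Python's built-in "idna" codec, which implements the IDNA 2003 ToASCII operation, and is shown only to illustrate what an author might do before writing the header.

```python
def is_ascii_host(host: str) -> bool:
    # The requirement as I read it: the header value must be pure ASCII.
    return host.isascii()


def to_ascii(host: str) -> str:
    # One way an author could get there from an IDN (IDNA 2003 ToASCII).
    return host.encode("idna").decode("ascii")


# "b\u00fccher.example" is "bücher.example" written with an escape.
idn = "b\u00fccher.example"
header_value = to_ascii(idn)
```

A check like is_ascii_host(header_value) is all a conformance requirement would need to state; how the author arrived at the ASCII form is their business.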
2.1 Access Item: Example "http://example.org:*" is said to be invalid but
as far as I can tell it is valid.
Why is the "*." bit redundant in the domain part? How do I make sure
something matches "livejournal.com" but not "ianhickson.livejournal.com"?
There are numerous hosts where the subdomain space isn't trusted but where
the hostname itself is secure, and "example.com" doesn't at all convey
that all subdomains are also trusted. I think we should require
"*.example.com" to indicate that subdomains are trusted.
It actually seems that even in the spec there is confusion about this, for
example there is this example:
Access-Control: allow <example.org> <*.example.org>
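The matching behavior I am proposing could be sketched like this. It is an illustration of the proposal, not of the draft's current rules; host_matches is a made-up name, and whether "*.example.com" should also match "example.com" itself is left as "no" here purely as an assumption.

```python
def host_matches(pattern: str, host: str) -> bool:
    pattern = pattern.lower()
    host = host.lower()
    if pattern.startswith("*."):
        # "*.example.com" covers subdomains only (assumption: not the bare host).
        suffix = pattern[1:]  # ".example.com"
        return host.endswith(suffix) and len(host) > len(suffix)
    # A bare domain matches exactly that host and nothing below it.
    return host == pattern
```

Under these rules the example above, listing both <example.org> and <*.example.org>, is not redundant: the first item allows the host and the second allows its subdomains.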
2.4. Referer-Root (sic) HTTP header: Do we need to continue misspelling
this?
3.1. Cross-site Access Request: "followed by the port (defaulting to the
default port for the scheme) of the resource" -- it makes no sense to
default the port in this case, since the resource had to have a port for
the request to have been made in the first place.
3.1. Cross-site Access Request: "of the resource from which the request
originated" -- is this true? Isn't it of the resource that the calling
spec wants used as the origin? e.g. in XHR I would imagine that the actual
URI used would be the origin, which (e.g. in the case of data: URIs) might
not match the resource's own URI at all. The next paragraph seems to agree
with me.
3.1. Cross-site Access Request: Does the referrer root URI include the
port even if it is the default port?
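To show why the answer matters, here is a sketch of the two possible serializations. The function name, the include_default_port flag and the DEFAULT_PORTS table are mine, and only http and https are handled; nothing here is taken from the draft.

```python
from urllib.parse import urlsplit

# Assumption: only these two schemes are relevant for the illustration.
DEFAULT_PORTS = {"http": 80, "https": 443}


def root_uri(uri: str, include_default_port: bool) -> str:
    parts = urlsplit(uri)
    default = DEFAULT_PORTS[parts.scheme]  # KeyError for other schemes
    port = parts.port or default
    if port == default and not include_default_port:
        return f"{parts.scheme}://{parts.hostname}"
    return f"{parts.scheme}://{parts.hostname}:{port}"
```

With include_default_port=True, "http://example.org/a/b" serializes as "http://example.org:80"; with False it serializes as "http://example.org". A server comparing strings needs to know which form to expect.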
3.1. Cross-site Access Request: what does "Specifications are strongly
encouraged to define this in equivalent ways." mean?
3.1. Cross-site Access Request: "As this algorithm is used by other
specifications, those specifications must ensure to handle all return
values. Specifications may ignore "reason" if "error" is "true"." -- this
paragraph makes no sense at this point. What algorithm? What return
values? What are "reason" and "error"? I recommend, before this paragraph,
giving an overview of what the algorithms can return.
3.1.1. Generic Cross-site Access Request Algorithms: "are same-origin" is
not defined yet.
3.1.1. Generic Cross-site Access Request Algorithms: It's not clear which
algorithm "this algorithm" is. The "Generic Cross-site Access Request
Algorithms"? The "generic redirect steps"?
3.1.1. Generic Cross-site Access Request Algorithms: What does
"transparently follow the redirect while observing the set of request
rules" mean?
Tuples are denoted (like, this) not "like, this". (e.g. in 3.1.1. Generic
Cross-site Access Request Algorithms.) In fact in general you seem to
overuse quote marks -- I recommend only using them for strings, quotes,
euphemisms, and sarcasm, not for variables and literals.
3.1.2. Cross-site GET Access Request: "Perform an access check" isn't
defined yet nor hyperlinked. Same applies in "3.1.3. Cross-site Non-GET
Access Request".
3.1.2. Cross-site GET Access Request: Why do you invent "current request
URI"? It's just given the value of "request URI" and seems to only be used
once, so why not just use "request URI"? I'm assuming this is related to
the "macro" steps in "3.1.1. Generic Cross-site Access Request
Algorithms", but it isn't clear to me how this all works. For example,
those refer to "origin" but I don't know what origin that is.
3.1.3. Cross-site Non-GET Access Request: The first paragraph has the MUST
for the list of steps, but the second paragraph confuses matters by being
"in the way".
3.1.3. Cross-site Non-GET Access Request: What is the "target URI"?
3.1.3. Cross-site Non-GET Access Request: Again with the mention of
"origin" -- whose origin? Where does it come from? It doesn't seem to be
any of the arguments passed from the other spec.
"If there is a Method-Check-Expires HTTP response headers that can be
successfully parsed it must be honered." misspells "honored", but in any
case it doesn't define what honoring it means. It should probably say
instead that the entry must be removed once the current time exceeds the
time specified by the header, or some such. I assume how to parse the
header is defined somewhere?
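The kind of definition I have in mind would amount to something like the sketch below. The class and method names are hypothetical, the cache key is left opaque, and the expiry time is assumed to have already been parsed from the header into a datetime, since the header's syntax is not something I can vouch for.

```python
from datetime import datetime, timedelta, timezone


class MethodCheckCache:
    def __init__(self) -> None:
        self._expires = {}  # cache key -> expiry time

    def store(self, key, expires: datetime) -> None:
        self._expires[key] = expires

    def is_valid(self, key, now: datetime) -> bool:
        expires = self._expires.get(key)
        if expires is None:
            return False
        if now > expires:
            # "Honoring" the header: remove the entry once the current
            # time exceeds the time it specified.
            del self._expires[key]
            return False
        return True


cache = MethodCheckCache()
t0 = datetime(2007, 11, 14, 19, 0, tzinfo=timezone.utc)
cache.store("entry", t0 + timedelta(minutes=10))
```

Here cache.is_valid("entry", t0) holds, while a check made eleven minutes later removes the entry and fails.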
3.2. Access Control Check: "The second subsection of this section" is
confusing. I couldn't tell if "this section" was section 3 or section 3.2,
and whether the second subsection was 3.2, or 3.2.2. I'd just remove
paragraphs that tell you what you're about to read, frankly.
The way you have the "temp method list" defined, you don't cache as
much as you should. Consider a resource with the following:
<?access-control allow="example.com" method="POST"?>
<?access-control allow="example.com" method="PUT"?>
<?access-control allow="example.com" method="DELETE"?>
Now imagine you do a POST followed by a PUT, followed by another POST.
Ideally, we should send a single GET, and then the POST, and then the PUT,
and then the final POST, because we know the PUT will succeed. However,
instead, we will send a GET, a POST, another GET, a PUT, and then a POST.
I believe we should cache all the methods that are allowed, not just the
methods of the access-control item that was matched.
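The difference can be checked with a toy simulation. This is a simplified model of my reading of the draft, not its actual algorithm: the policy is a list of method sets, one per access-control item, all assumed to match the requesting origin, and requests_sent is a made-up name.

```python
def requests_sent(policy, methods, cache_all):
    cached = set()  # methods known to be allowed for this origin and URI
    sent = []
    for method in methods:
        if method not in cached:
            sent.append("GET")  # authorization request
            if cache_all:
                # Proposed: remember every method any item allows.
                for item_methods in policy:
                    cached.update(item_methods)
            else:
                # Current draft, as I read it: remember only the matched item.
                for item_methods in policy:
                    if method in item_methods:
                        cached.update(item_methods)
                        break
        if method in cached:
            sent.append(method)
    return sent


policy = [{"POST"}, {"PUT"}, {"DELETE"}]
sequence = ["POST", "PUT", "POST"]
```

requests_sent(policy, sequence, cache_all=False) gives GET, POST, GET, PUT, POST, whereas cache_all=True gives GET, POST, PUT, POST.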
Incidentally, you should mention whether the authorization request cache
can have multiple items with the same key. (It seems that it can.)
The rules for processing access-control PIs will drop any PI with a
method="" pseudo-attribute at the moment. In fact the pseudo-attribute is
generally not supported by the algorithm as far as I can tell.
The rules for processing access-control PIs look like they won't drop PIs
with multiple pseudo-attributes of the same name other than exclude="".
e.g. <?access-control allow="example.com" allow="example.com"?> doesn't
get dropped by the current rules.
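A parsing step that handles both points (accepting method="" and rejecting repeated names) might look roughly like this. The regular expression is a simplification of the usual pseudo-attribute syntax and the function name is hypothetical; returning None stands in for "drop the PI".

```python
import re

# Simplified pseudo-attribute syntax: name="value" or name='value'.
PSEUDO_ATTR = re.compile(r"""([A-Za-z_][\w.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def parse_pseudo_attributes(data: str):
    attrs = {}
    for match in PSEUDO_ATTR.finditer(data):
        name = match.group(1)
        value = match.group(2) if match.group(2) is not None else match.group(3)
        if name in attrs:
            return None  # duplicate pseudo-attribute: drop the whole PI
        attrs[name] = value
    return attrs
```

With this, the data of <?access-control allow="example.com" allow="example.com"?> is rejected, while allow="example.com" method="POST" yields both pseudo-attributes.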
3.3. Access Item Check, step 1: This line is confusing. You are letting
the algorithm's parameters be overwritten by undefined variables. I think
you mean "let origin be..." and "let item be..." not the other way around.
3.3. Access Item Check, step 6: how can "origin" not have a scheme?
HTH,
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Received on Wednesday, 14 November 2007 19:14:23 UTC