Re: More comments on access-control from Ian Hickson on 2007-11-19 (public-appformats@w3.org from November 2007)

From: Ian Hickson <ian@hixie.ch>
Date: Mon, 19 Nov 2007 22:55:26 +0000 (UTC)
To: Anne van Kesteren <annevk@opera.com>
Cc: "WAF WG (public)" <public-appformats@w3.org>
Message-ID: <Pine.LNX.4.62.0711192235380.3737@hixie.dreamhostps.com>

On Mon, 19 Nov 2007, Anne van Kesteren wrote:
> > 
> > The algorithm to "obtain the values from a space-separated list" mixes 
> > its tenses. It starts in the simple present ("must replace"), and then 
> > switches to the present progressive ("dropping ... and then 
> > chopping"). The way it is phrased doesn't technically define how you 
> > obtain values, it defines how you replace characters, which for some 
> > reason involves chopping the string.
> 
> Any idea how you're going to change 
> http://www.w3.org/TR/2007/CR-xbl-20070316/#attributes0? That's pretty 
> much the text I'm reusing.

I don't plan to change the cited text -- it doesn't have the problem I 
mentioned (it's all in the present progressive). You didn't use the exact 
same text, the problem was introduced in your changes.
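
For reference, a minimal sketch of the three steps the wording describes 
(collapse runs of space characters, drop leading and trailing spaces, chop 
at each space). The set of space characters and the function name are 
assumptions made for illustration, not taken from either spec.

```python
import re

# Assumption: which characters count as "space characters" is not stated
# in this thread; space, tab, LF and CR are used here for illustration.
SPACE_CHARS = " \t\n\r"


def values_from_space_separated_list(value: str) -> list[str]:
    # Step 1: replace any sequence of space characters with a single U+0020.
    collapsed = re.sub(f"[{SPACE_CHARS}]+", " ", value)
    # Step 2: drop any leading or trailing U+0020.
    trimmed = collapsed.strip(" ")
    # Step 3: chop the string at each U+0020, dropping that character.
    return trimmed.split(" ") if trimmed else []
```

Phrased this way, the result is clearly the list of values, and the 
replacing and chopping are steps towards it.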


> > Why is the "*." bit redundant in the domain part? How do I make sure 
> > something matches "livejournal.com" but not 
> > "ianhickson.livejournal.com"?
> 
>   allow <livejournal.com> exclude <ianhickson.livejournal.com>
> 
> or more generic
> 
>   allow <livejournal.com> exclude <*.livejournal.com>

Hm. Ok. I'm pretty sure this is confusing enough that it'll be the source 
of security holes in future, though.

Does

   allow <*.livejournal.com> exclude <livejournal.com>

...exclude everything in livejournal.com? (It seems that it does.)
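
A minimal sketch of the matching as it reads from this exchange, assuming 
that a bare "example.com" matches the host and all its subdomains, that 
"*.example.com" matches subdomains only, and that an exclude match always 
overrides an allow match. The function names are hypothetical, and the 
real rule syntax (schemes, ports) is ignored.

```python
def pattern_matches(pattern: str, host: str) -> bool:
    """Assumed semantics, inferred from the thread rather than the spec."""
    pattern = pattern.lower()
    host = host.lower()
    if pattern.startswith("*."):
        # "*.example.com": subdomains only, never the bare host.
        return host.endswith(pattern[1:])
    # "example.com": the host itself and every subdomain.
    return host == pattern or host.endswith("." + pattern)


def is_allowed(host: str, allow: list[str], exclude: list[str]) -> bool:
    # Assumption: a matching exclude pattern wins over any allow pattern.
    if any(pattern_matches(p, host) for p in exclude):
        return False
    return any(pattern_matches(p, host) for p in allow)
```

Under these assumptions, allow <*.livejournal.com> exclude 
<livejournal.com> lets nothing through: the exclude pattern covers every 
host the allow pattern could match.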


> > There are numerous hosts where the subdomain space isn't trusted but 
> > where the hostname itself is secure, and "example.com" doesn't at all 
> > convey that all subdomains are also trusted. I think we should require 
> > "*.example.com" to indicate that subdomains are trusted.
> 
> Writing
> 
>   allow <example.com> <*.example.com>
> 
> was expected to be the general case and therefore we previously decided 
> to go for
> 
>   allow <example.com>
> 
> to address that case. I'm not really comfortable with revisiting that 
> once again.

Fair enough.


> > 2.4. Referer-Root (sic) HTTP header: Do we need to continue 
> > misspelling this?
> 
> It seems more consistent with the existing header.

Sure, but it's inconsistent with document.referrer, rel=noreferrer, and 
the English word "referrer".


> > 3.1. Cross-site Access Request: Does the referrer root URI include the 
> > port even if it is the default port?
> 
> That's what the definition says, no?

I guess so. Why?
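
To make the question concrete, a minimal sketch of a referrer root URI 
that always carries an explicit port. The serialization 
(scheme://host:port), the default-port table, and the function name are 
assumptions for illustration; the spec's own definition is not quoted in 
this thread.

```python
from urllib.parse import urlsplit

# Assumption: only http and https are considered here.
DEFAULT_PORTS = {"http": 80, "https": 443}


def referrer_root(uri: str) -> str:
    parts = urlsplit(uri)
    # Always emit a port, filling in the scheme's default when none is given.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return f"{parts.scheme}://{parts.hostname}:{port}"
```

With this shape, http://example.com/a and http://example.com:80/b yield 
the same root, at the cost of a port that authors rarely write themselves.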


> > 3.1. Cross-site Access Request: what does "Specifications are strongly 
> > encouraged to define this in equivalent ways." mean?
> 
> I reworded this. The intention is that specifications base it on the 
> same "source" as much as possible.

I still don't really understand what that means I should do, as a spec 
editor.


> > 3.1.1. Generic Cross-site Access Request Algorithms: What does 
> > "transparently follow the redirect while observing the set of request 
> > rules" mean?
> 
> It somehow needs to point back to the algorithm that invoked it where 
> there is a list of "request rules" which define what to do in case of a 
> network error, redirect, etc.

I don't think the spec is clear enough on this point at the moment. I'm 
not sure I'd know what to do, as an implementor.


> > In fact in general you seem to overuse quote marks -- I recommend only 
> > using them for strings, quotes, euphemisms, and sarcasm, not for 
> > variables and literals.
> 
> If you have suggestions for what to use instead that would be welcome. 
> I often wonder what would be best to use in a particular case.

Well, for variables I recommend <var>. For specific terms, <dfn> and <i>. 
For keywords, <b>. For normal terms, nothing.


> > 3.1.2. Cross-site GET Access Request: Why do you invent "current 
> > request URI"? It's just given the value of "request URI" and seems to 
> > only be used once, so why not just use "request URI"?
> 
> The idea is that "current request URI" is updated during a redirect and 
> "request URI" always points to the initial starting point. I suppose we 
> could just update "request URI" along the way. I wasn't sure if that 
> would be confusing or not.

Using current request URI is fine, but the spec needs to be clearer about 
what's going on, e.g. with a note somewhere in the main algorithm pointing 
out the possible side-effects of the steps defined in the other section.
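
A minimal sketch of the distinction, assuming a hypothetical fetch 
callable that returns a status code and an optional Location value. It 
only shows which variable changes during a redirect; the access checks 
and "request rules" from the spec are left out.

```python
from typing import Callable, Optional, Tuple

# Hypothetical transport: fetch(uri) -> (status, location or None).
Fetch = Callable[[str], Tuple[int, Optional[str]]]


def follow_redirects(
    request_uri: str, fetch: Fetch, max_redirects: int = 5
) -> Tuple[str, str]:
    # "request URI" stays fixed at the initial starting point.
    current_request_uri = request_uri
    for _ in range(max_redirects + 1):
        status, location = fetch(current_request_uri)
        if status in (301, 302, 303, 307) and location is not None:
            # Side effect of the redirect step: only the current URI moves.
            current_request_uri = location
            continue
        return request_uri, current_request_uri
    raise RuntimeError("too many redirects")
```

The comment on the reassignment is the kind of note the main algorithm 
could carry.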


> > I'm assuming this is related to the "macro" steps in "3.1.1. Generic 
> > Cross-site Access Request Algorithms", but it isn't clear to me how 
> > this all works. For example, those refer to "origin" but I don't know 
> > what origin that is.
> 
> That's defined at the start of the algorithm that invokes it. It's the 
> referrer root URI.

The way the steps are factored out as actual steps that you substitute in, 
rather than as a separate algorithm that you invoke, is a little 
confusing, IMHO. I couldn't tell the "scope" of the "variables" (to use 
programming terms). I think I understand what the spec is trying to say 
now but it could probably be written more clearly (though I'm not sure 
exactly how). Sorry to be so vague in my complaints!


> > 3.1.3. Cross-site Non-GET Access Request: Again with the mention of 
> > "origin" -- whose origin? Where does it come from? It doesn't seem to 
> > be any of the arguments passed from the other spec.
> 
> It is defined at the start of the algorithm, no? "Let origin be the 
> referrer root URI."

Hm, must have missed that.


> > The way you have the "temp method list" defined, you don't cache as 
> > much as you should. Consider a resource with the following:
> > 
> >    <?access-control allow="example.com" method="POST"?>
> >    <?access-control allow="example.com" method="PUT"?>
> >    <?access-control allow="example.com" method="DELETE"?>
> > 
> > Now imagine you do a POST followed by a PUT, followed by another POST. 
> > Ideally, we should send a single GET, and then the POST, and then the 
> > PUT, and then the final POST, because we know the PUT will succeed. 
> > However, instead, we will send a GET, a POST, another GET, a PUT, and 
> > then a POST.
> 
> Actually, the idea *was* that the PUT would simply not be allowed. Error 
> flag to "fail" and "detail" to "network". I guess we should revisit 
> that. See below.

Yeah, I don't think it would make sense for it to fail. You can just 
imagine someone doing:

   <?access-control allow="post.example.com all.example.com" method="POST"?>
   <?access-control allow="put.example.com all.example.com" method="PUT"?>

...or some other such weird thing and expecting it to work. Certainly if 
it didn't work it'd be very confusing.


> > I believe we should cache all the methods that are allowed, not just 
> > the methods of the access-control item that was matched.
> 
> Ok, so the idea is to keep "looping" and adding methods to the list when 
> it's ok?

I guess. I'm not sure exactly how it would be implemented.
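
One possible shape for that loop, as a minimal sketch: walk every 
access-control item and collect the method of each item whose allow list 
matches, instead of stopping at the first match. The (allow hosts, method) 
pairs and the exact-host comparison are simplifications assumed here, not 
the spec's data model.

```python
from typing import Iterable, Set, Tuple

# Simplified item: (set of allowed hosts, method). Hypothetical shape.
Item = Tuple[Set[str], str]


def allowed_methods(items: Iterable[Item], origin_host: str) -> Set[str]:
    methods: Set[str] = set()
    for allow_hosts, method in items:
        # Keep looping after a match so every permitted method is cached.
        if origin_host in allow_hosts:
            methods.add(method)
    return methods


# The two processing instructions from the example above.
items = [
    ({"post.example.com", "all.example.com"}, "POST"),
    ({"put.example.com", "all.example.com"}, "PUT"),
]
```

Here allowed_methods(items, "all.example.com") contains both POST and PUT, 
so a PUT following a POST would not need a second authorization request.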


> > Incidentally, you should mention whether the authorization request 
> > cache can have multiple items with the same key. (It seems that it 
> > can.)
> 
> The idea is that you can't. When would this be possible?

In the example above, the way the algorithm is currently defined, it is 
possible for two items to be added with the same key (but with different 
expires headers and methods). The spec doesn't say what should happen if 
you add an item when there is already an item with the same key.
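
To illustrate one way the gap could be closed (not what the spec 
currently says), here is a minimal sketch of a cache with at most one item 
per key. Merging the methods and keeping the earlier expiry are arbitrary 
choices made for this sketch, and the class and method names are 
hypothetical.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, Tuple


class AuthorizationCache:
    """At most one entry per key; a second add merges into the first."""

    def __init__(self) -> None:
        # key -> (expiry time, allowed methods)
        self._entries: Dict[Hashable, Tuple[float, FrozenSet[str]]] = {}

    def add(self, key: Hashable, expires: float, methods: Iterable[str]) -> None:
        new = (expires, frozenset(methods))
        old = self._entries.get(key)
        if old is not None:
            # Assumed policy: union the methods, keep the earlier expiry.
            new = (min(old[0], expires), old[1] | new[1])
        self._entries[key] = new

    def allows(self, key: Hashable, method: str, now: float) -> bool:
        entry = self._entries.get(key)
        return entry is not None and now < entry[0] and method in entry[1]
```

Other policies (replace the old item, or keep separate expiry times per 
method) are just as plausible; the point is only that the spec needs to 
pick one.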

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Monday, 19 November 2007 22:55:39 UTC