Re: Review of 22 Jan editor's draft from Anne van Kesteren on 2008-01-27 (public-appformats@w3.org from January 2008)

From: Anne van Kesteren <annevk@opera.com>
Date: Sun, 27 Jan 2008 15:38:48 +0100
To: "Thomas Roessler" <tlr@w3.org>
Cc: "WAF WG (public)" <public-appformats@w3.org>
Message-ID: <op.t5leyyns64w2qv@annevk-t60.oslo.opera.com>
Many thanks for your comments! Some inline responses below with questions  
for you and the rest of the WG...


On Wed, 23 Jan 2008 19:21:10 +0100, Thomas Roessler <tlr@w3.org> wrote:
> My notes about last night's editor's draft...
>
> - The abstract is somewhat hard to understand.

I reworded it.


> - Introduction: I find much of the introduction chapter somewhat
>   disorganized.  I'd like the document to start out by saying rather
>   precisely what's going on, along these lines:

Thanks, I integrated large chunks of the suggested text.


> - Going on in the current introduction text, the specification
>   doesn't define an access control policy, but an access control
>   policy framework.  The use cases and requirements should move up
>   into the introduction, or at least close to it.

The introduction points to the use cases and requirements. I don't think  
they should go all the way up personally.


>   The Note about
>   form.submit() belongs somewhere into the design FAQ or security
>   considerations.

It was already in the FAQ so I removed the note.


>   The access control policy is *not* "defined in
>   the resource" (except for XML documents).

Reworded this somewhat.


>  "The client is trusted"
>   is an awfully broad statement.  The text "The resource would look at
>   follows" is followed by a snippet from an HTTP transaction.

Fixed, I think.


> - Conformance criteria: The document says awfully little in terms of
>   "a specification that wants to use this framework needs to do the
>   following things", even though section 2 claims so.

The specification has various requirements on hosting specifications. I  
made this more clear by introducing the term "hosting specification".


> - "User agents MAY optimize..." is besides the point.  Instead: "User
>   agents MAY employ any algorithm to implement this specification
>   that leads to the exact same results as the algorithms included in
>   this specification."

Fixed.


> - There's a lot of very detailed stuff about white space separated
>   lists going on in section 2; I'd rather see this dropped, and
>   grammars and useful language used closer to the parsing steps.

I rather not try to change this at this point at the risk of introducing  
errors.


> - The security considerations should ideally be a discussion of
>   security effects, i.e., "can trigger GET, and here's why this is
>   harmless"; "we care about POST, because".  Instead, there's a lot
>   of normative material clumped together in this section that would
>   better go to the places where actual processing is described.

I don't really see how. The actual security implications of the processing  
defined by this specification is defined in the processing sections  
already.


> - "Authors sharing content with domains that are on shared hosting
>   environments" rather misses the point by just talking about ports:
>   Namely, that -- because we assume that the protocol / technology
>   that hosts the access-control framework uses a same-origin policy
>   -- authorizations can only be given with a granularity of origins.
>   Anything below that is futile.

Origin includes ports (and schemes).


> - "evil" applications don't really have a place here, maybe talk
>   about an attacker.  "Authors SHOULD ensure that GET ..." is
>   re-stating HTTP; that should be rephrased as an admnishment to
>   adhere to the HTTP spec's semantics.

Fixed.


> - "Authors are encouraged to check the Referer-Root HTTP header" --
>   this should be somewhere in the processing model, not a side
>   remark in the security considerations.  It *is* an additional
>   policy enforcement point, and should be called out clearly.

The processing model is not for authors.


> - The design seems a bit inconsistent about IDNs: The syntax permits
>   them, but HTTP doesn'tl the latter is called out in a note.  I'd
>   rather see that done consistently.  When speaking about IDNs, it
>   might be useful to adapt the A-label and U-label terminology from
>   this I-D:
>  http://tools.ietf.org/html/draft-klensin-idnabis-issues-05

Do you have any concrete suggestions?


> - "If the scheme omitted it will match" is normative language, but
>   looks as though it's formatted as a note.  Or maybe I'm just
>   confused about the formatting.  Oh, and the grammar is wrong.

That was actually an informative statement, but I removed it as it was  
indeed confusing.


> - The Access-Control production continues to use comma-separated
>   method identifiers.  Also, shouldn't there be at least one method
>   given?

This was fixed some time ago. If "method" is present a method must be  
given. Otherwise not.


> - "In case resources on a domain are not in control..." mixes a use
>   case and processing rules into the middle of a syntax description,
>   and is generally quite a mess. Please make a pass through the
>   document to give it a useful structure.
>
> - '"allow rules" can be used to allow read access ...' sounds like a
>   remnant from the old voice browser spec.  At this point, I believe
>   tha the syntax description should limit itself to describing a
>   (multivalued) mapping from authorized origins to methods, with the
>   specific exception that GET is used to generically determine
>   access to the data returned, no matte what method was used to
>   retrieve these data.
>  (Incidentally, that's a point that is going to confuse policy
>   authors without end.  Maybe we need something different here.)

Both fixed.


> - '"method" rule' is oddly phrased.

I couldn't figure out where this was referring to.


> - 4.4 says what the syntax of the Referer-Root header is.  It would
>   be useful to point out here when that header is transmitted.  In
>   particular, "in case the Referer header is not included" makes it
>   sound as though user agents had a choice between these headers.

Fixed.


> - 5.1, cross-site access request.  The English grammar of the first
>   paragraph needs improvement.

Fixed, hopefully.


> - The processing model confuses user agent behavior and input that
>   is given to user agent behavior to be specified elsewhere.  That
>   doesn't make things particularly easy to read.

Tried to clarify things.


> - "The referrer root URI ..." assumes an HTTP-like URI syntax.
>   That's not necessarily present everywhere.  Needs clean-up!

I'm not sure what this comment is referring to. As far as I can tell  
referer root URI is defined as it should be.


> - Much of the processing model is phrased in terms of forward
>   references to generic steps.  I find this pseudo-code like
>   configuration style extremely hard to read, and suspect that it'll
>   make useful security review more difficult than necessary.

Generic steps were before the actual processing model at some point, but a  
set of commenters felt that forward referencing would be better. I'm not  
inclined to revert this change.


> - Why is the authorization request cache mandatory?

Is there a good reason to make it optional?


> - The authorization request cache isn't actually an authorization
>   request cache, but an authorization decision cache.  The current
>   name is confusing at least.

I think the current name is consistent with "authorization request" and I  
rather keep it that way.


> - There is no discussion as to how Vary or Cache-control headers on
>   HTTP responses that were received are handled.  How do these
>   interact with the separate caching model specified here

OPTIONS is not cached so those headers are not relevant.


> - Why does the specification follow redirects upon OPTIONS?  If I
>   read RFC 2616 correctly, then redirects for HTTP methods other
>   than GET and HEAD shouldn't happen without user intervention.

I suppose I could remove "transparent". What do other people think?


>   The current specification material around redirects looks like
>   it's pseudo-code ripped out of context; this needs more work to be
>   comprehensible, and a clear explanation what the expectations are
>   for a hosting specification.  Either the processing model or the
>   security considerations should explain very clearly what tradeoffs
>   a hosting specification faces in specifying any behavior
>   concerning redirects.

What do you have in mind?


> - The access control check algorithm goes to an excrutiating level
>   of detail, while confusing the reader.  It is probably much easier
>   to write up how to parse the various headers into the mapping from
>   origins to methods, and how to deal with that.

At least for me, it wasn't.


> - Once more, we have forward references to generic material,
>   undeclared variables used to pass around information between
>   different sections, and a general lack of readability.  For
>   example, temp method list isn't temporary, not introduced before
>   its first appearance, and only specified in the "allow list check"
>   section.

"temp method list" is introduced well in advance.


> - "parse ... using a streaming XML parser" -- I'm pretty sure you
>   don't mean to prescribe use of a streaming XML parser, but rather
>   want to allow use of one, right?

I do mean to prescribe the use of an XML parser. Otherwise results would  
be inconsistent with different clients picking different strategies.


> - In the allow list check, item 3 of the algorithm looks like it's
>   wrong.  This actually prunes the list of methods that are added to
>   the temp method list depending on the current request's method.

I don't think the specification is wrong here. Could you elaborate?


>   Also, this item has bad grammar.

Is this fixed by the changes from Mike?


> - Having atomic steps like "set the allow access flag to true"
>   (point 5) might be a useful technique in programming.  In English
>   text, it doesn't actually help understand the algorithm.

The algorithm requires careful reading, yes.


> - Starting at item 10 of the access item check algorithm, we go into
>   defining how domain names are parsed and compared.  That can be
>   said in much shorter terms by referring to terms from the relevant
>   specs. Roughly: Origin and item are converted to ASCII.  They are
>   compared string-insensitively, with the additional property that
>   the leftmost label of item might be "*", and can match an
>   arbitrary number of labels.  (Or something like this.)

I rather not make changes unless mistakes are identified at the risk of  
introducing errors.


> - The requirements tend to confuse authentication and authorization.
>   E.g., under 1, you're really talking about deployments that base
>   their authorization decisions exclusively on somebody being on the
>   right side of a firewall.

The requirements recently got changed. Is this still an issue?


> - In the part that talks about cross-site POST, it might be useful
>   to speak of UPNP as a possible target.

I'm not sure what you mean here.


> - "Should not fail to properly enforce security policy..." sounds
>   like a copy of requirement 13 later on.

Requirement 13 is reworded.


> - I continue to disagree with requirement 3, "must be deployable to
>   existing..."; this is highly dependent on the cricumstances of a
>   particular deployment.  I suggest saying clearly what is really
>   meant.  E.g., what abilities should be sufficient in order to
>   deploy the thing -- like, ability to write to XML files.

I'm not sure I understand.


> - requirement 4 (more "easily deploy" that I actually disagree with)
>   only holds for XML content.  Please qualify this requirement.

Requirement 4 is actually not about XML content specifically.


> - req 6 could use some elaboration. The current text could be
>   misread to say that the authorized party should be identified with
>   resource-level granularity, which we know is a bad idea.

This was reworded.


> - req 9 is somewhere between an implementation requirement and a use
>   case.  Strikes me as somewhat wierdly phrased..

This was reworded as well.


> - req 10 effectively says "APIs for cross-site data access
>   shouldn't differ from these for same-origin data access"; I'd
>   suggest changing to that

I changed this.


> - req 12 is badly worded.  I suspect it means "shouldn't break
>   HTTP".  If there's more to it, please express that more clearly.

Jonas?


-- 
Anne van Kesteren
<http://annevankesteren.nl/>
<http://www.opera.com/>
Received on Sunday, 27 January 2008 14:34:53 UTC