RE: Review of http://www.w3.org/TR/2007/WD-access-control-20071126/ from Williams, Stuart (HP Labs, Bristol) on 2007-12-12 (public-appformats@w3.org from December 2007)

From: Williams, Stuart (HP Labs, Bristol) <skw@hp.com>
Date: Wed, 12 Dec 2007 11:21:23 +0000
To: Anne van Kesteren <annevk@opera.com>, "public-appformats@w3.org" <public-appformats@w3.org>
Message-ID: <9674EA156DA93A4F855379AABDA4A5C60FBCFCDE70@G5W0277.americas.hpqcorp.net>
Hello Anne,

Clarifications and responses to questions in-line below:

> -----Original Message-----
> From: Anne van Kesteren [mailto:annevk@opera.com]
> Sent: 11 December 2007 17:26
> To: Williams, Stuart (HP Labs, Bristol); public-appformats@w3.org
> Subject: Re: Review of
> http://www.w3.org/TR/2007/WD-access-control-20071126/
>
> On Mon, 10 Dec 2007 18:57:41 +0100, Williams, Stuart (HP
> Labs, Bristol)
> <skw@hp.com> wrote:
> > I have an action from the TAG to review
> > http://www.w3.org/TR/2007/WD-access-control-20071126/
> >
> > Please regard the attached as personal comments. The TAG may
> > subsequently choose to support some, all or none of them.
>
> Please regard the responses below as responses of the WAF WG.
> The WG may override my responses in subsequent discussion. (:-), though
> for real.)
> There are some questions included as well.
>
>
> > I think that the early part of the document (mostly the introduction) is
> > written in a way that could be understood to suggest that resources
> > rather than their representations are being retrieved.
> >
> > [... about http://www.w3.org/TR/webarch ...]
>
> I tried making it more clear by talking about "Web page" and "image" and
> "data of a resource" which seems more in line with AWWW while remaining
> relatively easy to read and understandable to myself. :-) I hope that
> helps.

I'll review again in due course.

The trouble with terms like "web page" and "image", when one is trying to speak with some precision, is that, at least for a "web page", there are three things that could be meant:

1) What is rendered to the user on the screen - a visual presentation of what was transferred over the net.
2) The bits transferred (an HTML serialisation) - a webarch:Representation.
3) The web page in an abstract sense, a webarch:Resource, which could have multiple representations (of the same page - .PS, .PDF, .HTML....).

It is easy to write narrative where it is not clear whether one is speaking of a "web page" as a webarch:Representation kind of thing or as a webarch:Resource kind of thing, and it is easy to slide between the two senses. /me has done it.

> > The introduction would benefit from a little more explanation of what a
> > "cross-site" or "cross-domain" (pick just one term) request is.
>
> I added such an explanation. I also added it to the abstract.

OK... thanks... I'll review in due course.

> > The opening sentence suggests that HTML img and script elements can
> > result in "cross-site" requests. That leaves me puzzled, unless what it
> > is intended to indicate is that img and script tags can result in the
> > retrieval of scripts (in the case of IMG I assume through further
> > references to scripts, say from an SVG image) and the subsequent
> > client-side execution of those scripts can give rise to "cross-site" requests.
>
> The idea is that a Web page on domain A can use an image on domain B by
> means of <img>. The draft now clearly states this.

Ok... though I guess that has always been the case with the web - eg. Norm regularly uses flickr hosted images in his blog pages.

I wasn't aware of there being enforced cross-site restrictions in such usages or that they are a particular design centre for this work.

> > Suggest pre-pending (or wteo):
> >
> > "A cross-site request occurs when a retrieved resource representation
> > results in the loading of scripted client behaviours which, during
> > execution, request access to resources in a different domain from the
> > first resource."
>
> Cross-site requests in the draft are not restricted to scripting. For
> instance, attaching cross-site XBL bindings does not have to involve
> scripting.

I'm afraid I am largely ignorant of XBL :-(, but I take it as a way of binding things (including behaviours) to the presentation of a document/web-page.

I don't understand how a client would discriminate between references that it would subject to access controls and those that it would not.

> > 2) Re: 4.3 <?access-control?> PI
> >
> > The 2nd para has not been fully updated to cover the addition of the
> > "method" pseudo attribute. eg. three->four and the value of a "method"
> > is *not* an "access item".
>
> Actually, it is. It talks about three attributes of type X and one
> attribute of type Y.

Mostly my bad re counting (doh!), though:

        "If an attribute is specified it must at least contain an access item."

could be a little tighter since the "method" pseudo attribute does not convey an "access item".

> > 3) Re: 5.1.1 Generic Cross-site Request Algorithm
> >
> >         Otherwise, let current request URI be the new URI and then
> >         follow these set of steps:
> >
> >         ...
> >
> >         2. Otherwise, transparently follow the redirect while
> >          observing the set of request rules.
> >
> > Suggest adding forward references to 5.1.2 and 5.1.3 on the phrase
> > 'request rules' - I was initially confused about what was being referred
> > to.
>
> Would moving this down help? Similarly to your suggestion for the
> processing model algorithm?

I think so... ie. 'unfolding' the algorithm from its top level down to its leaves.

> > Substantive:
> > ===========
> > 1) Re: 5.1.3 Cross-site Non-Get Access Request - step 5
> Otherwise Clause
>
> Step 5 is not an otherwise clause. I'll assume that was a mistake.

I was referring to the "Otherwise" clause at the *end* of step 5. The para immediately before the section 5.2 heading:

"Otherwise
        Perform an access control check using the request method as method. If it returns "fail"
        remove the cache entry, then terminate this algorithm, and return with the status flag
        set to "network". Otherwise, if it returns "pass", terminate this algorithm and return
        with the status flag set to "success". Do not actually terminate the request."

> > In the case of PUT, POST, DELETE the network operation has already taken
> > place - an "access control check" is a bit futile at this point, though
> > it may expose that the access policy has changed. Seems a bit odd to
> > force a fail in this situation, particularly if the network operation
> > has actually succeeded.
>
> The access policy protects the information that is part of the response.
> The request itself is protected by the authorization request that is made
> before this one (or the authorization request cache).

FWIW I think I'd agree that, except around the time of a change in access policy, the algorithm should never reach this point. The authorization check should prevent the networked operation happening in the first place.

> > 2) Re Section 5 Processing Model
> >
> > This section is very hard to read: partly because the algorithm has a
> > very imperative style - and it would help to have an explicit statement
> > of the intention of the algorithm (more below); partly because of the
> > order in which elements of the algorithm are introduced eg. "5.2.1
> > Shared Algorithms" would be better understood if presented *after*
> > "5.2.2 Access Control Check Algorithm"; partly due to the style of some
> > parts of the algorithm and the use of flags to couple pieces of the
> > algorithm - particularly the shared algorithms at 5.2.1 which have steps
> > that say "...go to the next overall set of steps." or "Terminate the
> > overall algorithm and... " which are first read with no sense of the
> > overall algorithm from which they are invoked.
>
> Yes, I agree it is kind of tricky. I can move the section around if that
> would make it better and maybe include your steps below as an informative
> guide to the algorithm. Would that help?

I think so... the numbered 'statements' that I offered were (my) attempt to
capture what the algorithm is designed to accomplish - ie. they are intended (roughly)
as a declarative (though possibly flawed) expression of what the algorithm is
supposed to do rather than a set of procedures to follow - ie. they are intended as
factual or truthful assertions about the algorithm, not an imperative process.

Personally, I think that such an expression (corrected if necessary to accurately
capture the design intent) is a powerful tool. It can provide a basis on which to
create test cases and to judge whether access should/should not be denied so that
the algorithm itself as well as its implementation can be road-tested. In fact,
without such an expression there really is no way to judge whether the algorithm
does what is required of it (because it isn't stated).
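
To illustrate the sort of thing I mean, here is a rough sketch (not taken from the draft) of how the numbered statements quoted below might be written down and tested directly. The Rule class, the check function, the example URIs and the use of shell-style globs for patterns are all assumptions made for illustration; the draft's real access item syntax is different and is not modelled here.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from itertools import permutations


@dataclass(frozen=True)
class Rule:
    """Hypothetical stand-in for one allow or deny rule."""
    kind: str                                  # "allow" or "deny"
    patterns: tuple                            # glob patterns (assumed syntax)
    exclude: tuple = ()                        # exclusions local to this rule
    methods: frozenset = frozenset({"GET"})    # only consulted for allow rules


def rule_matches(rule, uri):
    # Statement 1: (union of pattern matches) minus (union of exclude matches).
    hit = any(fnmatchcase(uri, p) for p in rule.patterns)
    excluded = any(fnmatchcase(uri, p) for p in rule.exclude)
    return hit and not excluded


def check(rules, uri, method):
    # Statement 3: deny is method independent.
    denied = any(r.kind == "deny" and rule_matches(r, uri) for r in rules)
    # Statement 2: allow is per method.
    allowed = any(
        r.kind == "allow" and method in r.methods and rule_matches(r, uri)
        for r in rules
    )
    # Statement 4: denial takes precedence.
    return "pass" if allowed and not denied else "fail"


EXAMPLE_RULES = [
    Rule("allow", ("http://*.example.org",),
         exclude=("http://evil.example.org",),
         methods=frozenset({"GET", "POST"})),
    Rule("deny", ("http://*.bad.example.org",)),
]

# Statement 5: the result should not depend on rule ordering.
for ordering in permutations(EXAMPLE_RULES):
    assert check(list(ordering), "http://a.example.org", "GET") == "pass"
    assert check(list(ordering), "http://x.bad.example.org", "GET") == "fail"
```

Any implementation of the real algorithm could then be run against assertions of this kind, whatever its internal structure.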

> > On the intention of the algorithm: I intuit it to be the following,
> > based on reading its description:
> >
> > 1) For a given allow or deny access rule:
> >          the set of allowed or denied request URIs is:
> >                  a) the union of all those URI which match
> >                     one or more allow or deny pattern, 'minus'
> >                  b) the set of URIs that match one or more of any
> >                     exclude pattern that is present.
>
> Yes.
>
>
> > 2) Allow access is by method - for a given method the allow
> >    set is the union of all allow rules
> >    which cite that method (ie. exclusions are localised to
> >    each rule) - arising from either access-control headers
> >    or embedded access-control PIs.
>
> Yes.
>
>
> > 3) Deny access is method independent: the overall set of denied request
> >    URIs is the union of all such sets arising from either access-control
> >    headers or embedded access-control PIs.
>
> Yes.
>
>
> > 4) Access denial takes precedence: if a request URI is present in both
> >    the overall deny and the relevant method specific allow sets, the access
> >    is denied.
>
> Yes.
>
>
> > 5) Rule ordering and partitioning between http headers and embedded PIs
> >    is irrelevant to the result of the algorithm
>
> For XML documents, yes.
>
>
> > Note: the operation of the algorithm as described checks set membership
> > in an intensional way through pattern matching rather than in an extensional
> > manner (by enumerating members).
>
> I'm sorry, I don't quite follow this comment.

Ok... it was only a note wrt my expression of the algorithm's intention.

Consider the set whose members are even positive integers less than 20. I can express that set in two ways:

        Intensional: {x | 0 < x < 20 ^ even(x)}
        Extensional: { 2, 4, 6, 8, 10, 12, 14, 16, 18 }

Wrt the above: the tone of what I wrote feels like I am thinking of the sets of URIs in an extensional way - that at some stage the sets are enumerated, whereas that is not in fact the case. The patterns in access items are an intensional mechanism in that they provide a means to test for membership rather than a means to enumerate all the members.
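
As a small sketch of the same distinction in code (the function names and the glob-style pattern are assumptions for illustration only, not the draft's access item syntax):

```python
from fnmatch import fnmatchcase


def is_member(x: int) -> bool:
    """Intensional: membership is decided by a test; nothing is enumerated."""
    return 0 < x < 20 and x % 2 == 0


# Extensional: every member is listed explicitly.
EXTENSIONAL = {2, 4, 6, 8, 10, 12, 14, 16, 18}

# The two definitions agree over any finite range we care to probe.
assert {x for x in range(-5, 30) if is_member(x)} == EXTENSIONAL


def host_matches(pattern: str, host: str) -> bool:
    """A pattern is intensional in the same way: it can test a given host
    for membership but never lists every host that would match."""
    return fnmatchcase(host, pattern)
```

The set of hosts matching a pattern such as "*.example.org" is unbounded, so an intensional test is the only practical option there.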

It's probably not an important note... particularly if it adds confusion to any explanation that you could choose to usefully include.

> > I think this is a correct statement of the intent of the algorithm. If
> > that is indeed the case it is a basis on which test cases may be
> > specified.
> >
> > Also, in large part it then serves as an expression of what the access
> > control check is and ANY algorithm which satisfies those intentions
> > would do - in fact in large part it obviates the need to articulate any
> > particular algorithm in section 5.
>
> --
> Anne van Kesteren
> <http://annevankesteren.nl/>
> <http://www.opera.com/>

Thanks,

Stuart Williams
--
Received on Wednesday, 12 December 2007 11:27:19 UTC