Review of http://www.w3.org/TR/2007/WD-access-control-20071126/ from Williams, Stuart (HP Labs, Bristol) on 2007-12-10 (public-appformats@w3.org from December 2007)

From: Williams, Stuart (HP Labs, Bristol) <skw@hp.com>
Date: Mon, 10 Dec 2007 17:57:41 +0000
To: "public-appformats@w3.org" <public-appformats@w3.org>
Message-ID: <9674EA156DA93A4F855379AABDA4A5C60FBCF6BFBD@G5W0277.americas.hpqcorp.net>

I have an action from the TAG to review http://www.w3.org/TR/2007/WD-access-control-20071126/

Please regard the attached as personal comments. The TAG may subsequently choose to support some, all or none of them.

Best regards

Stuart Williams
--
Hewlett-Packard Limited registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England

General:
========
I think that the early part of the document (mostly the introduction) is written in a way that could be understood to suggest that resources rather than their representations are being retrieved.

I am a little sensitive to this from a webarch point of view. In webarch[1] we maintain a distinction between resources and their representations. What is retrieved is *never* the resource, but a representation of its current state. I would rather not regard what is executing in the browser as being the resource originally referenced. My sensitivity on this point is largely to avoid continued confusion in the community that resources themselves are retrieved in general.

At an architectural level we don't have a good term for the 'thing' that runs inside a browser. A browser renders a resource representation retrieved from a resource. This may entail the loading and execution of scripted behaviours. We don't have a good architectural name for these threads of behaviour. They are *not* the resource whose representation induced them into existence, nor are they that representation. I find it difficult to regard the cross-site situation as being one in which the resource whose representation is being rendered is an agent making access attempts to another resource. That said, I can accept it as a shorthand expression that avoids tortured language about it "...being client behaviour induced by access to the first resource which results in access attempts to a second resource."

[1] http://www.w3.org/TR/webarch

Editorial:
==========

1) Re: 1. Introduction

The introduction would benefit from a little more explanation of what a "cross-site" or "cross-domain" (pick just one term) request is.

The opening sentence suggests that HTML img and script elements can result in "cross-site" requests. That leaves me puzzled, unless what it is intended to indicate is that img and script tags can result in the retrieval of scripts (in the case of img, I assume through further references to scripts, say from an SVG image) and the subsequent client-side execution of those scripts can give rise to "cross-site" requests.

Suggest prepending (or words to that effect):

"A cross-site request occurs when a retrieved resource representation results in the loading of scripted client behaviours which, during execution, request access to resources in a different domain from the first resource."

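To make the suggested wording concrete, here is a minimal Python sketch of the comparison it implies. It assumes that "domain" means the (scheme, host, port) triple of a URI, which is an interpretation and not something the draft or the suggested sentence states; the names `origin` and `is_cross_site` are hypothetical.

```python
from urllib.parse import urlsplit

# Assumption: default ports used when a URI does not state one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(uri):
    """Return the (scheme, host, port) triple assumed here to identify a 'domain'."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def is_cross_site(first_uri, requested_uri):
    """True when a script induced by first_uri requests a URI in a different 'domain'."""
    return origin(first_uri) != origin(requested_uri)
```

Under this reading, a script loaded via a representation of http://example.org/page that requests http://other.example/data makes a cross-site request, while one that requests http://example.org:80/data does not.
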
2) Re: 4.3 <?access-control?> PI

The 2nd para has not been fully updated to cover the addition of the "method" pseudo attribute. eg. three->four and the value of a "method" is *not* an "access item".

3) Re: 5.1.1 Generic Cross-site Request Algorithm

Otherwise, let current request URI be the new URI and then
follow these set of steps:

...

2. Otherwise, transparently follow the redirect while
observing the set of request rules.

Suggest adding forward references to 5.1.2 and 5.1.3 on the phrase 'request rules' - I was initially confused about what was being referred to.

Substantive:
============
1) Re: 5.1.3 Cross-site Non-Get Access Request - step 5 Otherwise Clause

In the case of PUT, POST, DELETE the network operation has already taken place - an "access control check" is a bit futile at this point, though it may expose that the access policy has changed. Seems a bit odd to force a fail in this situation, particularly if the network operation has actually succeeded.

2) Re Section 5 Processing Model

This section is very hard to read: partly because the algorithm has a very imperative style - and it would help to have an explicit statement of the intention of the algorithm (more below); partly because of the order in which elements of the algorithm are introduced, e.g. "5.2.1 Shared Algorithms" would be better understood if presented *after* "5.2.2 Access Control Check Algorithm"; partly due to the style of some parts of the algorithm and the use of flags to couple pieces of the algorithm - particularly the shared algorithms at 5.2.1, which have steps that say "...go to the next overall set of steps." or "Terminate the overall algorithm and... ", which are first read with no sense of the overall algorithm from which they are invoked.

On the intention of the algorithm: I intuit it to be the following, based on reading its description:

1) For a given allow or deny access rule:
the set of allowed or denied request URIs is:
a) the union of all those URIs which match one or more allow or deny patterns, 'minus'
b) the set of URIs that match one or more of any exclude pattern that is present.

2) Allow access is by method - for a given method the allow set is the union of all allow rules
which cite that method (ie. exclusions are localised to each rule) - arising from
either access-control headers or embedded access-control PIs.

3) Deny access is method independent: the overall set of denied request URIs is the union of all
such sets arising from either access-control headers or embedded access-control PIs.

4) Access denial takes precedence: if a request URI is present in both the overall deny and
the relevant method-specific allow sets, the access is denied.

5) Rule ordering and partitioning between HTTP headers and embedded PIs is irrelevant to
the result of the algorithm.

Note: the operation of the algorithm as described checks set membership in an intensional
way through pattern matching rather than in an extensional manner (by enumerating members).

I think this is a correct statement of the intent of the algorithm. If that is indeed the case it is a basis on which test cases may be specified.
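
As an illustration of how points 1) to 5) could be turned into test cases, here is a minimal Python sketch of the stated intent. It is not the algorithm from section 5 of the draft: the `Rule` type and the `rule_matches` and `access_allowed` functions are hypothetical names, and shell-style wildcard matching via `fnmatchcase` is an assumed stand-in for the draft's own pattern syntax.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    """One allow or deny rule, whether it came from a header or a PI (hypothetical type)."""
    kind: str             # "allow" or "deny"
    patterns: tuple       # patterns to match against the request URI
    excludes: tuple = ()  # exclude patterns, local to this rule (point 1b)
    methods: tuple = ()   # methods cited by an allow rule; unused for deny rules


def rule_matches(rule, uri):
    """Point 1: URIs matching any pattern, minus those matching any exclude pattern."""
    included = any(fnmatchcase(uri, p) for p in rule.patterns)
    excluded = any(fnmatchcase(uri, p) for p in rule.excludes)
    return included and not excluded


def access_allowed(rules, uri, method):
    """Points 2 to 4: per-method allow union, method-independent deny union, deny wins."""
    allowed = any(
        r.kind == "allow" and method in r.methods and rule_matches(r, uri)
        for r in rules
    )
    denied = any(r.kind == "deny" and rule_matches(r, uri) for r in rules)
    return allowed and not denied
```

Because the result is built only from unions over the rule list, reordering the rules or moving them between headers and PIs cannot change the outcome, which is point 5).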

Also, in large part it then serves as an expression of what the access control check is, and ANY algorithm which satisfies those intentions would do - in fact, in large part it obviates the need to articulate any particular algorithm in section 5.

Received on Monday, 10 December 2007 18:00:12 UTC