Re: Review of http://www.w3.org/TR/2007/WD-access-control-20071126/

(DISCLAIMER: To proactively answer Art's question to another poster earlier -- I am representing myself and myself only.  I am no longer a member of the working group or the W3C for that matter.  For the purposes of this discussion my comments are in no way associated with an employer or any other standards group.  In a past life I represented Tellme Networks, acquired by Microsoft, and was the original chair of the Task Force for Access Control, but I no longer represent Tellme or Microsoft in any way.  In the event there was any confusion, my @yahoo.com is a personal web-based email account and I am in no way affiliated with Yahoo.  Consider me an interested citizen.)

Here are a few comments:

==Process==

I'm not sure hammering on process does a lot to help us move forward.  Certainly this Task Force and Working Group have gone above and beyond in trying to be open and transparent and to seek input into the process and approach.  Late feedback is better than none.

The process was different because the work was initiated based on previous work done in the Voice Browser Working Group, which had existing implementation experience, best practices, and a published NOTE.  Some of the use cases and requirements were presented for W3C Team review in April 2006:

http://www.w3.org/2006/04/27-access-control/Overview.html

The technique employed by the Voice Browser Working Group met a demonstrable market need and was already in use.  The idea of the Task Force was to get that technique onto the W3C process track or risk letting the market define multiple incompatible implementations.  Sadly, the W3C process track doesn't move quickly enough to have prevented exactly that kind of divergence on the JSON front.

==Server vs Client==
 
> * Servers already have to worry about random requests 
> from random sites because it is easy for hackers to 
> manufacture whatever HTTP request they want to and send 
> it to whatever URL they want to 

This isn't the case on the corporate intranet.  Your desktop browser has access to web server resources that I, as a hacker on the public web, don't.

Again, it'd be nice if we all lived in a world where every server protected itself, every server lived on the public internet, and no one used firewalls.  As long as networks still employ border protection instead of node-by-node protection, and browsers still run inside those border-protected networks, this technique has a place.

I agree no one should be relying on border protection, but I don't know of any company that isn't still doing it.

==Meta-Data Required to Do Server Protection==

In order to do server-based protection of any sensitive data, the server needs verifiable credentials from the browser.  The server needs to know:

a) what type of request it is
b) who the "trusted" requester is

The type of request is necessary because different permissions may apply to "by reference" access to a resource versus "read" access.  Today the browser sandbox allows hyperlinking cross-domain because the results of the hyperlink are shared only back to the viewer, not to the referencing application.  The browser doesn't allow "read" access to resources cross-domain.
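
To make the distinction concrete, here is a minimal sketch (the URLs are hypothetical; this is illustrative only):

  // "By reference" is allowed today: the browser will fetch and run a
  // cross-domain script, but the referencing page never sees the raw bytes.
  var s = document.createElement("script");
  s.src = "http://intranet.example.com/widget.js";
  document.getElementsByTagName("head")[0].appendChild(s);

  // "Read" access is not: the same URL via XMLHttpRequest is refused by
  // the same-origin policy, because the calling application would get to
  // inspect the response.
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "http://intranet.example.com/widget.js", true);
  xhr.send(null);  // the browser raises a security error cross-domain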

JSONRequest appears to be using the MIME type as a way to imply the request type.  That seems like a dangerous overloading of functionality.

The "trusted" requester is either in the form of a REFERER header, user cookie, or digital certificate.  User cookie and Digital Certificate support a user->server trust relationship.  If the server doesn't want a per-user relationship, but wants to establish application->server relationship, there isn't a consistent way to do that, particularly because the REFERER header isn't deemed highly trust-worthy. 

Enabling application->server relationships for data-sharing is a helpful construct that will enable much richer applications.  
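
To sketch what a server actually has to work with today (illustrative JavaScript pseudocode; the names are hypothetical, and a real client certificate arrives via the TLS layer rather than as a header):

  // Classify the requester from the evidence the browser sends along.
  function classifyRequester(req) {
    if (req.clientCertificate)      // user->server trust via TLS
      return "user";
    if (req.headers["Cookie"])      // user->server trust via a session cookie
      return "user";
    if (req.headers["Referer"])     // application->server at best, but the
      return "application?";        // header is trivially forged outside a browser
    return "unknown";               // no consistent application identity exists
  }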

==Whitelisting/Blacklisting==

I disagree with the assertion that whitelisting/blacklisting isn't helpful or necessary.  Selective sharing of data is a desirable use case for many existing sites.

Similarly, supporting a combination of user->server and application->server relationships, which is enabled by supporting cookies passed with the request, enables sharing of user data.  This is helpful given that much of the interesting data on the web is user-specific.
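
For reference, the current draft expresses this kind of selective sharing roughly as follows (syntax recalled from my reading of the 26 November 2007 draft; check the spec for the exact grammar):

  <?access-control allow="*.example.org" exclude="public.example.org"?>

or, as an HTTP response header:

  Access-Control: allow <*.example.org> exclude <public.example.org>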

==Intent of the Access-Control Mechanism==

The intent of the task force was not to solve the general problems of browser sandboxing, web security, or presentation of trust-relationship information.  A separate working group was formed to address at least a subset of those issues.  Those issues are still significant and need good solutions.

This task force was intended to be much more tactically focused.  The main objective, again, was to enable cross-domain read access to web resources in a simple, clean way that fit with existing mechanisms and didn't require rearchitecting the web or the browser sandbox from the ground up.

--Brad

Jon Ferraiolo <jferrai@us.ibm.com> wrote:

public-appformats-request@w3.org wrote on 01/09/2008 12:40:37 AM:
 
 > 
 > Jon Ferraiolo wrote:
 > > I guess that PEP = policy enforcement point
 > > 
 > > Regarding PEP in the client, I agree with David Baron that there has to 
 > > be a security policy in the client, and in fact there already is such a 
 > > policy (i.e., the same-domain policy). However, I also agree with David 
 > > Orchard that enforcement of which domains should be allowed to access 
 > > which data (i.e., the policy as expressed in Access Control's processing 
 > > instruction) more naturally belongs on the server, not the client.
 > > 
 > > But I would go further and question the whole approach of listing a set 
 > > of domains that are allowed or denied. I have a hard time seeing which 
 > > workflows make sense from whitelisting or blacklisting discrete domains, 
 > > with the possible exception of "*" (i.e., everyone) or a list of 
 > > subdomains within a company's intranet (although this would be fragile). 
 > > I much prefer the approach in JSONRequest, which does not include a list 
 > > of allowed/denied domains. The JSONRequest approach matches the 
 > > requirements of public/unprotected web services, which I believe is the 
 > > most important use case for cross-domain data access. JSONRequest 
 > > assumes that the server will decide who has access to the data, which 
 > > aligns with David O's recommendation. (But as I have said before, I 
 > > would like to see JSONRequest support XML payloads in addition to JSON 
 > > payloads.)
 > 
 > I don't think something like JSONRequest can ever satisfy the 
 > requirements access-control has.
 
 It's hard to say whether access-control meets its requirements since requirements weren't listed beforehand.
 
 > First off it starts from a very different
 > starting point security-wise,
 
 Without use cases and requirements it is hard to determine what the security strategy is for Access Control. There does seem to be a client-side PEP involved in Access Control, but some people on this list have said that it is better to have an approach where the server decides, not the client. (I agree with the server-side PEP approach over the client-side one.)
 
 > which is that anything that looks like 
 > javascript can be read by any site. This is due to the fact that 
 > browsers currently let you do cross-site loading of any data that is 
 > parseable as javascript.
 
 As you point out, many sites today support JSON-based web services that can be accessed from anywhere on the web. To me, one of the chief virtues of JSONRequest is that it is an incremental improvement over the existing JSON-based workflows in use today. The fact that JSONRequest allows anyone to make a request to any site is OK with me today because:
 
 * That's already the case with today's JSON-based web services, where dynamic SCRIPT can access any site (see the sketch after this list)
 * It is better if the server decides who gets access to data, not the client 
 * Servers already have to worry about random requests from random sites because it is easy for hackers to manufacture whatever HTTP request they want to and send it to whatever URL they want to 
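
 For concreteness, this is the dynamic-SCRIPT pattern in question (a minimal JSONP-style sketch; the URL and callback name are hypothetical):

   // The page declares a callback, then injects a script tag pointing at a
   // cross-domain JSON service; the browser fetches and executes it, so
   // this data is already readable cross-site today.
   function handleData(data) {
     document.title = data.status;  // the page can use the JSON freely
   }
   var script = document.createElement("script");
   script.src = "http://data.example.com/feed?callback=handleData";
   document.getElementsByTagName("head")[0].appendChild(script);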
 
 > 
 > However access-control is trying to solve the problem for other data 
 > types as well, such as HTML and XML. 
 
 It looks straightforward to me to enhance JSONRequest to support XML (and thus XHTML) in addition to JSON. One way would be to say that if the payload is a JavaScript object, then it is JSON up/down the pipe; if it is a string, then it is XML. In either case, the browser must parse the response using an appropriate parser (i.e., JSON or XML) into JavaScript objects. If JSON is used for the data, then only JSON object literals are supported, not the full JavaScript language (that's what the JSON subset of JavaScript is about). On the XML side, the conformance requirement should be that the browser convert the response automatically into JavaScript objects, to prevent web developers from doing an innerHTML assignment on content that might contain a SCRIPT tag. In either case, strict parsing is a conformance requirement and errors are generated if the data is not properly well-formed.
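
 A rough sketch of the dispatch rule I have in mind (a hypothetical wrapper over the JSONRequest proposal; XMLRequest is an invented name, not part of any draft):

   // Choose the wire format from the payload's type.
   function sendCrossDomain(url, payload, done) {
     if (typeof payload === "string") {
       // A string payload is serialized XML; the browser must parse the
       // response with a strict XML parser into JavaScript objects.
       return XMLRequest.post(url, payload, done);  // hypothetical twin API
     }
     // An object payload travels as JSON; only JSON object literals are
     // accepted on the way back, never arbitrary JavaScript.
     return JSONRequest.post(url, payload, done);
   }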
 
 > These data types cannot currently
 > be loaded cross site, which means that there are likely documents out
 > there containing sensitive information which must not be exposed cross 
 > site. And note that not including cookies and other auth credentials 
 > does not solve this problem due to content behind firewalls. While I can 
 > agree that relying solely on the fact that a server lives behind a 
 > firewall is not the best protection strategy, I expect this is currently 
 > in widespread use (it's something I've seen myself a lot) and so not 
 > something that we can ignore.
 > 
 > I would also like to note that JSONRequest also uses the client as a 
 > PEP. JSONRequest uses the Content-Type header to indicate that data 
 > should be accessible cross site and relies on the client to enforce 
 > that. 
 
 I wouldn't call that a PEP. That's just a data integrity feature, not enforcement of who gets access to the data.
 
 > Access-control uses the <?access-control?> PI or the 
 > Access-Control header instead. Which of these two headers are more 
 > secure can of course be debated, but the fact remains that the client is 
 > a PEP in both cases.
 > 
 > I agree that the whitelisting/blacklisting feature does not add that 
 > much value compared to the simple "*" case. 
 
 If whitelisting/blacklisting does not add much value, then why do we have it at all?
 
 > However I do think 
 > especially whitelisting does add some value, and in fact the spec that 
 > access-control originated from had that as a requirement. Also, the 
 > whitelisting/blacklisting in itself so far hasn't received much negative 
 > feedback, but rather the fact that the decision is made by the client.
 
 The lack of strong feedback on the whitelisting and blacklisting might be due to the fact that people have only recently started to provide detailed feedback. Personally, even when roughly skimming early drafts, I felt that the whitelisting and blacklisting approach was fishy, but only since October have I focused on the spec in enough detail to feel that I can comment on it.
 
 Which leads me once again to hammer on the process by which the spec was developed. It is usually more efficient to develop a standard in a manner such that the community can provide feedback at the earliest possible point. One way to make this happen is to develop use cases and requirements upfront, announce them to the community, and then seek feedback. Then it is incumbent on the WG to perform due diligence on existing industry practice against those use cases and requirements to see whether something that works already exists. Failure to take these early steps results in the messy situation happening right now, where people push back on the specification late in its timeline, challenging some of the key assumptions that form the foundation of the spec.
 
 > 
 > I don't see how JSONRequest can be adjusted to work for any content type 
 > without turning it into something that looks a lot like access-control 
 > since it both relies on the content looking like javascript, and uses 
 > the Content-Type header to control access.
 
 See above for my proposal for allowing JSONRequest to support XML in addition to JSON. Also, the Content-Type header is a data integrity feature (i.e., it ensures that the server supports the JSONRequest protocol) and not an access control mechanism.
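
 To illustrate that distinction, here is the JSONRequest exchange as I read Crockford's proposal (the endpoint URL is hypothetical; the callback signature is from the proposal):

   // Both request and response must carry
   // Content-Type: application/jsonrequest, which says "this endpoint
   // speaks the protocol," not "this requester is authorized."
   var requestNumber = JSONRequest.post(
     "http://example.com/service",        // hypothetical URL
     { query: "status" },                 // serialized as JSON
     function (requestNumber, value, exception) {
       if (value) {
         // value is the parsed JSON response object
       } else {
         // exception describes the failure
       }
     }
   );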
 
 > 
 > / Jonas
 > 
 
