- From: Thomas Roessler <tlr@w3.org>
- Date: Fri, 8 May 2009 15:52:48 +0200
- To: Robin Berjon <robin@berjon.com>
- Cc: public-webapps WG <public-webapps@w3.org>
On 7 May 2009, at 13:47, Robin Berjon wrote:

> Hi Thomas,
>
> On May 2, 2009, at 13:31 , Thomas Roessler wrote:
>> 1. What does "access to network resources" mean? Does this refer
>> to the use of inline resources, stylesheets, images,
>> XMLHttpRequest, form submissions, some of these, all of these?
>> More precisely, does this apply to (a) causing GET requests (inline
>> resources, stylesheets, ...), (b) reading the results of GET
>> requests (XHR), (c) causing POST requests (forms, XHR)?
>
> It is any access to any resource that requires a network connection,
> irrespective of the type of resource, the operation, etc. I'm
> clarifying.

Following up on the discussion on yesterday's call, we have at least
the following choices:

1. The HTML5 security model (as I'll call it by abuse of language)
applies, and that includes access to inline resources. Choosing a
random origin also means that XMLHttpRequest needs additional
authorization; that authorization could be granted through an access
element.

2. The HTML5 security model, but with additional restrictions on
network access. In other words, *if* network access is permissible,
then scripts and frames behave as they would in HTML5; if it isn't,
external resources won't be loaded inline.

>
>> 2. The use of "URI" as an attribute name is misleading, since the
>> value of that attribute is actually a pattern.
>
> We're switching to @pattern.
>
>> 3. The formal description of the attribute's value space is defined
>> by reference to the valid URI token (or IRI token) productions in
>> RFCs 3986 and 3987. Works for me (TM).
>>
>> Unfortunately, some additional considerations apply for IRI
>> references: The mapping between arbitrary Unicode character
>> sequences and A-labels ("xn--...") turns out to be sufficiently
>> brittle that the only host name sequences you want to use are
>> U-labels (the subset of non-ASCII labels for which ToUnicode and
>> ToASCII round-trip).
>> Comparison of IDNs is defined on the level of
>> the A-label ("xn--"), and shouldn't occur on the Unicode level.
>> Take a look at the latest POWDER drafts for another WG that recently
>> grappled with the problem. Also, be clear what kind of
>> normalization is applied to the path and query string components
>> before comparison. How do you deal with % encoding? (Again, see
>> POWDER -- they're doing the right thing in their latest iteration.)
>
> I take it you're talking about POWDER Grouping? Is there a specific
> section that you think we should find inspiration from (it hurts my
> head a little...)? Would you recommend referencing it outright?

Sorry for the obscure reference. Yes, I was talking about
powder-grouping, but not the published Working Draft; I'll pony up a
pointer.

Meanwhile, the important pieces:

- you want to %-decode all percent-encoded unreserved characters
  (look up "unreserved" in RFC 3986)
- you want to generate the ASCII version of the host part of the IRI
  reference (i.e., the xn--... version)
- then, do an ASCII case-insensitive comparison of the host part, and
  a character-by-character comparison of the rest

Additional consideration in here: The above remarks work for the http
and https URI schemes. They don't necessarily work for other schemes.
What's the plan for the scope of the pattern attribute (a) for
additional schemes, and (b) between http and https?

>> 4. How do you deal with trailing slashes?
>
> The path component is just a string; it has no structure. If it has
> a trailing slash, then only access to paths that begin with that
> path including its trailing slash is granted.

ok

>
>> 5. What is the use case for the wildcard mechanism? As I noted
>> before [*], the wildcard mechanism makes it fairly easy to scan
>> large network segments by inventing host names on the fly. I'd
>> prefer to simply drop that mechanism for the moment and keep things
>> really simple for v1.
>> If that's not an option, can we please
>> define separate attribute names for patterns that imply access to
>> the entire network and patterns that imply access to resources at a
>> single host name only?
>
> The use case is many services (e.g. Google Maps) that serve from
> unpredictable subdomains, like www17.example.com or
> foo4.bar20.baz32.example.org.
>
> Is your proposal to have a separate attribute like
> subdomains="true"? In some ways I see how it could be clearer, but I
> don't really see how it changes the issue?

That's not precisely my proposal, but it would address the concern.
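
To make the comparison steps for http/https patterns concrete (%-decode unreserved characters, map the host to its xn-- form, compare the host case-insensitively and the rest character by character), here is a rough Python sketch. It is an illustration only, not text from any draft: the function names are hypothetical, and it assumes Python's built-in "idna" codec (IDNA 2003 ToASCII) is an acceptable stand-in for whatever IDNA processing the spec ends up requiring. The last helper sketches the "path is just a string" prefix rule from point 4.

```python
import re
from urllib.parse import urlsplit

# "unreserved" per RFC 3986, section 2.3: ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)
_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_unreserved(s):
    """Decode %XX only where XX encodes an unreserved character."""
    def repl(m):
        ch = chr(int(m.group(1), 16))
        return ch if ch in UNRESERVED else m.group(0)
    return _PCT.sub(repl, s)


def host_to_ascii(host):
    """Return the A-label (xn--...) form of the host, lowercased.

    Assumption: the built-in "idna" codec (IDNA 2003) is good enough here.
    """
    return host.encode("idna").decode("ascii").lower()


def comparison_key(iri):
    """Build a key for comparing two http/https IRI references.

    Hypothetical helper: host is compared as lowercased ASCII, path and
    query are compared character by character after decoding unreserved
    characters. Scheme and port are kept distinct.
    """
    parts = urlsplit(iri)
    host = host_to_ascii(parts.hostname or "")
    rest = decode_unreserved(parts.path)
    if parts.query:
        rest += "?" + decode_unreserved(parts.query)
    return (parts.scheme, host, parts.port, rest)


def path_granted(pattern_path, request_path):
    """Point 4: the path is an unstructured string, matched as a prefix."""
    return decode_unreserved(request_path).startswith(
        decode_unreserved(pattern_path)
    )
```

With this sketch, `http://EXAMPLE.org/%7Efoo` and `http://example.org/~foo` produce the same key, while `/a%2Fb` and `/a/b` stay distinct because "/" is reserved. Note that http and https keys never compare equal here, which is exactly the open question (b) above rather than an answer to it.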
Received on Friday, 8 May 2009 13:52:58 UTC