- From: Thomas Roessler <tlr@w3.org>
- Date: Fri, 8 May 2009 15:52:48 +0200
- To: Robin Berjon <robin@berjon.com>
- Cc: public-webapps WG <public-webapps@w3.org>
On 7 May 2009, at 13:47, Robin Berjon wrote:
> Hi Thomas,
>
> On May 2, 2009, at 13:31 , Thomas Roessler wrote:
>> 1. What does "access to network resources" mean? Does this refer
>> to the use of inline resources, stylesheets, images,
>> XMLHttpRequest, form submissions, some of these, all of these?
>> More precisely, does this apply to (a) causing GET requests (inline
>> resources, stylesheets, ...), (b) reading the results of GET
>> requests (XHR), (c) causing POST requests (forms, XHR)?
>
> It is any access to any resource that requires a network connection,
> irrespective of the type of resource, the operation, etc. I'm
> clarifying.
Following up on the discussion on yesterday's call, we have at least
the following choices:
1. The HTML5 security model (as I'll call it by abuse of language)
applies, and that includes access to inline resources. Choosing a
random origin also means that XMLHttpRequest needs additional
authorization; that authorization could be granted through an access
element.
2. The HTML5 security model, but with additional restrictions on
network access. In other words, *if* network access is permissible,
then scripts and frames behave as they would in html5; if it isn't,
external resources won't be loaded inline.
>
>> 2. The use of "URI" as an attribute name is misleading, since the
>> value of that attribute is actually a pattern.
>
> We're switching to @pattern.
>
>> 3. The formal description of the attribute's value space is defined
>> by reference to the valid URI token (or IRI token) productions in
>> RFCs 3986 and 3987. Works for me (TM).
>>
>> Unfortunately, some additional considerations apply for IRI
>> references: The mapping between arbitrary Unicode character
>> sequences and A-labels ("xn--...") turns out to be sufficiently
>> brittle that the only host name sequences you want to use are U-
>> labels (the subset of non-ASCII labels for which ToUnicode and
>> ToASCII round-trip). Comparison of IDNs is defined on the level of
>> the A-label ("xn--"), and shouldn't occur on the Unicode level.
>> Take a look at the latest POWDER drafts for another WG that recently
>> grappled with the problem. Also, be clear what kind of
>> normalization is applied to the path and query string components
>> before comparison. How do you deal with % encoding? (Again, see
>> POWDER -- they're doing the right thing in their latest iteration.)
>
> I take it you're talking about POWDER Grouping? Is there a specific
> section that you think we should find inspiration from (it hurts my
> head a little...)? Would you recommend referencing it outright?
Sorry for the obscure reference. Yes, I was talking about powder-
grouping, but not the published Working Draft; I'll pony up a pointer.
Meanwhile, the important pieces:
- you want to % decode all unreserved characters (look up "unreserved"
in RFC 3986)
- you want to generate the ASCII version of the host part of the IRI
reference (i.e., the xn--... version)
- then, do an ASCII case insensitive comparison of the host part, and
a character by character comparison of the rest
An additional consideration here: The above remarks work for the http
and https URI schemes. They don't necessarily work for other schemes.
What is the intended scope of the pattern attribute with respect to:
(a) additional schemes
(b) the distinction between http and https?
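The three comparison steps listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not text from any draft: the function names are made up, Python's built-in "idna" codec (IDNA 2003) stands in for the ToASCII step, and userinfo and fragments are simply ignored.

```python
import re
from urllib.parse import urlsplit

# RFC 3986 "unreserved" set: ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)
_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_unreserved(text):
    """Decode %XX escapes only when they encode an unreserved character."""
    def repl(match):
        ch = chr(int(match.group(1), 16))
        return ch if ch in UNRESERVED else match.group(0)
    return _PCT.sub(repl, text)


def ascii_host(host):
    """Return the ASCII (xn--...) form of a host name, lowercased.

    Assumption: the stdlib "idna" codec is an acceptable stand-in for
    the ToASCII operation in this sketch.
    """
    return host.encode("idna").decode("ascii").lower()


def comparison_key(iri):
    """Build a tuple that can be compared with == (hypothetical helper)."""
    parts = urlsplit(iri)
    if parts.scheme not in ("http", "https"):
        # The remarks above are only claimed to hold for http and https.
        raise ValueError("sketch only covers http and https")
    return (
        parts.scheme,
        ascii_host(parts.hostname or ""),  # ASCII, case-insensitive
        parts.port,
        decode_unreserved(parts.path),     # compared character by character
        decode_unreserved(parts.query),
    )
```

With this sketch, a reference written with a U-label and an escaped "~" gets the same key as its A-label form, while "%2F" stays escaped because "/" is not unreserved.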
>> 4. How do you deal with trailing slashes?
>
> The path component is just a string; it has no structure. If it has
> a trailing slash, then only access to paths that begin with that
> path including its trailing slash is granted.
ok
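For the record, my reading of that rule is a plain string-prefix test. A minimal sketch, with a made-up function name and assuming both paths have already been normalized the same way:

```python
def path_granted(pattern_path, request_path):
    """Treat the pattern path as an opaque string prefix (hypothetical helper)."""
    return request_path.startswith(pattern_path)


# A trailing slash in the pattern is significant:
print(path_granted("/maps/", "/maps/tiles/1.png"))  # True
print(path_granted("/maps/", "/maps"))              # False
# Without the slash the prefix has no notion of path segments:
print(path_granted("/maps", "/mapsearch"))          # True
```

The last case is the consequence of "no structure": a pattern path without a trailing slash also covers sibling paths that merely share the prefix.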
>
>> 5. What is the use case for the wildcard mechanism? As I noted
>> before [*], the wildcard mechanism makes it fairly easy to scan
>> large network segments by inventing host names on the fly. I'd
>> prefer to simply drop that mechanism for the moment and keep things
>> really simple for v1. If that's not an option, can we please
>> define separate attribute names for patterns that imply access to
>> the entire network and patterns that imply access to resources at a
>> single host name only?
>
> The use case is many services (e.g. Google Maps) that serve from
> unpredictable subdomains, like www17.example.com or
> foo4.bar20.baz32.example.org.
>
> Is your proposal to have a separate attribute like
> subdomains="true"? In some ways I see how it could be clearer, but I
> don't really see how it changes the issue?
That's not precisely my proposal, but it would address the concern.
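To make the difference concrete, here is a rough sketch of host matching with a boolean flag along the lines of the subdomains="true" idea. The function and parameter names are purely illustrative and nothing here is taken from a draft; hosts are assumed to be in ASCII form already.

```python
def host_granted(pattern_host, request_host, subdomains=False):
    """Match a single host name, or any subdomain of it when explicitly asked.

    Hypothetical helper: both arguments are assumed to be ASCII host names.
    """
    pattern_host = pattern_host.lower()
    request_host = request_host.lower()
    if request_host == pattern_host:
        return True
    # Require a label boundary so "badexample.com" does not match "example.com".
    return subdomains and request_host.endswith("." + pattern_host)
```

Here host_granted("example.com", "www17.example.com") is refused unless subdomains=True is passed, so the broader grant has to be spelled out rather than implied by the pattern.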
Received on Friday, 8 May 2009 13:52:58 UTC