- From: Brian Smith <brian@briansmith.org>
- Date: Wed, 12 Nov 2014 00:40:40 -0800
- To: Anne van Kesteren <annevk@annevk.nl>
- Cc: "public-webappsec@w3.org" <public-webappsec@w3.org>
On Wed, Nov 12, 2014 at 12:08 AM, Anne van Kesteren <annevk@annevk.nl> wrote:
> On Tue, Nov 11, 2014 at 11:36 PM, Brian Smith <brian@briansmith.org> wrote:
>> I think you may be looking at the obsolete version of the spec (RFC
>> 2616). This was fixed (not as completely as I would like) in the new
>> version (RFC 7230).
>
> Not really. RFCs tend to rarely pay attention to the level of detail
> that is required to implement a browser.

I agree with you; that's what I meant by "not as completely as I would
like."

>> http://tools.ietf.org/html/rfc7230#section-3.2.4:
>>
>>    Historically, HTTP has allowed field content with text in the
>>    ISO-8859-1 charset [ISO-8859-1], supporting other charsets only
>>    through use of [RFC2047] encoding. In practice, most HTTP header
>>    field values use only a subset of the US-ASCII charset [USASCII].
>>    Newly defined header fields SHOULD limit their field values to
>>    US-ASCII octets. A recipient SHOULD treat other octets in field
>>    content (obs-text) as opaque data.
>
> Sure, but what does this mean for implementations? E.g. how we handle
> (using \0xXX to denote a byte)
>
>   Location: /\0x80
>
> is surely not going to change, or is it?
>
> As far as I know that needs to be treated *identical* to
>
>   Location: /%80
>
> Now maybe that matches "treat as opaque data", but it does mean that
> \0x80 needs to become U+0080 before being handed to the URL parser (as
> in "the real world" it operates on code points).

If you get a garbage Location like that for anything other than a
redirect, you just ignore it. When you get a garbage Location like that
for a redirect, you probably should just show an error page, though
you'd have to do a survey of browser implementations to know for sure
what to do.

As far as parsing the Content-Security-Policy header is concerned, I
think the CSP specification is generally doing something reasonable for
invalid characters (as defined by the RFC 3986 syntax). In particular,
if the URL isn't a valid RFC 3986 URL then the browser will skip the CSP
directive without ever feeding the URL to the HTML5 URL
normalizer/decoder.

In other words, when processing URLs in HTTP headers, in general you
need to deal with the URL according to RFC 3986 rules at the HTTP level,
and deal with the URL using HTML5 rules at the HTML level. That means,
in particular, that the HTML5 URI parsing/decoding algorithms need to be
able to handle all RFC 3986 URLs, even if such URLs are not possible in
HTML5. And, it also means that there needs to be a way to convert every
HTML5 URL into a valid RFC 3986 URL for the cases where you need to emit
an HTML5 URL in an HTTP request.

I think the main question is whether normalization (including URL
decoding) and comparison should be done in the HTTP layer (using RFC
3986 rules) or in the HTML layer (using HTML5 rules). I believe the
answer, for this particular case, is that normalization and comparison
needs to be done in the HTML layer and not in the HTTP layer, but right
now the CSP spec is wrongly mixing the two.

Cheers,
Brian
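
For illustration only, here is a minimal Python sketch of the two steps discussed in this message: turning header octets into code points (so that \0x80 becomes U+0080), and checking at the HTTP level whether a value uses only characters RFC 3986 permits before it would be handed to an HTML-level URL parser. The function names are hypothetical, the check is character-level only (it does not validate the full RFC 3986 grammar), and nothing here is taken from the CSP spec or from any browser.

```python
import re

# Characters RFC 3986 permits in a URI reference: unreserved, gen-delims,
# sub-delims, plus "%" when it introduces a two-hex-digit escape.
# Assumption: a character-level filter is enough for this illustration.
_RFC3986_CHARS = re.compile(
    r"\A(?:[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=]|%[0-9A-Fa-f]{2})*\Z"
)


def header_bytes_to_code_points(raw: bytes) -> str:
    """Map each header octet to the code point of the same value (ISO-8859-1)."""
    return raw.decode("latin-1")


def uses_only_rfc3986_characters(value: str) -> bool:
    """Return True if every character (or percent-escape) is allowed by RFC 3986."""
    return _RFC3986_CHARS.match(value) is not None


if __name__ == "__main__":
    location = header_bytes_to_code_points(b"/\x80")
    print(repr(location))
    print(uses_only_rfc3986_characters(location))
    print(uses_only_rfc3986_characters("/%80"))
```

Under these assumptions, the raw octet form fails the HTTP-level check, while the percent-encoded form passes and could then be normalized and compared at the HTML level.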
Received on Wednesday, 12 November 2014 08:41:10 UTC