- From: Jonas Sicking <jonas@sicking.cc>
- Date: Mon, 21 Jul 2008 00:39:18 -0700
- To: Ian Hickson <ian@hixie.ch>
- Cc: Maciej Stachowiak <mjs@apple.com>, Sunava Dutta <sunavad@windows.microsoft.com>, "annevk@opera.com" <annevk@opera.com>, Sharath Udupa <Sharath.Udupa@microsoft.com>, Zhenbin Xu <Zhenbin.Xu@microsoft.com>, Gideon Cohn <gidco@windows.microsoft.com>, "public-webapps@w3.org" <public-webapps@w3.org>, IE8 Core AJAX SWAT Team <ieajax@microsoft.com>
Ian Hickson wrote:
> On Sun, 20 Jul 2008, Jonas Sicking wrote:
>> Ian Hickson wrote:
>>> On Sat, 19 Jul 2008, Jonas Sicking wrote:
>>>> According to the HTML5 spec space is a valid character inside URLs.
>>> That wasn't intentional -- can you point to where it says that? The HTML5
>>> spec relies on spaces not being allowed in URLs in various places.
>> In section 2.3.2 (Parsing URLs):
>>
>> # Add all characters with codepoints less than or equal to U+0020 or
>> # greater than or equal to U+007F to the <unreserved> production.
>
> This is in the context of:
>
> # 2. Parse url in the manner defined by RFC 3986, with the following
> # exceptions:
>
> It isn't defining what's allowed. What's allowed is defined in the earlier
> section:
>
> # A URL is a valid URL if at least one of the following conditions holds:
> # ...
>
> ...which basically just says it's a valid URL if it's a valid URI or IRI
> (with some caveats in the case of IRIs to prevent legacy encoding
> behaviour from handling valid URLs in a way that contradicts the IRI
> spec). This doesn't allow spaces.

Hmm.. I'm confused. From your and Maciej's answers it sounds like the
algorithm doesn't specify what is valid, only what is parsed. What is the
difference? Is "valid" just what an 'AC header validator' would complain
about?

If so, that doesn't really buy much as far as forwards compatibility goes.
We have to be backwards compatible with what UAs accept, not with what
validators accept.

However, doing something like what Maciej suggests, stopping the URL parser
at the first whitespace character, sounds like it would solve the forwards
compat issue.

That said, if the HTML5 algorithm only considers the same URLs valid as
RFC 3986 does, is there a reason not to point directly to RFC 3986 instead?
It seems like there is no reason to have more relaxed error handling here
than needed.

/ Jonas
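
To make the "valid" versus "parsed" distinction concrete, here is a rough Python sketch of the two behaviours discussed above. It is only an illustration: the function names are hypothetical, the set of whitespace characters is an assumption (the thread does not say which characters would count), and the validity check only tests that every character belongs to the RFC 3986 repertoire, not the full grammar.

```python
import re

# Assumption: "whitespace" means these ASCII characters; the thread
# does not pin down the exact set.
_WHITESPACE = re.compile(r"[ \t\n\f\r]")

# Characters RFC 3986 permits somewhere in a URI: unreserved,
# gen-delims, sub-delims, and "%" for percent-encoding.
# This is a character-level check only, not a grammar check.
_RFC3986_CHARS = re.compile(r"[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]*")


def truncate_at_whitespace(value: str) -> str:
    """Lenient handling: keep only the part before the first whitespace."""
    match = _WHITESPACE.search(value)
    return value if match is None else value[:match.start()]


def has_only_rfc3986_chars(value: str) -> bool:
    """Strict check, roughly what a validator would complain about."""
    return _RFC3986_CHARS.fullmatch(value) is not None


header_value = "http://example.org/a b"
print(has_only_rfc3986_chars(header_value))   # a validator would flag the space
print(truncate_at_whitespace(header_value))   # a UA could still extract a URL
```

With this split, a validator can reject the value while a UA still processes it in a predictable way, which is the forwards-compatibility property in question.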
Received on Monday, 21 July 2008 07:40:51 UTC