- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Thu, 18 Feb 2010 23:34:25 +0100
- To: Maciej Stachowiak <mjs@apple.com>
- CC: Ian Hickson <ian@hixie.ch>, "public-html@w3.org" <public-html@w3.org>
On 18.02.2010 23:23, Maciej Stachowiak wrote:
>
> On Feb 18, 2010, at 2:06 PM, Julian Reschke wrote:
>
>>
>> In this case it would mean removing the special case in step 3 of
>> Section 2 of <http://www.w3.org/html/wg/href/draft>. So, instead of:
>>
>> "If w begins with either of:
>>
>> * a string matching the <scheme> production, followed by "://"
>> * the string "//"
>>
>> then percent-encode any left or right square brackets (U+005B, U+005D,
>> "[" and "]") following the first occurrence of "/", "?", or "#" which
>> follows the first occurrence of "//".
>>
>> Otherwise, percent-encode all left and right square brackets."
>>
>> it would simply be:
>>
>> "Percent-encode all left and right square brackets."
>
> I believe percent-encoding all square brackets will break processing of
> web addresses with an IPv6 IP address as the hostname. It needs to at
> minimum not percent-escape them when they delimit the allowed syntax for
> a URI authority IPv6 address. The percent-escaping is undone in step 6.
> I suspect the URL you mentioned fails only as an accidental side effect
> of trying to handle IPv6 addresses correctly.

Potentially. I'm trying to find out why that special case is there, and
whether it's really needed. After all, we were told "this is how things
work in reality".

As far as I can tell so far, all this double-escaping and un-escaping
mess can be substituted by either

- using the RFC 3986 regexp for parsing
  (<http://greenbytes.de/tech/webdav/rfc3986.html#rfc.section.B>), or

- by expanding the set of allowable characters, as proposed in
  <http://tools.ietf.org/html/draft-ietf-iri-3987bis-00#section-7.2>.

Best regards, Julian
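
To illustrate the first alternative mentioned above, here is a minimal sketch, not the draft's algorithm. It splits a reference with the regular expression from RFC 3986, Appendix B, and percent-encodes square brackets only in the path, query, and fragment, leaving the authority (where an IPv6 literal may appear) untouched. The function names `split_reference`, `encode_brackets`, and `escape_brackets_outside_authority` are hypothetical, and the sketch assumes that leaving the authority component verbatim is the desired behaviour; it does not check that brackets in the authority actually form a valid IPv6 literal.

```python
import re

# Regular expression from RFC 3986, Appendix B. Every part is optional,
# so it matches any input string (possibly with empty components).
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_reference(ref):
    """Split a URI reference into its five RFC 3986 components.

    Components that are absent are returned as None; the path is
    always a string (possibly empty).
    """
    m = URI_RE.match(ref)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


def encode_brackets(text):
    """Percent-encode "[" (U+005B) and "]" (U+005D)."""
    return text.replace("[", "%5B").replace("]", "%5D")


def escape_brackets_outside_authority(ref):
    """Hypothetical helper: encode brackets everywhere except the authority.

    Assumption: the authority is passed through verbatim so that an
    IPv6 literal such as "[::1]" keeps its delimiters.
    """
    parts = split_reference(ref)
    out = ""
    if parts["scheme"] is not None:
        out += parts["scheme"] + ":"
    if parts["authority"] is not None:
        out += "//" + parts["authority"]
    out += encode_brackets(parts["path"])
    if parts["query"] is not None:
        out += "?" + encode_brackets(parts["query"])
    if parts["fragment"] is not None:
        out += "#" + encode_brackets(parts["fragment"])
    return out
```

Because the split happens once, up front, there is no need to escape all brackets and later undo the escaping for the host part; whether this matches what browsers do in practice is exactly the open question in this thread.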
Received on Thursday, 18 February 2010 22:35:24 UTC