- From: Maciej Stachowiak <mjs@apple.com>
- Date: Thu, 18 Feb 2010 14:46:43 -0800
- To: Julian Reschke <julian.reschke@gmx.de>
- Cc: Ian Hickson <ian@hixie.ch>, "public-html@w3.org" <public-html@w3.org>
Received on Thursday, 18 February 2010 22:47:19 UTC
On Feb 18, 2010, at 2:34 PM, Julian Reschke wrote:

> On 18.02.2010 23:23, Maciej Stachowiak wrote:
>
>> I suspect the URL you mentioned fails only as an accidental side effect
>> of trying to handle IPv6 addresses correctly.
>
> Potentially. I'm trying to find out why that special case is there,
> and whether it's really needed. After all, we were told "this is how
> things work in reality".
>
> As far as I can tell so far, all this double-escaping and un-escaping
> mess can be substituted by either
>
> - using the RFC 3986 regexp for parsing (<http://greenbytes.de/tech/webdav/rfc3986.html#rfc.section.B>), or
>
> - by expanding the set of allowable characters, as proposed in <http://tools.ietf.org/html/draft-ietf-iri-3987bis-00#section-7.2>.

I think all the escaping and unescaping is there solely so these algorithms could be written as a layer on top of previous IRI/URI RFCs. I believe it would be better for IRIbis to define error-tolerant Web address parsing directly, rather than via escaping and then applying another algorithm.

The regexp looks reasonable to me but I am not sure if there are mysterious edge cases.

Regards,
Maciej
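
For readers who want to try the regexp-based approach discussed above, here is a minimal sketch in Python. The pattern is the one given in RFC 3986, Appendix B. The helper name `split_uri` and the sample inputs are assumptions chosen for illustration. It only splits a reference into its five components; it does not validate or escape anything, so it says nothing about the edge cases mentioned above.

```python
import re

# Pattern from RFC 3986, Appendix B. Every component is optional,
# so the pattern matches any input string, including malformed ones.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(reference):
    """Split a URI reference into its five components (hypothetical helper).

    Components that are absent come back as None; the path is always
    a string, possibly empty.
    """
    m = URI_RE.match(reference)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


# Example taken from RFC 3986, Appendix B.
parts = split_uri("http://www.ics.uci.edu/pub/ietf/uri/#Related")

# An input with an IPv6 literal and an unescaped space is still split
# without any prior escaping step.
lenient = split_uri("http://[::1]:80/a b?x=1")
```

Because every group is optional, the split is error-tolerant by construction: characters that strict RFC 3986 syntax would reject simply end up inside whichever component they fall in.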