- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Thu, 21 May 2009 19:44:15 +0200
- To: Anne van Kesteren <annevk@opera.com>
- CC: Dan Connolly <connolly@w3.org>, www-tag@w3.org
Anne van Kesteren wrote:
> On Thu, 21 May 2009 19:05:52 +0200, Julian Reschke
> <julian.reschke@gmx.de> wrote:
>> I think when we discussed this last October, Larry and several others
>> (including myself...) pointed out that the additional complexity as
>> compared to IRIs (RFC3987) can easily be layered *above* IRI, mapping
>> HTML5-references to IRIs by just by stating:
>
> Just for the record, around the same time I pointed out that this could
> not work because of Step 1b in section 3.1 of RFC 3987. This may or may
> not be a bug in RFC 3987, but it is most definitely an issue.

I apologize that I keep forgetting this issue; for the record it is this
one:

   b. If the IRI is in some digital representation (e.g., an
      octet stream) in some known non-Unicode character encoding,
      convert the IRI to a sequence of characters from the UCS
      normalized according to NFC.

-- <http://tools.ietf.org/html/rfc3987#section-3.1>

...which is weird, because the normalization is only enforced on
non-Unicode encodings. Seems this needs to be discussed in the context
of IRIbis.

>> 1) non-IRI characters found in the query part are encoded using the
>> document's character encoding, then percent-escaped (*)
>
> In addition, for this to work you'd have to define how to get the "query
> part" first.

The part between the first "?" and the first "#", as far as I can tell.

> ...

BR, Julian
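
For illustration only, the two points above can be sketched in Python. This is a rough sketch under stated assumptions, not what RFC 3987 or HTML5 specify: the function names are hypothetical, "non-IRI characters" is approximated as "any non-ASCII character", and an encoding is treated as a Unicode encoding simply when its Python codec name starts with "utf-".

```python
import codecs
import unicodedata


def to_ucs(data: bytes, encoding: str) -> str:
    """Decode octets roughly as Step 1b of RFC 3987 section 3.1 reads:
    NFC is applied only when the source encoding is not a Unicode one.
    (Assumption: "Unicode encoding" means the codec name starts with "utf-".)"""
    text = data.decode(encoding)
    if codecs.lookup(encoding).name.startswith("utf-"):
        return text  # Unicode encodings are passed through unnormalized
    return unicodedata.normalize("NFC", text)


def split_query(ref: str):
    """Return (before, query, fragment); the query is the part between the
    first "?" and the first "#". Missing parts are returned as None."""
    head, hash_sep, fragment = ref.partition("#")
    before, q_sep, query = head.partition("?")
    return before, (query if q_sep else None), (fragment if hash_sep else None)


def escape_query(query: str, doc_encoding: str) -> str:
    """Encode each non-ASCII character with the document's character
    encoding, then percent-escape the resulting octets.
    (Assumption: non-ASCII stands in for "non-IRI characters".)
    Raises UnicodeEncodeError if a character is not representable."""
    out = []
    for ch in query:
        if ord(ch) < 0x80:
            out.append(ch)
        else:
            out.append("".join(f"%{b:02X}" for b in ch.encode(doc_encoding)))
    return "".join(out)
```

With these definitions, the same decomposed "e" plus combining acute accent stays decomposed when it arrives as UTF-8 but is composed to U+00E9 when it arrives in a legacy encoding such as windows-1258, which is the asymmetry noted above. Likewise, escape_query() yields different octets for the same query depending on the document's encoding.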
Received on Thursday, 21 May 2009 17:45:01 UTC