- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Wed, 01 Apr 2009 10:52:15 +0200
- To: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
- CC: Lisa Dusseault <lisa.dusseault@gmail.com>, public-iri@w3.org, Mark Baker <mark@coactus.com>, apps-discuss@ietf.org
Martin J. Dürst wrote:
> I agree with Julian that discussion about updates of RFC 3987 (IRI spec,
> new version currently at draft-duerst-iri-bis-05) should go to
> public-iri@w3.org, whereas bug reports and discussions about RFC 3986
> (URI spec) should go to uri@w3.org.
>
> But Lisa, you are the AD who eventually has to handle anything that
> comes out of these drafts, so I'm glad to follow your directions.
>
> Julian, could you be more specific (pointer is sufficient) about what
> you mean below with "potential issue (normalization)"?
>
> Regards, Martin.

It's the one raised by Anne vK: Section 3.1 requires different IRI->URI behavior depending on the encoding of the source document:

   Step 1.  Generate a UCS character sequence from the original IRI
            format.  This step has the following three variants,
            depending on the form of the input:

      a.  If the IRI is written on paper, read aloud, or otherwise
          represented as a sequence of characters independent of any
          character encoding, represent the IRI as a sequence of
          characters from the UCS normalized according to
          Normalization Form C (NFC, [UTR15]).

      b.  If the IRI is in some digital representation (e.g., an
          octet stream) in some known non-Unicode character encoding,
          convert the IRI to a sequence of characters from the UCS
          normalized according to NFC.

      c.  If the IRI is in a Unicode-based character encoding (for
          example, UTF-8 or UTF-16), do not normalize (see section
          5.3.2.2 for details).  Apply step 2 directly to the encoded
          Unicode character sequence.

It would be helpful to understand where exactly this requirement comes from, and whether we have evidence it's being implemented (or even implementable; the source document encoding may not be known at the moment where the IRI->URI conversion occurs).

BR, Julian
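To make the concern concrete, here is a minimal Python sketch of how the same character sequence could map to two different URIs depending on which variant of step 1 applies. It is only an illustration under stated assumptions: the function name iri_to_uri, the _SAFE set, and the example.org address are hypothetical, and step 2 is reduced to plain UTF-8 percent-encoding (no IDN handling of the host, no checking of the allowed character ranges).

```python
import unicodedata
from urllib.parse import quote

# Assumption: leave URI unreserved and reserved characters, plus "%", unescaped.
_SAFE = "-._~:/?#[]@!$&'()*+,;=%"


def iri_to_uri(chars: str, normalize: bool) -> str:
    """Hypothetical helper: step 1 (optional NFC) plus a simplified step 2."""
    if normalize:
        # Variants a and b: normalize the character sequence to NFC first.
        chars = unicodedata.normalize("NFC", chars)
    # Simplified step 2: percent-encode the UTF-8 octets of non-ASCII characters.
    return quote(chars, safe=_SAFE, encoding="utf-8")


# The same IRI text, written with "e" followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "http://example.org/re\u0301sume\u0301"

# Variant c: the source is already Unicode-based, so no normalization.
uri_unicode_source = iri_to_uri(decomposed, normalize=False)

# Variants a and b: the converter normalizes to NFC before encoding.
uri_normalized_source = iri_to_uri(decomposed, normalize=True)

print(uri_unicode_source)     # http://example.org/re%CC%81sume%CC%81
print(uri_normalized_source)  # http://example.org/r%C3%A9sum%C3%A9
```

The two results are not equal as URIs, and the choice between them depends on the normalize flag, which in practice would have to be derived from the encoding of the source document. That is exactly the piece of information that may no longer be available at conversion time.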
Received on Wednesday, 1 April 2009 08:53:11 UTC