Martin J. Dürst wrote:
> I agree with Julian that discussion about updates of RFC 3987 (IRI spec,
> new version currently at draft-duerst-iri-bis-05) should go to
> public-iri@w3.org, whereas bug reports and discussions about RFC 3986
> (URI spec) should go to uri@w3.org.
>
> But Lisa, you are the AD who eventually has to handle anything that
> comes out of these drafts, so I'm glad to follow your directions.
>
> Julian, could you be more specific (pointer is sufficient) about what
> you mean below with "potential issue (normalization)"?
>
> Regards, Martin.

It's the one raised by Anne vK:

Section 3.1 requires different IRI->URI behavior depending on the
encoding of the source document:

   Step 1.  Generate a UCS character sequence from the original IRI
            format.  This step has the following three variants,
            depending on the form of the input:

            a.  If the IRI is written on paper, read aloud, or otherwise
                represented as a sequence of characters independent of
                any character encoding, represent the IRI as a sequence
                of characters from the UCS normalized according to
                Normalization Form C (NFC, [UTR15]).

            b.  If the IRI is in some digital representation (e.g., an
                octet stream) in some known non-Unicode character
                encoding, convert the IRI to a sequence of characters
                from the UCS normalized according to NFC.

            c.  If the IRI is in a Unicode-based character encoding (for
                example, UTF-8 or UTF-16), do not normalize (see section
                5.3.2.2 for details).  Apply step 2 directly to the
                encoded Unicode character sequence.

It would be helpful to understand where exactly this requirement comes
from, and whether we have evidence it's being implemented (or even
implementable; the source document encoding may not be known at the
moment where the URI->IRI conversion occurs).

BR, Julian

Received on Wednesday, 1 April 2009 08:53:11 GMT
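
The three variants of Step 1 quoted above can be sketched in Python roughly as follows. This is only an illustration, not code taken from RFC 3987 or from any known implementation: the function name `step1_ucs_sequence`, its `encoding` parameter, and the `UNICODE_CODECS` set are assumptions made for the example. The sketch also presumes the caller knows the source encoding, which is exactly the condition that may not hold in practice.

```python
import codecs
import unicodedata
from typing import Optional, Union

# Assumption: codec names treated as "Unicode-based" for variant (c).
UNICODE_CODECS = {
    "utf-8", "utf-16", "utf-16-le", "utf-16-be",
    "utf-32", "utf-32-le", "utf-32-be",
}


def step1_ucs_sequence(iri: Union[str, bytes],
                       encoding: Optional[str] = None) -> str:
    """Hypothetical helper: return the UCS character sequence for an IRI."""
    if isinstance(iri, str):
        # Variant (a): characters independent of any encoding -> NFC.
        return unicodedata.normalize("NFC", iri)

    if encoding is None:
        # The case raised in the message: the encoding is not known.
        raise ValueError("source encoding unknown; cannot choose a variant")

    # Canonicalize the codec name so "UTF8" and "utf-8" compare equal.
    codec_name = codecs.lookup(encoding).name
    text = iri.decode(encoding)

    if codec_name in UNICODE_CODECS:
        # Variant (c): Unicode-based encoding -> do not normalize.
        return text

    # Variant (b): known non-Unicode encoding -> convert, then NFC.
    return unicodedata.normalize("NFC", text)


# "e" followed by U+0301 COMBINING ACUTE ACCENT (decomposed form).
decomposed = "e\u0301"
from_paper = step1_ucs_sequence(decomposed)
from_utf8 = step1_ucs_sequence(decomposed.encode("utf-8"), "utf-8")
from_cp1258 = step1_ucs_sequence(decomposed.encode("cp1258"), "cp1258")
```

Under these assumptions, the same two characters come out precomposed (U+00E9) for the paper and windows-1258 inputs, but stay decomposed for the UTF-8 input, so the result of the later percent-encoding step would differ depending only on the encoding of the source document.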