- From: Martin Duerst <duerst@w3.org>
- Date: Tue, 24 Aug 2004 13:47:00 +0900
- To: Chris Lilley <chris@w3.org>, public-iri@w3.org
- Cc: www-tag@w3.org, Ted Hardie <hardie@qualcomm.com>
There is also the following Note:

   Note: The difference between Variants B and C in Step 1 (Variant B
   using normalization with NFC while Variant C not using any
   normalization) is to account for the fact that in many non-Unicode
   character encodings, some text cannot be represented directly. For
   example, Vietnam is natively written "Việt Nam" (containing a LATIN
   SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW) in NFC, but a direct
   transcoding from the windows-1258 character encoding leads to
   "Việt Nam" (containing a LATIN SMALL LETTER E WITH CIRCUMFLEX
   followed by a COMBINING DOT BELOW), whereas direct transcoding of
   other 8-bit encodings of Vietnamese may lead to other
   representations.

Would moving this closer to the A/B/C variants, and maybe adding
some text, be a solution to your last call comment?

Regards,    Martin.

At 14:50 04/08/18 +0900, Martin Duerst wrote:
>Hello Chris,
>
>Many thanks for your comment. I have made it issue why-not-normalize-42
>(see http://www.w3.org/International/iri-edit#why-not-normalize-42).
>
>A few ideas on how to deal with it below.
>
>At 22:22 04/08/11 +0200, Chris Lilley wrote:
>
>>Hello ,
>>
>> > If the IRI is in an Unicode-based character encoding (for example
>> > UTF-8 or UTF-16): Do not normalize. Apply Step 2 directly to the
>> > encoded Unicode character sequence.
>>
>>I believe that I understand why this step says 'do not normalize'
>>(otherwise, certain Unicode strings could never be used in query parts,
>>for example).
>>
>>However, as the two preceding steps say 'normalize' and this step says
>>'do not normalize' the reader could be confused - or perhaps consider it
>>an 'obvious error'.
>>
>>Do not tease the reader like this. Please explain *why* at this stage no
>>normalization is performed.
>
>You definitely have a point. But as you have noticed, the explanations
>are already given elsewhere in the document. I think there are several
>things that can be done:
>
>- capitalize 'NOT', to make clear that this is not an 'obvious error'.
>- add a pointer to 5.3 Normalization
>  (http://www.w3.org/International/iri-edit/draft-duerst-iri.html#normalization)
>- do both of the above
>
>Which one do you prefer? Do you think this is enough, or do you have
>some other idea (actual wording preferred)?
>
>
>Regards,    Martin.
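
For readers who want to see the effect described in the Note, here is a minimal sketch in Python. It is an illustration only, not text from the draft. It assumes Python's standard cp1258 codec as a stand-in for "direct transcoding from windows-1258", and it uses urllib.parse.quote as a rough stand-in for Step 2; percent_encode is a hypothetical helper name, and unlike Step 2 proper, quote also escapes the space.

```python
import unicodedata
from urllib.parse import quote

# The name in NFC: U+1EC7 LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW.
nfc_form = "Vi\u1ec7t Nam"

# What the Note says a direct transcoding from windows-1258 yields:
# U+00EA followed by U+0323 COMBINING DOT BELOW. Round-tripping through
# the cp1258 codec shows the legacy encoding can carry this sequence.
transcoded = "Vi\u00ea\u0323t Nam".encode("cp1258").decode("cp1258")


def percent_encode(text):
    """Hypothetical stand-in for Step 2: UTF-8 encode, then percent-encode.

    Assumption: quote() with safe="" is close enough for illustration;
    it also escapes the space, which Step 2 itself does not address.
    """
    return quote(text, safe="")


# Variant B (legacy encoding): normalize with NFC after transcoding.
variant_b = unicodedata.normalize("NFC", transcoded)

print(transcoded == nfc_form)      # the raw transcoding is not in NFC
print(variant_b == nfc_form)       # NFC brings it to the precomposed form
print(percent_encode(nfc_form))    # escapes for the NFC form
print(percent_encode(transcoded))  # escapes if no normalization is applied
```

Under these assumptions the two strings look identical on screen but produce different percent-encoded results, which is why the variant for legacy encodings normalizes while the variant for Unicode-based encodings passes the characters through unchanged.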
Received on Tuesday, 24 August 2004 05:53:21 UTC