Re: Last Call comments on IRI - 3.1 Mapping of IRIs to URIs

There is also the following Note:

   Note: The difference between Variants B and C in Step 1 (Variant B
       using normalization with NFC while Variant C not using any
       normalization) is to account for the fact that in many non-Unicode
       character encodings, some text cannot be represented directly.
       For example, Vietnam is natively written "Việt Nam"
       (containing a LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW)
       in NFC, but a direct transcoding from the windows-1258 character
       encoding leads to "Việt Nam" (containing a LATIN SMALL
       LETTER E WITH CIRCUMFLEX followed by a COMBINING DOT BELOW),
       whereas direct transcoding of other 8-bit encodings of Vietnamese
       may lead to other representations.
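
To make the Note concrete, here is a minimal Python sketch of the two
outcomes. It is only an illustration, not text from the draft: the
variable names and the percent_encode helper are hypothetical, the sample
string is the one from the Note (it contains a space, so it is not a
complete IRI), and the helper stands in for Step 2 in a simplified way.

```python
import unicodedata
from urllib.parse import quote

# The two spellings discussed in the Note.
precomposed = "Vi\u1ec7t Nam"       # U+1EC7, the NFC form
decomposed = "Vi\u00ea\u0323t Nam"  # U+00EA followed by U+0323

# windows-1258 (Python codec "cp1258") has no code point for U+1EC7,
# so only the decomposed spelling can be stored in it directly.
try:
    precomposed.encode("cp1258")
    precomposed_fits = True
except UnicodeEncodeError:
    precomposed_fits = False

# Direct transcoding from windows-1258 gives back the decomposed sequence.
transcoded = decomposed.encode("cp1258").decode("cp1258")

# Variant B (non-Unicode source encoding): normalize with NFC first.
variant_b = unicodedata.normalize("NFC", transcoded)

# Variant C (input already in a Unicode encoding): leave it untouched.
variant_c = decomposed


def percent_encode(text):
    """Simplified stand-in for Step 2: UTF-8 bytes, then %HH escapes."""
    return quote(text.encode("utf-8"), safe="")


print(precomposed_fits)
print(percent_encode(variant_b))
print(percent_encode(variant_c))
```

The two printed escapes differ, which is why normalizing in Variant C
would silently change a resource identifier that was deliberately
written in decomposed form.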

Would moving this closer to the A/B/C variants, and maybe adding
some text, be a solution to your last call comment?

Regards,     Martin.


At 14:50 04/08/18 +0900, Martin Duerst wrote:

>Hello Chris,
>
>Many thanks for your comment. I have made it issue why-not-normalize-42
>(see http://www.w3.org/International/iri-edit#why-not-normalize-42).
>
>A few ideas on how to deal with it below.
>
>At 22:22 04/08/11 +0200, Chris Lilley wrote:
>
>>Hello,
>>
>> > If the IRI is in an Unicode-based character encoding (for example
>> > UTF-8 or UTF-16): Do not normalize. Apply Step 2 directly to the
>> > encoded Unicode character sequence.
>>
>>I believe that I understand why this step says 'do not normalize'
>>(otherwise, certain Unicode strings could never be used in query parts,
>>for example).
>>
>>However, as the two preceding steps say 'normalize' and this step says
>>'do not normalize' the reader could be confused - or perhaps consider it
>>an 'obvious error'.
>>
>>Do not tease the reader like this. Please explain *why* at this stage no
>>normalization is performed.
>
>You definitely have a point. But as you have noticed, the explanations
>are already given elsewhere in the document. I think there are several
>things that can be done:
>
>- capitalize 'NOT', to make clear that this is not an 'obvious error'.
>- add a pointer to 5.3 Normalization
>  (http://www.w3.org/International/iri-edit/draft-duerst-iri.html#normalization)
>- do both of the above
>
>Which one do you prefer? Do you think this is enough, or do you have
>some other idea (actual wording preferred)?
>
>
>Regards,    Martin.

Received on Tuesday, 24 August 2004 05:53:20 UTC