RE: Character Encoding Question

At 3:58 PM -0800 11/29/00, John Boyer wrote:
>Anyway, Jeff's point about UCS-2 != Unicode has now hit home thanks to some
>of the examples in the UTF-8 spec.  These examples clearly show triplets of
>UCS-2 values being used to form a single character, which does not appear to
>be permissible under UTF-16.

It appears that you are badly misreading the UTF-8 spec. None of the
examples show "triplets of UCS-2 values being used to form a single
character". What they show is a single UCS-2 value being encoded as a
sequence of one to three octets.
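
To make the distinction concrete, here is a minimal sketch in Python
(mine, not from the spec; the helper names and the sample characters
are picked purely for illustration). It prints, for one character at a
time, the UTF-8 octets next to the 16-bit code units:

```python
# Illustrative sketch only: helper names and sample characters are
# assumptions chosen for this example, not taken from any spec.

def utf8_octets(ch):
    """Return the UTF-8 octets of a single character as hex strings."""
    return [f"{b:02X}" for b in ch.encode("utf-8")]

def utf16_units(ch):
    """Return the UTF-16 (big-endian) 16-bit code units as hex strings."""
    raw = ch.encode("utf-16-be")
    return [raw[i:i + 2].hex().upper() for i in range(0, len(raw), 2)]

if __name__ == "__main__":
    for ch in ("A", "\u2262", "\ud55c"):
        print(f"U+{ord(ch):04X}: "
              f"UTF-8 octets {utf8_octets(ch)}, "
              f"16-bit units {utf16_units(ch)}")
```

Each of these characters is a single 16-bit value; only the number of
UTF-8 octets (8-bit units) varies, from one to three.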

>   Since the Unicode manual is quite clear on the
>equivalence between Unicode and UTF-16 (p. 19), this would mean that UCS-2
>!= Unicode.

Nothing on p. 19 says that. What p. 19 says is "The default encoding
form of the Unicode Standard is 16-bit". Naming a default encoding
form is far from asserting an equivalence.

>So it would seem that we need to include UCS-2 in the list of things that
>should not have NFC applied.

<sigh> I think I'm just going to let Martin sort this out; that's his job.

--Paul Hoffman, Director
--Internet Mail Consortium

Received on Wednesday, 29 November 2000 19:45:12 UTC