RE: Character Encoding Question

At 2:29 AM +0900 12/1/00, Martin J. Duerst wrote:
>There is no problem with UCS-2 and UCS-4. The UCS is a set
>(in the math sense) of characters, each with a number associated.
>There is only one UCS. Just saying 'UCS', there are no assumptions whatsoever
>about representation (UCS-2 and UCS-4 are both 'charset' labels), and
>no assumptions about subsetting (UCS-2 can be used, in the right context,
>to denote a certain subset of the UCS). So I don't see any problem.

I do. :-) The term "non-Unicode" is not specific enough to prevent
confusion, as this discussion has shown. Does it mean:
- all charsets except UTF-8, UTF-16, UTF-16BE, and UTF-16LE
- all charsets except UTF-8, UTF-16, UTF-16BE, UTF-16LE, UCS-2, UCS-4
- all charsets that are not defined by the Unicode Consortium in some
  version of the Unicode Standard
- something else
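
To make the difference concrete, here is a minimal Python sketch of the
first two readings only. The names READING_1, READING_2, is_non_unicode,
ambiguous_labels, and SAMPLE are made up for illustration, and the sample
labels are an arbitrary choice. The third reading is left out because it
depends on which version of the Unicode Standard one consults.

```python
# Exclusion lists copied from the first two readings listed above.
# Everything else here (names, sample labels) is hypothetical.
READING_1 = {"UTF-8", "UTF-16", "UTF-16BE", "UTF-16LE"}
READING_2 = READING_1 | {"UCS-2", "UCS-4"}


def is_non_unicode(label, excluded):
    """Return True if `label` counts as "non-Unicode" under one reading."""
    return label.upper() not in excluded


def ambiguous_labels(labels):
    """Return the labels that the two readings classify differently."""
    return [
        label
        for label in labels
        if is_non_unicode(label, READING_1) != is_non_unicode(label, READING_2)
    ]


# Arbitrary sample of charset labels, chosen only for illustration.
SAMPLE = ["UTF-8", "UCS-2", "UCS-4", "ISO-8859-1"]

if __name__ == "__main__":
    for label in ambiguous_labels(SAMPLE):
        print(label, "is non-Unicode under reading 1 but not under reading 2")
```

With this sample, only UCS-2 and UCS-4 change sides between the two
readings, which is exactly where the confusion in this thread comes from.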

--Paul Hoffman, Director
--Internet Mail Consortium

Received on Thursday, 30 November 2000 13:30:49 UTC