RE: Character Encoding Question

At 00/11/30 10:30 -0800, Paul Hoffman / IMC wrote:
>At 2:29 AM +0900 12/1/00, Martin J. Duerst wrote:
>>There is no problem with UCS-2 and UCS-4. The UCS is a set
>>(in the math sense) of characters, each with a number associated.
>>There is only one UCS. Just saying 'UCS', there are no assumptions whatsoever
>>about representation (UCS-2 and UCS-4 are both 'charset' labels), and
>>no assumptions about subsetting (UCS-2 can be used, in the right context,
>>to denote a certain subset of the UCS). So I don't see any problem.
>
>I do. :-) "Non-Unicode" is not specific enough to prevent confusion, as 
>this discussion has shown.

'Non-Unicode' is not part of the suggested wording.


>Does it mean:
>- all charsets except UTF-8, UTF-16, UTF-16BE, and UTF-16LE
>- all charsets except UTF-8, UTF-16, UTF-16BE, UTF-16LE, UCS-2, UCS-4
>- all charsets that are not defined by the Unicode Consortium in some 
>version of the Unicode Standard
>- something else

What we need is all charsets that are defined based on the UCS. That would
include any RACE/LACE/..., if they ever get defined as a charset,
and is completely independent of who defines them.
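
To make "defined based on the UCS" concrete, here is a rough sketch in
Python. The code point, the list of labels, and the helper name
encodings_of are my own choices for illustration, and I am assuming that
Python's "utf-32-be" codec can stand in for UCS-4. The point is only that
each charset produces different bytes for the same UCS number, and each
decodes back to that number.

```python
# Illustrative sketch only: the code point and the label list are
# assumptions chosen for this example, not part of the proposed wording.
CODE_POINT = 0x00E9  # LATIN SMALL LETTER E WITH ACUTE

# Python codec names; "utf-32-be" is assumed here as a stand-in for UCS-4.
UCS_BASED = ["utf-8", "utf-16-be", "utf-16-le", "utf-32-be"]


def encodings_of(code_point, labels):
    """Return {label: bytes} for one UCS code point under each charset."""
    ch = chr(code_point)
    return {label: ch.encode(label) for label in labels}


encoded = encodings_of(CODE_POINT, UCS_BASED)
for label, data in encoded.items():
    # Different byte sequences, but every one maps back to the same
    # UCS number, which is what "based on the UCS" means in this sketch.
    assert ord(data.decode(label)) == CODE_POINT
    print(label, data.hex())
```

Whoever defines the label plays no role in this check; only the mapping
back to UCS numbers does.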

Regards,  Martin.

Received on Thursday, 30 November 2000 14:54:32 UTC