- From: Dan Oscarsson <Dan.Oscarsson@kiconsulting.se>
- Date: Mon, 15 Apr 2002 08:30:33 +0200 (CEST)
- To: kenw@sybase.com
- Cc: ietf-charsets@iana.org, tony@att.com
>And as you can see by my just cited quotation from 10646 itself, such
>argumentation was always a kind of shell game by detractors of UTF-16
>and Unicode. The people making such arguments were not plugged in to
>the process in ISO and were apparently unaware that WG2 itself was
>keenly aware of the interoperability problems and eager to ensure that
>all UTF's for 10646 were *equally* applicable to all characters encoded
>in the standard.
>
>And the repeated concerns about the "eventual allocation" of characters
>in the 32-bit codespace that UTF-16 could not handle have reached
>the status of urban legends -- endlessly repeated among those in the
>Linux community who use repetition to define accuracy, without bothering
>to check with the source.

I am sure UTF-16 could be extended with another surrogate space to
handle all of the original UCS (all 31 bits).

In general I think it is wrong to restrict the available 31 bits of UCS
to the UTF-16 space just because Unicode made the wrong choice from the
beginning by using only 16 bits.

UTF-8 can encode much more than the UTF-16 code space, though UTF-16
programs will not be able to handle all of it. That is no different
from me using an 8-bit code space and having to encode or discard all
characters outside code values 0-255.

    Dan
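
For readers who want the arithmetic behind the code-space comparison, here is a minimal sketch. It assumes the standard UTF-16 surrogate ranges (D800-DBFF and DC00-DFFF) and the original UTF-8 form of up to six bytes, as defined in RFC 2279. The names `UTF16_MAX`, `UCS_31BIT_MAX`, `utf8_len_original` and `to_surrogates` are made up for this illustration and do not come from the message or from any library.

```python
# Illustrative sketch only; all names here are hypothetical.

# UTF-16 reach: the BMP plus 1024 x 1024 surrogate-pair combinations.
HIGH_SURROGATES = 0xDBFF - 0xD800 + 1
LOW_SURROGATES = 0xDFFF - 0xDC00 + 1
UTF16_MAX = 0xFFFF + HIGH_SURROGATES * LOW_SURROGATES

# The original UCS code space is 31 bits wide.
UCS_31BIT_MAX = 0x7FFFFFFF

# Upper limit of each byte length in the original (up to 6-byte) UTF-8 form.
_UTF8_LIMITS = (0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF)


def utf8_len_original(cp: int) -> int:
    """Return how many bytes the original UTF-8 form needs for cp."""
    if cp < 0 or cp > UCS_31BIT_MAX:
        raise ValueError("outside the 31-bit UCS code space")
    for length, limit in enumerate(_UTF8_LIMITS, start=1):
        if cp <= limit:
            return length
    raise AssertionError("unreachable")


def to_surrogates(cp: int) -> tuple[int, int]:
    """Split a supplementary code point into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= UTF16_MAX:
        raise ValueError("not representable as a UTF-16 surrogate pair")
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


if __name__ == "__main__":
    print(hex(UTF16_MAX))
    print(utf8_len_original(UTF16_MAX), utf8_len_original(UCS_31BIT_MAX))
    print([hex(unit) for unit in to_surrogates(0x1F600)])
```

Under these assumptions the surrogate mechanism tops out at 0x10FFFF, while the six-byte UTF-8 form covers the full 31-bit range. A code point such as 0x110000 therefore has an original-form UTF-8 encoding but no UTF-16 surrogate pair.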
Received on Monday, 29 April 2002 05:00:16 UTC