- From: Markus Scherer <markus.scherer@jtcsv.com>
- Date: Thu, 13 Nov 2003 10:23:22 -0800
- To: charsets <ietf-charsets@iana.org>
Francois Yergeau wrote:
> Respecting UTF-16 vs ISO-10646-UCS-2 however, there is a real difference,
> the latter being restricted to U+FFFF.

Yes, there is a real difference. However, more often than not, "UCS-2" just means "byte serialization of the internal 16-bit Unicode/ISO 10646 form", and as the generating software upgrades to handle surrogate pairs, the text really is UTF-16.

Also, most receiving software will byte-unserialize a "UCS-2" byte stream into 16-bit Unicode, and if it handles surrogate pairs, then interpret it as UTF-16 anyway.

In other words, in practice, the difference between UCS-2 and UTF-16 is in processing the text, not in encoding/converting it.

markus

-- 
Opinions expressed here may not reflect my company's positions unless otherwise noted.
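As an illustration of the point above, here is a minimal Python sketch, not taken from any particular implementation. It assumes big-endian serialization with no byte order mark, and the helper names `bytes_to_units` and `units_to_code_points` are hypothetical. The byte-unserialization step is the same whether the stream is labeled "UCS-2" or "UTF-16"; only the later processing step knows about surrogate pairs.

```python
import struct

def bytes_to_units(data, big_endian=True):
    """Byte-unserialize a stream into 16-bit code units.

    This step is identical for "UCS-2" and "UTF-16" labeled input.
    """
    if len(data) % 2:
        raise ValueError("odd number of bytes in a 16-bit stream")
    fmt = (">" if big_endian else "<") + "H" * (len(data) // 2)
    return list(struct.unpack(fmt, data))

def units_to_code_points(units):
    """UTF-16 processing: combine surrogate pairs into supplementary code points.

    Unpaired surrogates are passed through unchanged in this sketch.
    """
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        is_lead = 0xD800 <= u <= 0xDBFF
        if is_lead and i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
            out.append(0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            out.append(u)
            i += 1
    return out

# "A" followed by U+10400, which needs a surrogate pair in 16-bit form.
data = "A\U00010400".encode("utf-16-be")
units = bytes_to_units(data)              # what a UCS-2-only receiver sees
code_points = units_to_code_points(units) # what a UTF-16-aware receiver sees
print([hex(u) for u in units])
print([hex(c) for c in code_points])
```

A receiver that stops after `bytes_to_units` treats the text as three 16-bit units; one that also runs `units_to_code_points` sees two characters. The bytes on the wire are the same in both cases.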
Received on Thursday, 13 November 2003 13:24:57 UTC