comment: upgrading to Unicode

From: Dan Chiba <dan.chiba@oracle.com>
Date: Tue, 28 Oct 2008 11:06:09 -0700
Message-ID: <49075491.3040301@oracle.com>
To: public-i18n-core@w3.org

FAQ: Upgrading from language-specific legacy encoding to Unicode encoding

http://www.w3.org/International/questions/qa-utf8-upgrade.en.php

"In addition, many legacy encodings for complex scripts are already 
double-byte, eg, Chinese."

"Characters that do not fall into the ASCII range, such as Chinese, 
Arabic, Russian, may use 2 or even 3 bytes. Chinese encodings already 
use more than 1 byte per character with legacy encodings, where they use 
double bytes."

These would be more accurate if revised to:

"In addition, many legacy encodings for complex scripts are already 
multibyte, eg, Chinese."

"Characters that do not fall into the ASCII range, such as Chinese, 
Arabic, Russian, may use 2, 3 or even 4 bytes. Chinese encodings already 
use more than 1 byte per character with legacy encodings, where they 
normally use 2 or 3 bytes."
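
The byte counts behind this wording can be checked with a minimal sketch. The sample characters and the choice of GB18030 as the legacy Chinese encoding are assumptions made for illustration; neither comes from the FAQ. The helper name byte_lengths is likewise hypothetical.

```python
# Illustrative only: sample characters and the GB18030 codec are assumptions,
# chosen to show that both UTF-8 and a legacy Chinese encoding are multibyte.
samples = {
    "ASCII A": "A",
    "Russian YA": "\u042f",
    "Arabic AIN": "\u0639",
    "Chinese U+4E2D": "\u4e2d",
    "CJK Extension B U+20000": "\U00020000",
}


def byte_lengths(ch):
    """Return (utf8_len, gb18030_len) for a single character."""
    return len(ch.encode("utf-8")), len(ch.encode("gb18030"))


for label, ch in samples.items():
    utf8_len, gb_len = byte_lengths(ch)
    print(f"{label}: UTF-8 {utf8_len} byte(s), GB18030 {gb_len} byte(s)")
```

In this sketch the supplementary-plane character takes 4 bytes in UTF-8, which is the case the "2, 3 or even 4 bytes" wording covers, and the same character is not double-byte in GB18030 either, which is why "multibyte" is the safer term.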

Regards,
-Dan
Received on Tuesday, 28 October 2008 18:07:12 GMT
