Re: For Chinese, UTF-8 or UTF-16 encoding?

At 02:14 PM 3/12/02 +0900, Musale, Shailendra wrote:
>For Chinese localized files, should we use
>UTF-8 encoding or UTF-16 encoding?

There are two criteria:

o size
o interchange

Typical Chinese strings in UTF-8 would be 50% longer than in UTF-16
(three bytes per character instead of two). This assumes that the
*entire* text is in Chinese characters. If the strings contain XML
or HTML markup, for example, the percentage goes down, because ASCII
characters take one byte in UTF-8 but two in UTF-16.
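
To make the size comparison concrete, here is a minimal sketch in
Python. The helper name encoded_sizes and both sample strings are
made up for illustration; UTF-16 is measured without a byte order
mark so that only the character data is counted.

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, with no BOM counted."""
    utf8_len = len(text.encode("utf-8"))
    # "utf-16-le" emits no byte order mark, unlike plain "utf-16".
    utf16_len = len(text.encode("utf-16-le"))
    return utf8_len, utf16_len


# Hypothetical sample: 7 Chinese characters, nothing else.
pure_chinese = "中文本地化文件"
print(encoded_sizes(pure_chinese))   # (21, 14): UTF-8 is 50% longer

# Hypothetical sample: 2 Chinese characters wrapped in 19 ASCII markup characters.
with_markup = '<p class="msg">中文</p>'
print(encoded_sizes(with_markup))    # (25, 42): here UTF-8 is the smaller one
```

With enough markup around the text, as in the second sample, the
advantage can reverse and UTF-8 comes out smaller.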

If the recipient of the strings can handle UTF-16 as easily as
UTF-8, then size could be the sole criterion. This would be true
for storing message catalogs where the retrieving software could
perform conversions as necessary to serve each client what they
can handle.
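
As a rough sketch of that arrangement, assuming a catalog held in
memory as UTF-16 bytes: the names CATALOG and serve, and the sample
message, are hypothetical and not taken from any particular
product.

```python
# Hypothetical message catalog, stored as UTF-16 (little-endian, no BOM)
# because that is the more compact form for all-Chinese text.
STORAGE_ENCODING = "utf-16-le"
CATALOG = {
    "greeting": "你好".encode(STORAGE_ENCODING),
}


def serve(key, client_encoding):
    """Return the message for key, transcoded to what the client can handle."""
    text = CATALOG[key].decode(STORAGE_ENCODING)
    return text.encode(client_encoding)


# A client that only handles UTF-8 gets UTF-8; one that handles
# UTF-16 gets the stored bytes back unchanged.
utf8_reply = serve("greeting", "utf-8")
utf16_reply = serve("greeting", "utf-16-le")
```

The storage encoding is then a private decision of the catalog,
and can be chosen on size alone.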

A third criterion, processability, would need to be evaluated in
some cases, but does not seem to apply to the situation mentioned.

A./

Received on Tuesday, 12 March 2002 01:50:35 UTC