RE: Why is UTF8 not being taken up in Asia Pacific for Public Websites?

Ben,
It does make sense to use UTF-8 for your purpose, where the
users are in a controlled environment. I have one reservation,
though, and I would like to get feedback from other members on it.

If UTF-8 is used at the web browser level, the mapping between
the legacy encoding and UTF-8 depends on the browser and/or
the OS platform (if the browser uses the conversion facility
provided by the OS platform).  It is well known that certain
characters in Japanese computing map differently to Unicode
(and thus to UTF-8) depending on the OS/language platform.
http://www.ingrid.org/java/i18n/unicode-utf8.html
For example, 0x5c in Shift JIS is supposed to mean the
Japanese currency YEN SIGN but acts like a backslash
(0x5c in ASCII).  Windows treats it as though it were
a regular backslash and maps it to Unicode U+005C,
whereas MacOS maps it to U+00A5 (YEN SIGN).
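
To make the divergence concrete, here is a minimal Python sketch.
The two-entry table and the function name browser_to_utf8 are
hypothetical; they only model the behaviour described above for the
single byte 0x5c and are not taken from any real browser or OS
conversion library.

```python
# Hypothetical, simplified per-platform mapping for the single
# Shift JIS byte 0x5C, following the behaviour described in this message.
SJIS_0X5C_TO_UNICODE = {
    "windows": "\u005c",  # REVERSE SOLIDUS (backslash)
    "macos": "\u00a5",    # YEN SIGN
}


def browser_to_utf8(platform: str, sjis_byte: int = 0x5C) -> bytes:
    """Simulate a browser converting one Shift JIS byte to UTF-8."""
    if sjis_byte != 0x5C:
        raise ValueError("this sketch only models byte 0x5C")
    return SJIS_0X5C_TO_UNICODE[platform].encode("utf-8")


from_windows = browser_to_utf8("windows")
from_macos = browser_to_utf8("macos")

# The same keystroke reaches the server as two different byte sequences.
print(from_windows.hex(), from_macos.hex())  # 5c c2a5
```

The point is only that the server receives 5c from one platform and
c2 a5 from the other for what the user considers the same character.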
                          
So a character that the user perceives as the same is
handled and stored differently by the application if we
take the approach of letting the browser convert to UTF-8.
Suppose the (half-width) YEN SIGN is entered from MacOS and
stored in the database.  Later, if somebody views the data from
Windows, that data could be displayed as a square (meaning
the system cannot display this character).
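
A related consequence on the storage side can be sketched in Python
as follows.  The in-memory dictionary standing in for the database
and the sample value "100" are assumptions made purely for
illustration.

```python
# Hypothetical in-memory "database" keyed by the UTF-8 bytes the server received.
stored = {}

# A MacOS user enters a half-width yen sign followed by "100";
# in this scenario the browser sends U+00A5.
stored["\u00a5100".encode("utf-8")] = "price entered from MacOS"

# A Windows user types what looks like the same text,
# but in this scenario the browser sends U+005C.
query = "\u005c100".encode("utf-8")

# The lookup fails because the stored key and the query differ byte for byte.
found = query in stored
print(found)  # False
```

An application that wanted the two inputs to match would have to
pick one code point and normalize to it itself; whether that is
appropriate depends on the data.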

Has anyone experienced problems like this in reality?  Do
popular browsers do code conversion by themselves, or
do they use OS facilities?

T. "Kuro" Kurosaka
Internationalization Architect
teruhiko.kurosaka@iona.com
-------------------------------------------------------
IONA Technologies
2350 Mission College Blvd. Suite 650
Santa Clara, CA 95054
Tel: (408) 350 9684/9500 
Fax: (408) 350 9501
-------------------------------------------------------
Making Software Work Together TM

Received on Saturday, 17 May 2003 12:48:26 UTC