RE: Why is UTF8 not being taken up in Asia Pacific for Public Websites?

From: Martin Duerst <duerst@w3.org>
Date: Mon, 30 Jun 2003 16:00:31 -0400
Message-Id: <4.2.0.58.J.20030630155413.04b03690@localhost>
To: "Kurosaka, Teruhiko" <Teruhiko.Kurosaka@iona.com>, "LUNDER,BEN (HP-Australia,ex3)" <ben.lunder@hp.com>, <www-international@w3.org>
Cc: "PETERSON,MARK (HP-Boise,ex1)" <mark.peterson@hp.com>

Some (late, sorry) additions:

At 09:48 03/05/17 -0700, Kurosaka, Teruhiko wrote:

>Ben,
>It does make sense to use UTF-8 for your purpose where the
>users are in a controlled environment, with one reservation and
>I would like to get feedback from other members on this.
>
>If UTF-8 is used at the web browser level, the mapping between
>the legacy encoding and UTF-8 depends on the browser and/or
>the OS platform (if browser uses the conversion facility provided
>by the OS platform).  It is well known that certain characters in
>Japanese legacy encodings map differently to Unicode (thus UTF-8)
>depending on the OS/language platform.
>http://www.ingrid.org/java/i18n/unicode-utf8.html
>For example, 0x5c in Shift JIS, which is supposed to mean
>the Japanese currency YEN SIGN but acts like a backslash
>(0x5c in ASCII),  is treated as though it were
>a regular backslash, and mapped to Unicode U+005C on
>Windows

but then displayed as a Yen sign on a Japanese system :-(.


>but it is mapped to U+00A5 (YEN SIGN) on
>MacOS.
>
>So characters that the user perceives as the same are
>handled and stored differently by the application if we
>take the approach of letting the browser convert to UTF-8.
>Suppose the (half-width) YEN SIGN is entered on MacOS and
>stored in the database.  Later somebody views the data from
>Windows; that data could be displayed as a square (meaning
>the system cannot display this character).
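
To make the scenario above concrete, here is a minimal sketch in Python.
The two mapping tables and the helper names (to_utf8, fold_yen) are
hypothetical and only model the behaviour described in the quoted text;
they are not taken from any real browser or OS converter. The folding
step is just one possible application-level policy, and it loses the
distinction between the two characters.

```python
# Hypothetical per-platform tables for the single byte 0x5C in Shift JIS,
# modelling the behaviour described in the mail (not a real converter).
WINDOWS_STYLE = {0x5C: "\u005C"}  # treated as REVERSE SOLIDUS (backslash)
MACOS_STYLE = {0x5C: "\u00A5"}    # treated as YEN SIGN


def to_utf8(legacy: bytes, table: dict) -> bytes:
    """Convert legacy bytes to UTF-8, applying `table` overrides.

    Only meaningful for bytes below 0x80; real Shift JIS double-byte
    sequences are deliberately out of scope for this sketch.
    """
    chars = [table.get(b, chr(b)) for b in legacy]
    return "".join(chars).encode("utf-8")


def fold_yen(text: str) -> str:
    """One possible policy: store YEN SIGN as U+005C so both inputs match."""
    return text.replace("\u00A5", "\u005C")


# The same key presses on either platform: byte 0x5C followed by "100".
legacy_input = b"\x5c100"

from_windows = to_utf8(legacy_input, WINDOWS_STYLE)
from_macos = to_utf8(legacy_input, MACOS_STYLE)

# The stored UTF-8 differs even though the user typed the same thing.
print(from_windows)  # b'\\100'
print(from_macos)    # b'\xc2\xa5100'

# After folding, a comparison or database lookup treats them as equal.
folded = fold_yen(from_macos.decode("utf-8")).encode("utf-8")
print(folded == from_windows)  # True
```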

Browsers on Windows systems should be able to display this
character correctly. After all, this character is part of
Latin-1, which is what the Web started with.
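
The Latin-1 point is easy to check. A small sketch using only the
Python standard library (the variable names are just for illustration):
U+00A5 encodes to a single byte in Latin-1 and to two bytes in UTF-8.

```python
import unicodedata

yen = "\u00A5"

name = unicodedata.name(yen)        # official Unicode character name
latin1_bytes = yen.encode("latin-1")  # succeeds: U+00A5 is within Latin-1
utf8_bytes = yen.encode("utf-8")      # two-byte UTF-8 sequence

print(name)          # YEN SIGN
print(latin1_bytes)  # b'\xa5'
print(utf8_bytes)    # b'\xc2\xa5'
```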


>Has anyone experienced problems like this in reality?  Do
>popular browsers do code conversion by themselves, or
>do they use OS facilities?

I think it depends on the browser. Different browsers
have different strategies for how much of the underlying
platform they use.

From another mail:

> (2) Some HTML browsers do not support UTF-8.  (All popular
> desktop browsers have supported UTF-8 for a few years,
> but web-phone browsers support only a legacy encoding. See
> the i-mode spec.)

The newest mobile phones in Japan have started to support
UTF-8.


Regards,     Martin.
Received on Monday, 30 June 2003 16:08:16 GMT
