- From: Martin Duerst <duerst@w3.org>
- Date: Mon, 30 Jun 2003 16:00:31 -0400
- To: "Kurosaka, Teruhiko" <Teruhiko.Kurosaka@iona.com>, "LUNDER,BEN (HP-Australia,ex3)" <ben.lunder@hp.com>, <www-international@w3.org>
- Cc: "PETERSON,MARK (HP-Boise,ex1)" <mark.peterson@hp.com>
Some (late, sorry) additions:

At 09:48 03/05/17 -0700, Kurosaka, Teruhiko wrote:
>Ben,
>It does make sense to use UTF-8 for your purpose where the
>users are in a controlled environment, with one reservation and
>I would like to get feedback from other members on this.
>
>If UTF-8 is used at the web browser level, the mapping between
>the legacy encoding and UTF-8 depends on the browser and/or
>the OS platform (if browser uses the conversion facility provided
>by the OS platform). It is well known that certain characters in
>Japanese computation map differently to Unicode (thus UTF-8)
>depending on the OS/language platforms.
>http://www.ingrid.org/java/i18n/unicode-utf8.html
>For example, 0x5c in Shift JIS, which is supposed to mean
>the Japanese currency YEN SIGN but acts like a backslash
>(0x5c in ASCII), is treated as though it were
>a regular backslash, and mapped to Unicode U+005C on
>Windows but then displayed as a Yen sign on a Japanese system :-(.
>but it is mapped to U+00A5 (YEN SIGN) on
>MacOS.
>
>So the character that the user perceives the same are
>handled and stored differently by the application, if we
>take the approach to let the browser convert to UTF-8.
>Suppose the (half-size) YEN SIGN is entered from the MacOS,
>stored in the database. Later somebody views the data from
>Windows, that data could be displayed as a square (meaning
>the system cannot display this character).

Browsers on Windows systems should be able to display this
character correctly. After all, this character is part of
Latin-1, which is what the Web started with.

>Has anyone experienced problems like this in reality? Do
>popular browsers do code conversion by themselves, or
>do they use OS facilities?

I think it depends on the browser. Different browsers have
different strategies of how much of the underlying platform
they use.

From another mail:

> (2) Some HTML browsers do not support UTF-8. (All popular
> browsers for desktop support UTF-8 since a few years ago
> but web-phone browser support only a legacy encoding. See
> i-mode spec.)

The newest mobile phones in Japan have started to support UTF-8.

Regards, Martin.
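
As a rough illustration of the 0x5C discrepancy discussed in the quoted message, the following Python sketch shows how the same legacy byte can end up as two different UTF-8 byte sequences. The `PLATFORM_MAP` table and the `to_utf8` helper are hypothetical and only mirror the mappings described above (U+005C on Windows, U+00A5 on MacOS); they are not a real codec and do not reflect what any particular browser actually does.

```python
# Hypothetical per-platform conversion tables for the single Shift JIS byte 0x5C,
# following the mappings described in the quoted message. Not a real codec.
PLATFORM_MAP = {
    "windows": {0x5C: "\u005C"},  # REVERSE SOLIDUS (backslash)
    "macos": {0x5C: "\u00A5"},    # YEN SIGN
}


def to_utf8(platform: str, sjis_byte: int) -> bytes:
    """Convert one legacy byte to UTF-8 using the given platform's table."""
    return PLATFORM_MAP[platform][sjis_byte].encode("utf-8")


win = to_utf8("windows", 0x5C)
mac = to_utf8("macos", 0x5C)

print(win.hex(), mac.hex())  # 5c c2a5
print(win == mac)            # False: a byte-wise comparison in the database fails

# U+00A5 is within the Latin-1 range, where it is the single byte 0xA5.
print("\u00A5".encode("latin-1").hex())  # a5
```

The point of the sketch is only that data entered as "the same" character on two platforms would be stored as different bytes, so searching or comparing stored values may not match unless the application normalizes them.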
Received on Monday, 30 June 2003 16:08:16 UTC