- From: Suzanne M. Topping <stopping@bizwonk.com>
- Date: Fri, 8 Nov 2002 13:16:10 -0500
- To: <www-international@w3.org>
> -----Original Message-----
> From: souravm [mailto:souravm@infosys.com]
>
> I'm facing some problem in displaying a Japanese character,
> WAVE DASH (〜

The wave character was discussed earlier this year on the Unicode and
NELOCSIG lists. Here is the first of two notes from the Unicode mail
archive which you may find useful:

From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Feb 20 2002 - 14:23:16 EST

Yep, that's right. This is one of the notorious small list of
inconsistencies between various mappings of JIS X 0208:

Microsoft Code Page 932 mapping:

0x8160 0xFF5E #FULLWIDTH TILDE

Alternative JIS X 0208 Shift-JIS mapping (e.g. for the Mac):

0x8160 0x2141 0x301C # WAVE DASH

Actually, the Unicode Consortium does not take (as yet) a formal
position on which of these conversions is correct. Mapping tables
are simply supplied by various vendors, and there may be
inconsistencies in their interpretations of mappings.

My *personal* opinion is that Microsoft has it right, as SJIS 0x8160
is treated as a fullwidth tilde in Japan, and is generally shown
that way in widely available commercial fonts.

When databases are doing roundtrip conversions through Unicode,
they need to be aware of these exceptional cases in the conversions,
precisely to avoid the kind of data corruption you are encountering.
There is no simple, universal "fix" for this, since platforms do
the conversions that they do, and other applications need to take
into account the edge cases.

The UTC has suggested an approach of documenting all the known issues,
particularly for Shift-JIS mappings, the most problematical of the lot,
but as yet no particular progress has been made on this suggestion.

--Ken

> The note below came through the NELOCSIG list, but I'm assuming someone
> on this list may be able to give Laura some suggestions.
>
> -----Original Message-----
> From: Nelson, Laura [mailto:lnelson@kenan.com]
> Sent: Wednesday, February 20, 2002 1:04 PM
> To: 'nelocsig@yahoogroups.com'
> Subject: [nelocsig] Japanese wave character issue
>
> We have a situation where an important character, the Japanese "wave
> character", is lost during transfers from various parts of our software.
> The root cause is that Windows uses a different encoding than does the
> rest of the world.
>
> Data is entered into our database by one program which uses the more
> standard conversion to UTF8, and then read by another program using the
> Windows version. It displays as garbage, because the wave character gets
> lost in the conversion.
>
> There are other potential conversion issues with the same character,
> because it is non-standard.
> Does anyone have any suggestions?
> The encodings in question are:
> U+FF5E used by Windows
> U+301C used by JIS X 0221, Unicode Consortium, Java (SJIS, EUCJIS, and
> JIS), and Mac.
> The SHIFT-JIS character is 0x8160
Received on Friday, 8 November 2002 13:16:12 UTC