- From: Suzanne M. Topping <stopping@bizwonk.com>
- Date: Fri, 8 Nov 2002 13:17:29 -0500
- To: <www-international@w3.org>
Here is the second note from the Unicode mail archive:

From: David Hopwood (david.hopwood@zetnet.co.uk)
Date: Tue Feb 19 2002 - 13:02:20 EST
In reply to: Kenneth Whistler: "RE: [nelocsig] Japanese wave character issue"

-----BEGIN PGP SIGNED MESSAGE-----

Kenneth Whistler wrote:
> Laura Nelson <lnelson@kenan.com> wrote:
> > We have a situation where an important character, the Japanese "wave
> > character", is lost during transfers from various parts of our software.
> > The root cause is that Windows uses a different encoding than does the
> > rest of the world.
>
> Yep, that's right. This is one of the notorious small list of
> inconsistencies between various mappings of JIS X 0208:
>
> Microsoft Code Page 932 mapping:
>
> 0x8160 0xFF5E # FULLWIDTH TILDE
>
> Alternative JIS X 0208 Shift-JIS mapping (e.g. for the Mac):
>
> 0x8160 0x2141 0x301C # WAVE DASH
>
> Actually, the Unicode Consortium does not take (as yet) a formal
> position on which of these conversions is correct.

OTOH, the cross-reference note for U+301C WAVE DASH is:

  This character was encoded to match JIS C 6226-1978 1-33 "wave-dash".
  Subsequent revisions of the JIS standard and industry practice have
  settled on JIS 1-33 as being the fullwidth tilde character.
  --> 3030 wavy dash
  --> FF5E fullwidth tilde

The Microsoft CP932 mapping is as correct as any of the other mappings for
Shift_JIS, and better thought out than some of them. In this particular case,
it's not "Windows against the world" - the ambiguities were in the original
JIS standards.

> > Data is entered into our database by one program which uses the more
> > standard conversion to UTF8, and then read by another program using the
> > Windows version. It displays as garbage, because the wave character gets
> > lost in the conversion.

That is probably an omission in the Windows *fallback* mappings, not the
mapping to Unicode, then: U+301C should have a fallback mapping to
Shift_JIS 0x8160.

In any case, if it's a Unicode database then the "other program" should be
displaying the field as Unicode, not trying to map it back to Shift_JIS,
which will certainly fail in general.

- -- 
David Hopwood <david.hopwood@zetnet.co.uk>
Received on Friday, 8 November 2002 13:17:30 UTC
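
For anyone who wants to reproduce the mismatch described in the thread, here is a minimal sketch in Python. It assumes that Python's built-in `shift_jis` codec follows the JIS-style mapping (0x8160 to U+301C) and that its `cp932` codec follows the Microsoft mapping (0x8160 to U+FF5E). The names `decode_both`, `CP932_FALLBACKS` and `encode_cp932_with_fallback` are hypothetical, invented for this illustration; they are not part of the thread or of any library. The fallback table shows one possible way to apply the fallback mapping suggested above, not how Windows actually implements it.

```python
# Sketch only: illustrates the 0x8160 mapping discrepancy and one possible
# application-level fallback. Helper names below are hypothetical.

SJIS_WAVE = b"\x81\x60"        # the Shift_JIS byte pair under discussion

WAVE_DASH = "\u301c"           # U+301C WAVE DASH
FULLWIDTH_TILDE = "\uff5e"     # U+FF5E FULLWIDTH TILDE


def decode_both(raw: bytes) -> tuple[str, str]:
    """Decode the same bytes with the JIS-style and the Microsoft mapping."""
    return raw.decode("shift_jis"), raw.decode("cp932")


# Assumed fallback table: characters that have no strict CP932 mapping,
# paired with the character that shares their Shift_JIS byte sequence.
CP932_FALLBACKS = {ord(WAVE_DASH): FULLWIDTH_TILDE}


def encode_cp932_with_fallback(text: str) -> bytes:
    """Encode to CP932 after substituting the fallback characters."""
    return text.translate(CP932_FALLBACKS).encode("cp932")


if __name__ == "__main__":
    jis_style, ms_style = decode_both(SJIS_WAVE)
    print(f"shift_jis decodes 0x8160 as U+{ord(jis_style):04X}")
    print(f"cp932     decodes 0x8160 as U+{ord(ms_style):04X}")

    # Without a fallback, the strict CP932 encoder rejects U+301C.
    try:
        WAVE_DASH.encode("cp932")
    except UnicodeEncodeError as exc:
        print("strict cp932 encode failed:", exc.reason)

    # With the fallback, U+301C round-trips to the expected bytes.
    print(encode_cp932_with_fallback(WAVE_DASH))
```

A translation table applied before encoding keeps the fallback explicit and easy to audit. As the message notes, though, a program reading from a Unicode database is generally better off displaying the data as Unicode than mapping it back to Shift_JIS at all.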