RE: Problem in showing Japanese Wave dash

From: Suzanne M. Topping <stopping@bizwonk.com>
Date: Fri, 8 Nov 2002 13:17:29 -0500
Message-ID: <427F53DA8F48E9498ADF0F868763F88C0F6A9B@wonkserver1.bizwonk.com>
To: <www-international@w3.org>

Here is the second note from the Unicode mail archive:

From: David Hopwood (david.hopwood@zetnet.co.uk)
Date: Tue Feb 19 2002 - 13:02:20 EST 

In reply to: Kenneth Whistler: "RE: [nelocsig] Japanese wave character issue" 

--------------------------------------------------------------------------------

Kenneth Whistler wrote: 
> Laura Nelson <lnelson@kenan.com> wrote: 
> > We have a situation where an important character, the Japanese "wave 
> > character", is lost during transfers from various parts of our software. 
> > The root cause is that Windows uses a different encoding than does the 
> > rest of the world. 
> 
> Yep, that's right. This is one of the notorious small list of 
> inconsistencies between various mappings of JIS X 0208: 
> 
> Microsoft Code Page 932 mapping: 
> 
> 0x8160 0xFF5E # FULLWIDTH TILDE 
> 
> Alternative JIS X 0208 Shift-JIS mapping (e.g. for the Mac): 
> 
> 0x8160 0x2141 0x301C # WAVE DASH 
> 
> Actually, the Unicode Consortium does not take (as yet) a formal 
> position on which of these conversions is correct. 

OTOH, the cross-reference note for U+301C WAVE DASH is: 

  This character was encoded to match JIS C 6226-1978 1-33 
  "wave-dash". Subsequent revisions of the JIS standard and 
  industry practice have settled on JIS 1-33 as being the fullwidth 
  tilde character. 
  --> 3030 wavy dash 
  --> FF5E fullwidth tilde 

The Microsoft CP932 mapping is as correct as any of the other mappings 
for Shift_JIS, and better thought out than some of them. In this 
particular case, it's not "Windows against the world" - the ambiguities 
were in the original JIS standards. 
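
To make the difference concrete, here is a minimal sketch, assuming Python 3 
and its bundled `shift_jis` and `cp932` codecs (which, as far as I know, follow 
the JIS-style and Microsoft tables respectively). The same two bytes should 
decode to U+301C under the first codec and U+FF5E under the second: 

```python
# Decode the Shift_JIS byte pair 0x81 0x60 (JIS 1-33) with two codecs.
raw = bytes([0x81, 0x60])

as_jis = raw.decode("shift_jis")  # JIS X 0208-style table
as_ms = raw.decode("cp932")       # Microsoft code page 932 table

print(f"shift_jis -> U+{ord(as_jis):04X}")
print(f"cp932     -> U+{ord(as_ms):04X}")
```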

> > Data is entered into our database by one program which uses the more 
> > standard conversion to UTF8, and then read by another program using the 
> > Windows version. It displays as garbage, because the wave character gets 
> > lost in the conversion. 

That is probably an omission in the Windows *fallback* mappings, not the 
mapping to Unicode, then: U+301C should have a fallback mapping to Shift_JIS 
0x8160. In any case, if it's a Unicode database then the "other program" 
should be displaying the field as Unicode, not trying to map it back to 
Shift_JIS, which will certainly fail in general. 


-- 
David Hopwood <david.hopwood@zetnet.co.uk> 
Received on Friday, 8 November 2002 13:17:30 GMT