- From: by way of Martin Duerst <Xueming.Shen@Sun.COM>
- Date: Tue, 11 Nov 2003 17:41:10 -0500
- To: www-international@w3.org
Steve,

They are "NEC Row 13" characters, which are NOT part of JIS X 0208 but
are supported by various vendors for "compatibility" reasons. See
man eucJP on Solaris for details. Windows also has them mapped to
rows 89-92 of its SJIS.

regards,
sherma

Steve Billings wrote:

>Ienup:
>
>[I think i18n-prog may be more appropriate for this discussion than
>www-international; can we move this discussion to i18n-prog?]
>
>>Many Roman numerals and circled numbers are a part of JIS X 0208
>
>I don't see them in the Unicode 4.0 JIS mapping tables (that's the latest
>version I happen to have at my fingertips). Do you know their Unicode or JIS
>codepoints?
>
>When I enter circle-1 from my Windows 2000 Japanese IME (choosing the
>circled-1 character from the list of choices presented for "ichi") into a
>text file (notepad), and save it as Unicode, it saves the Unicode character
>U+2460. This Unicode character does not appear in any of the Unicode 4.0 JIS
>mappings: JIS0201.txt, JIS0208.txt, JIS0212.txt, or SHIFTJIS.txt (Unicode
>4.0 CD: \Mappings\EASTASIA\JIS). (To find a mapping for it, you need to go
>to \Mappings\VENDORS\Microsoft\WINDOWS\CP932.txt.)
>
>So when at least some software such as Oracle, for example, tries to convert
>that character for storing in a Shift-JIS or EUC database, it fails to find
>a mapping, and replaces it with the substitution character.
>
>It's certainly conceivable that some software (like, apparently, Sourav's
>telnet client if he was running it on Windows) does some round-trip mapping
>other than what's shown in the Unicode 4.0 tables. I'd be very interested to
>learn which JIS characters are being mapped to. Sourav: can you supply the
>hex value of the EUC character you find in the text file when you enter
>circle-1?
>
>Steve
>
>Steve Billings
>Global 360
>Software Internationalization & Localization
>http://www.global360.com/
>Office: 978-266-1604
>Cell: 978-697-8201
>
>-----Original Message-----
>From: www-international-request@w3.org
>[mailto:www-international-request@w3.org]On Behalf Of Ienup Sung
>Sent: Tuesday, November 11, 2003 12:41 PM
>To: www-international@w3c.org
>Subject: Re: [Fwd: Solaris box with ja as locale supports Roman numbers,
>Circled numbers in Japanese strings]
>
>
>Hello,
>
>Many Roman numerals and circled numbers are a part of JIS X 0208
>and also a part of SJIS and so any Japanese EUC and Shift_JIS/PCK locales
>will support the characters and that includes Japanese locales in Solaris.
>And ISO-2022-JP also has JIS X 0208.
>
>With regards,
>
>Ienup
>
>
>] Subject: Solaris box with ja as locale supports Roman numbers, Circled
>] numbers in Japanese strings
>] Resent-Date: Mon, 10 Nov 2003 06:42:58 -0500 (EST)
>] Resent-From: www-international@w3.org
>] Date: Mon, 10 Nov 2003 02:42:41 -0500
>] From: souravm <souravm@infosys.com> (by way of Martin Duerst
>] <duerst@w3.org>)
>] To: www-international@w3.org
>]
>]
>] Hi Steve (and all),
>]
>] I'm observing something funny in Solaris box related to the issue of
>] support for Roman numbers and Circled numbers in Japanese string by EUC-JP,
>] which we discussed previously.
>]
>] I'm having a solaris box 2.8. There I'm setting ja as locale (LANG=ja,
>] LC_ALL=ja) which is supposed to be EUC-Jp equivalent in Solaris. I'm
>] accessing the Solaris box from a telnet client - there also I'm setting the
>] encoding as EUC-JP.
>]
>] Now I'm trying to type those circled numbers and Roman numbers through the
>] telnet client in - a) Command Prompt, b) In a file opened in VI editor.
>]
>] The observation is - I'm successfully able to type (in both command prompt
>] and VI editor) and store those characters (in VI editor).
>]
>] Based on our previous understanding EUC-JP is not supposed to support these
>] characters. In that case I don't know how do we rationalize above
>] observation.
>]
>] Any clue ?
>]
>] Regards,
>] Sourav
>]
>] -----Original Message-----
>] From: Steve Billings [mailto:billings@global360.com]
>] Sent: Thursday, October 23, 2003 2:48 AM
>] To: souravm; www-international@w3.org
>] Subject: RE: Query on Encoding supporting Roman numbers, Circled numbers in
>] Japanese strings
>]
>] Those characters are non-JIS-standard characters (therefore not in
>] ISO-2022-JP or EUC-JP) that exist in Microsoft CP932 (the Japanese Windows
>] codepage). In other words: yes, you are correct.
>]
>] Steve
>]
>]
>] Steve Billings
>] Global 360
>] Software Internationalization & Localization
>] http://www.global360.com/
>] Office: 978-266-1604
>] Cell: 978-697-8201
>]
>] -----Original Message-----
>] From: www-international-request@w3.org
>] [mailto:www-international-request@w3.org]On Behalf Of souravm (by way of
>] Martin Duerst <duerst@w3.org>)
>] Sent: Wednesday, October 22, 2003 12:17 PM
>] To: www-international@w3.org
>] Subject: Query on Encoding supporting Roman numbers, Circled numbers in
>] Japanese strings
>]
>]
>] Hi All,
>]
>] I've a simple application which accepts Japanese string from a HTML form
>] and then show the same string in the response page.
>]
>] Now if I enter Roman characters like I, II, etc and Circled numbers like
>] ①、② etc as a part of Japanese string, the string is properly shown back
>] in response page when the encoding used is UTF-8. However, the same thing
>] does not work in case of EUC_JP, Shift_JIS and ISO-2022-JP as encoding.
>]
>] I believe these characters are not supported in EUC_JP, Shift_JIS and
>] ISO-2022_jp. Can anyone please confirm it ?
>]
>] Regards,
>] Sourav
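
For anyone who wants to check the mappings discussed in this thread locally, here is a minimal Python sketch. It is only an illustration under stated assumptions: it relies on the codecs shipped with CPython (cp932, shift_jis, euc_jp, iso2022_jp), which may not match the tables used by Solaris, Oracle, or a given telnet client, and the helper names (try_encode, kuten_to_euc, kuten_to_sjis) are made up for this example. It probes which codecs have a mapping for U+2460 and computes the bytes that row 13, cell 1 of the 94x94 grid would get in EUC-JP and Shift_JIS.

```python
# Sketch: probe codec support for CIRCLED DIGIT ONE (U+2460) and compute
# the byte values for a row/cell position such as "NEC Row 13", cell 1.
# Helper names are hypothetical; results reflect CPython's codecs only.

CIRCLED_ONE = "\u2460"


def try_encode(text, codec):
    """Return the encoded bytes, or None if the codec has no mapping."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


def kuten_to_euc(row, cell):
    """EUC-JP bytes for a row/cell (1-94) position in code set 1."""
    return bytes([0xA0 + row, 0xA0 + cell])


def kuten_to_sjis(row, cell):
    """Shift_JIS bytes for a row/cell (1-94) position."""
    # Two rows share one lead byte; rows above 62 skip to the 0xE0 range.
    lead = (row + 1) // 2 + (0x80 if row <= 62 else 0xC0)
    if row % 2:
        # Odd rows use trail bytes 0x40-0x9E, skipping 0x7F.
        trail = cell + 0x3F if cell <= 63 else cell + 0x40
    else:
        # Even rows use trail bytes 0x9F-0xFC.
        trail = cell + 0x9E
    return bytes([lead, trail])


if __name__ == "__main__":
    for codec in ("cp932", "shift_jis", "euc_jp", "iso2022_jp"):
        encoded = try_encode(CIRCLED_ONE, codec)
        shown = encoded.hex() if encoded is not None else "no mapping"
        print(f"{codec:12} {shown}")

    # Where row 13, cell 1 would land if an implementation maps it.
    print("row 13 cell 1 as EUC-JP:   ", kuten_to_euc(13, 1).hex())
    print("row 13 cell 1 as Shift_JIS:", kuten_to_sjis(13, 1).hex())
```

Comparing the computed row 13 bytes with the hex value actually stored in a file (for example via od -x) is one way to tell whether a given system is using the vendor extension rather than a plain JIS X 0208 table.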
Received on Tuesday, 11 November 2003 18:00:39 UTC