- From: Ienup Sung <is@mpkmail.eng.sun.com>
- Date: Tue, 11 Nov 2003 17:31:04 -0800 (PST)
- To: i18n-prog@yahoogroups.com
- Cc: www-international@w3c.org

Sorry about the confusion. I was mistaken about the VDC characters in
Solaris Japanese locales. (It has been quite a long time since I looked at
JIS X 0208.)

With regards,

Ienup

] Date: Tue, 11 Nov 2003 13:38:33 -0800
] From: Xueming Shen <xueming.shen@sun.com>
] Subject: Re: [i18n-prog] RE: [Fwd: Solaris box with ja as locale supports
]  Roman numbers, Circled numbers in Japanese strings]
] To: i18n-prog@yahoogroups.com
] Cc: Ienup Sung <Ienup.Sung@Eng.Sun.COM>, www-international@w3c.org
]
] Steve,
]
] They are "NEC Row 13" characters, which are NOT part of JIS X 0208 but are
] supported by different vendors for "compatibility" reasons. See man eucJP
] on Solaris for details. Windows also has them mapped to its SJIS rows 89-92.
]
] regards,
]
] sherma
]
]
] Steve Billings wrote:
]
] >Ienup:
] >
] >[I think i18n-prog may be more appropriate for this discussion than
] >www-international; can we move this discussion to i18n-prog?]
] >
] >>Many Roman numerals and circled numbers are a part of JIS X 0208
] >
] >I don't see them in the Unicode 4.0 JIS mapping tables (that's the latest
] >version I happen to have at my fingertips). Do you know their Unicode or JIS
] >codepoints?
] >
] >When I enter circle-1 from my Windows 2000 Japanese IME (choosing the
] >circled-1 character from the list of choices presented for "ichi") into a
] >text file (notepad), and save it as Unicode, it saves the Unicode character
] >U+2460. This Unicode character does not appear in any of the Unicode 4.0 JIS
] >mappings: JIS0201.txt, JIS0208.txt, JIS0212.txt, or SHIFTJIS.txt (Unicode
] >4.0 CD: \Mappings\EASTASIA\JIS). (To find a mapping for it, you need to go
] >to \Mappings\VENDORS\Microsoft\WINDOWS\CP932.txt.)
] >
] >So when at least some software, such as Oracle, tries to convert
] >that character for storing in a Shift-JIS or EUC database, it fails to find
] >a mapping, and replaces it with the substitution character.
] >
] >It's certainly conceivable that some software (like, apparently, Sourav's
] >telnet client if he was running it on Windows) does some round-trip mapping
] >other than what's shown in the Unicode 4.0 tables. I'd be very interested to
] >learn which JIS characters are being mapped to. Sourav: can you supply the
] >hex value of the EUC character you find in the text file when you enter
] >circle-1?
] >
] >Steve
] >
] >Steve Billings
] >Global 360
] >Software Internationalization & Localization
] >http://www.global360.com/
] >Office: 978-266-1604
] >Cell: 978-697-8201
] >
] >-----Original Message-----
] >From: www-international-request@w3.org
] >[mailto:www-international-request@w3.org]On Behalf Of Ienup Sung
] >Sent: Tuesday, November 11, 2003 12:41 PM
] >To: www-international@w3c.org
] >Subject: Re: [Fwd: Solaris box with ja as locale supports Roman numbers,
] >Circled numbers in Japanese strings]
] >
] >
] >Hello,
] >
] >Many Roman numerals and circled numbers are a part of JIS X 0208
] >and also a part of SJIS and so any Japanese EUC and Shift_JIS/PCK locales
] >will support the characters and that includes Japanese locales in Solaris.
] >And ISO-2022-JP also has JIS X 0208.
] >
] >With regards,
] >
] >Ienup
] >
] >
] >] Subject: Solaris box with ja as locale supports Roman numbers, Circled
] >]  numbers in Japanese strings
] >] Resent-Date: Mon, 10 Nov 2003 06:42:58 -0500 (EST)
] >] Resent-From: www-international@w3.org
] >] Date: Mon, 10 Nov 2003 02:42:41 -0500
] >] From: souravm <souravm@infosys.com> (by way of Martin Duerst
] >]  <duerst@w3.org>)
] >] To: www-international@w3.org
] >]
] >] Hi Steve (and all),
] >]
] >] I'm observing something funny on a Solaris box related to the issue of
] >] support for Roman numbers and circled numbers in Japanese strings by
] >] EUC-JP, which we discussed previously.
] >]
] >] I have a Solaris 2.8 box. There I'm setting ja as the locale (LANG=ja,
] >] LC_ALL=ja), which is supposed to be the EUC-JP equivalent in Solaris. I'm
] >] accessing the Solaris box from a telnet client, where I'm also setting
] >] the encoding to EUC-JP.
] >]
] >] Now I'm trying to type those circled numbers and Roman numbers through the
] >] telnet client a) at the command prompt and b) in a file opened in the vi
] >] editor.
] >]
] >] The observation is that I'm able to type those characters (both at the
] >] command prompt and in vi) and to store them (in vi).
] >]
] >] Based on our previous understanding, EUC-JP is not supposed to support
] >] these characters. In that case I don't know how to rationalize the above
] >] observation.
] >]
] >] Any clue?
] >]
] >] Regards,
] >] Sourav
] >]
] >] -----Original Message-----
] >] From: Steve Billings [mailto:billings@global360.com]
] >] Sent: Thursday, October 23, 2003 2:48 AM
] >] To: souravm; www-international@w3.org
] >] Subject: RE: Query on Encoding supporting Roman numbers, Circled numbers
] >] in Japanese strings
] >]
] >] Those characters are non-JIS-standard characters (therefore not in
] >] ISO-2022-JP or EUC-JP) that exist in Microsoft CP932 (the Japanese Windows
] >] codepage). In other words: yes, you are correct.
] >]
] >] Steve
] >]
] >] -----Original Message-----
] >] From: www-international-request@w3.org
] >] [mailto:www-international-request@w3.org]On Behalf Of souravm (by way of
] >] Martin Duerst <duerst@w3.org>)
] >] Sent: Wednesday, October 22, 2003 12:17 PM
] >] To: www-international@w3.org
] >] Subject: Query on Encoding supporting Roman numbers, Circled numbers in
] >] Japanese strings
] >]
] >] Hi All,
] >]
] >] I have a simple application which accepts a Japanese string from an HTML
] >] form and then shows the same string in the response page.
] >]
] >] Now if I enter Roman characters like I, II, etc. and circled numbers like
] >] ①、② etc. as part of a Japanese string, the string is properly shown
] >] back in the response page when the encoding used is UTF-8. However, the
] >] same thing does not work with EUC_JP, Shift_JIS or ISO-2022-JP as the
] >] encoding.
] >]
] >] I believe these characters are not supported in EUC_JP, Shift_JIS and
] >] ISO-2022-JP. Can anyone please confirm it?
] >]
] >] Regards,
] >] Sourav
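
The mapping gap discussed in this thread can be reproduced with a minimal Python sketch. It assumes Python's bundled codecs ("cp932", "shift_jis", "euc_jp", "iso2022_jp") as stand-ins for the converters mentioned above; they are not the Solaris, Oracle, or Windows implementations, and the helper names (try_encode, kuten_to_euc, kuten_to_sjis) are hypothetical. The row/cell helpers only show the standard byte arithmetic for a character placed at row 13, cell 1; whether a given EUC locale actually assigns circled one to that position is an assumption to check against its own documentation (for example man eucJP on Solaris).

```python
# Sketch: probe how Python's bundled Japanese codecs treat U+2460
# (CIRCLED DIGIT ONE). Stand-ins only, not the vendors' converters.

CIRCLED_ONE = "\u2460"


def try_encode(text, codec):
    """Return the encoded bytes, or None if the codec has no mapping."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


def kuten_to_euc(row, cell):
    """Convert a row/cell pair (1..94 each) to two EUC-JP bytes."""
    return bytes([row + 0xA0, cell + 0xA0])


def kuten_to_sjis(row, cell):
    """Convert a row/cell pair (1..94 each) to two Shift_JIS bytes."""
    lead = (row + 1) // 2 + (0x80 if row <= 62 else 0xC0)
    if row % 2 == 1:
        trail = cell + 0x3F
        if trail >= 0x7F:  # 0x7F is skipped in the trail-byte range
            trail += 1
    else:
        trail = cell + 0x9E
    return bytes([lead, trail])


if __name__ == "__main__":
    for codec in ("cp932", "shift_jis", "euc_jp", "iso2022_jp"):
        result = try_encode(CIRCLED_ONE, codec)
        print(codec, result.hex() if result else "no mapping")

    # errors="replace" mimics the substitution behaviour Steve describes.
    print(CIRCLED_ONE.encode("shift_jis", errors="replace"))

    # Byte values for a character placed at row 13, cell 1.
    print(kuten_to_euc(13, 1).hex(), kuten_to_sjis(13, 1).hex())
```

The Shift_JIS bytes computed for row 13, cell 1 decode back to U+2460 under "cp932", which agrees with the CP932.txt mapping Steve points to, while the strict JIS X 0208-based codecs report no mapping for the same character.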
Received on Tuesday, 11 November 2003 20:31:47 UTC