
Re: [i18n-prog] RE: [Fwd: Solaris box with ja as locale supports Roman numbers, Circled numbers in Japanese strings]

From: <Xueming.Shen@Sun.COM> (by way of Martin Duerst)
Date: Tue, 11 Nov 2003 17:41:10 -0500
Message-Id: <4.2.0.58.J.20031111174103.05ad5dd8@localhost>
To: www-international@w3.org




Steve,

They are "NEC Row 13" characters, which are NOT part of JIS X 0208 but are
supported by different vendors for "compatibility" reasons. See man eucJP
on Solaris for details. Windows also has them mapped to rows 89-92 of its
SJIS.
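
As a rough illustration of where a row 13 character lands in each encoding, the sketch below applies the usual row/cell (kuten) arithmetic for 7-bit JIS, EUC-JP and Shift_JIS. The function name kuten_to_bytes is made up for this example, and circled 1 is assumed to sit at row 13, cell 1. The arithmetic only gives the byte position; whether a given vendor table assigns a character there is a separate question.

```python
def kuten_to_bytes(row, cell):
    """Return (7-bit JIS, EUC-JP, Shift_JIS) byte pairs for a 94x94 row/cell.

    Hypothetical helper for illustration only: pure arithmetic, no lookup
    in any mapping table.
    """
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must both be in 1..94")

    # 7-bit JIS (the bytes that follow ESC $ B in ISO-2022-JP): add 0x20.
    jis = bytes([row + 0x20, cell + 0x20])

    # EUC-JP: the same pair with the high bit set, i.e. add 0xA0.
    euc = bytes([row + 0xA0, cell + 0xA0])

    # Shift_JIS: two rows share one lead byte; lead bytes jump from
    # 0x9F to 0xE0 after row 62.
    lead = (row + 1) // 2 + (0x80 if row <= 62 else 0xC0)
    if row % 2 == 1:
        # Odd row: trail bytes start at 0x40 and skip 0x7F.
        trail = cell + 0x3F + (1 if cell >= 64 else 0)
    else:
        # Even row: trail bytes start at 0x9F.
        trail = cell + 0x9E
    sjis = bytes([lead, trail])

    return jis, euc, sjis


# Row 13, cell 1: the assumed position of circled 1.
jis, euc, sjis = kuten_to_bytes(13, 1)
print(jis.hex(), euc.hex(), sjis.hex())  # 2d21 ada1 8740
```

So the hex value to look for in an EUC file would be AD A1, and in a Windows SJIS file 87 40.
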

regards,

sherma


Steve Billings wrote:

>Ienup:
>
>[I think i18n-prog may be more appropriate for this discussion than
>www-international; can we move this discussion to i18n-prog?]
>
>
>
>>Many Roman numerals and circled numbers are a part of JIS X 0208
>>
>I don't see them in the Unicode 4.0 JIS mapping tables (that's the latest
>version I happen to have at my fingertips). Do you know their Unicode or JIS
>codepoints?
>
>When I enter circle-1 from my Windows 2000 Japanese IME (choosing the
>circled-1 character from the list of choices presented for "ichi") into a
>text file (notepad), and save it as Unicode, it saves the Unicode character
>U+2460. This Unicode character does not appear in any of the Unicode 4.0 JIS
>mappings: JIS0201.txt, JIS0208.txt, JIS0212.txt, or SHIFTJIS.txt (Unicode
>4.0 CD: \Mappings\EASTASIA\JIS). (To find a mapping for it, you need to go
>to \Mappings\VENDORS\Microsoft\WINDOWS\CP932.txt.)
>
>So when some software (Oracle, for example) tries to convert that
>character for storing in a Shift-JIS or EUC database, it fails to find
>a mapping, and replaces it with the substitution character.
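
The substitution described here can be reproduced, as a rough sketch, with Python's built-in codecs. This assumes that Python's shift_jis codec follows the JIS-based table while its cp932 codec follows the Microsoft one. The helper name encode_or_substitute is made up for this example, and nothing here reflects how Oracle actually performs its conversion.

```python
def encode_or_substitute(text, codec):
    """Encode text, writing '?' for any character the codec cannot map.

    Hypothetical helper for illustration: errors="replace" is Python's
    own substitution policy, not that of any particular database.
    """
    return text.encode(codec, errors="replace")


CIRCLED_ONE = "\u2460"  # the character the Windows 2000 IME saved

# cp932 has a mapping for U+2460; the JIS-based shift_jis codec does not.
for codec in ("cp932", "shift_jis"):
    print(codec, encode_or_substitute(CIRCLED_ONE, codec).hex())
# cp932 8740
# shift_jis 3f
```
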
>
>It's certainly conceivable that some software (like, apparently, Sourav's
>telnet client if he was running it on Windows) does some round-trip mapping
>other than what's shown in the Unicode 4.0 tables. I'd be very interested to
>learn which JIS characters are being mapped to. Sourav: can you supply the
>hex value of the EUC character you find in the text file when you enter
>circle-1?
>
>Steve
>
>Steve Billings
>Global 360
>Software Internationalization & Localization
>http://www.global360.com/
>Office: 978-266-1604
>Cell:    978-697-8201
>
>-----Original Message-----
>From: www-international-request@w3.org
>[mailto:www-international-request@w3.org]On Behalf Of Ienup Sung
>Sent: Tuesday, November 11, 2003 12:41 PM
>To: www-international@w3c.org
>Subject: Re: [Fwd: Solaris box with ja as locale supports Roman numbers,
>Circled numbers in Japanese strings]
>
>
>Hello,
>
>Many Roman numerals and circled numbers are a part of JIS X 0208
>and also a part of SJIS and so any Japanese EUC and Shift_JIS/PCK locales
>will support the characters and that includes Japanese locales in Solaris.
>And ISO-2022-JP also has JIS X 0208.
>
>With regards,
>
>Ienup
>
>
>] Subject: Solaris box with ja as locale supports   Roman numbers, Circled
>] numbers in Japanese strings
>] Resent-Date: Mon, 10 Nov 2003 06:42:58 -0500 (EST)
>] Resent-From: www-international@w3.org
>] Date: Mon, 10 Nov 2003 02:42:41 -0500
>] From: souravm <souravm@infosys.com> (by way of Martin Duerst
>] <duerst@w3.org>)
>] To: www-international@w3.org
>]
>]
>]
>]
>]
>] Hi Steve (and all),
>]
>] I'm observing something odd on a Solaris box related to the issue of
>] support for Roman numerals and circled numbers in Japanese strings by
>] EUC-JP, which we discussed previously.
>]
>] I have a Solaris 2.8 box. There I set ja as the locale (LANG=ja,
>] LC_ALL=ja), which is supposed to be the EUC-JP equivalent on Solaris. I'm
>] accessing the Solaris box from a telnet client, where I also set the
>] encoding to EUC-JP.
>]
>] Now I'm trying to type those circled numbers and Roman numerals through
>] the telnet client a) at the command prompt and b) in a file opened in the
>] vi editor.
>]
>] The observation is that I can successfully type those characters (both at
>] the command prompt and in vi) and store them (in vi).
>]
>] Based on our previous understanding, EUC-JP is not supposed to support
>] these characters. In that case, I don't know how to explain the above
>] observation.
>]
>] Any clue?
>]
>] Regards,
>] Sourav
>]
>] -----Original Message-----
>] From: Steve Billings [mailto:billings@global360.com]
>] Sent: Thursday, October 23, 2003 2:48 AM
>] To: souravm; www-international@w3.org
>] Subject: RE: Query on Encoding supporting Roman numbers, Circled numbers in
>] Japanese strings
>]
>] Those characters are non-JIS-standard characters (therefore not in
>] ISO-2022-JP or EUC-JP) that exist in Microsoft CP932 (the Japanese Windows
>] codepage). In other words: yes, you are correct.
>]
>] Steve
>]
>]
>] Steve Billings
>] Global 360
>] Software Internationalization & Localization
>] http://www.global360.com/
>] Office: 978-266-1604
>] Cell:    978-697-8201
>]
>] -----Original Message-----
>] From: www-international-request@w3.org
>] [mailto:www-international-request@w3.org]On Behalf Of souravm (by way of
>] Martin Duerst <duerst@w3.org>)
>] Sent: Wednesday, October 22, 2003 12:17 PM
>] To: www-international@w3.org
>] Subject: Query on Encoding supporting Roman numbers, Circled numbers in
>] Japanese strings
>]
>]
>]
>]
>] Hi All,
>]
>] I have a simple application which accepts a Japanese string from an HTML
>] form and then shows the same string in the response page.
>]
>] Now if I enter Roman numerals like I, II, etc. and circled numbers like
>] ①、② etc. as part of a Japanese string, the string is properly shown
>] back in the response page when the encoding used is UTF-8. However, the
>] same thing does not work with EUC_JP, Shift_JIS or ISO-2022-JP as the
>] encoding.
>]
>] I believe these characters are not supported in EUC_JP, Shift_JIS and
>] ISO-2022-JP. Can anyone please confirm this?
>]
>] Regards,
>] Sourav
>]
>]
>]
>
Received on Tuesday, 11 November 2003 18:00:39 GMT
