Solaris box with ja as locale supports Roman numbers, Circled numbers in Japanese strings

Hi Steve (and all),

I'm observing something odd on a Solaris box, related to the issue we 
discussed previously: EUC-JP support for Roman numerals and circled numbers 
in Japanese strings.

I have a Solaris 2.8 box. There I set ja as the locale (LANG=ja, 
LC_ALL=ja), which is supposed to be the EUC-JP equivalent on Solaris. I 
access the Solaris box from a telnet client, where I also set the 
encoding to EUC-JP.

Now I'm trying to type those circled numbers and Roman numerals through the 
telnet client in a) the command prompt and b) a file opened in the vi editor.

The observation is that I can successfully type those characters (in both 
the command prompt and the vi editor) and store them (in the vi editor).

Based on our previous understanding, EUC-JP is not supposed to support these 
characters. In that case I don't know how to rationalize the above observation.
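
One way to narrow this down is to look at the bytes vi actually stored. 
The Python sketch below is only an illustration under stated assumptions: 
the function names are hypothetical, and it assumes the file holds EUC-JP 
style bytes, where a two-byte character is the JIS row and cell each offset 
by 0xA0. It flags row 13, which JIS X 0208 leaves unassigned and which is 
where CP932 places the circled numbers and Roman numerals. A terminal or 
editor that passes bytes through without checking them against the JIS 
repertoire could store such bytes even though they fall outside standard 
EUC-JP; whether that is what happens on this box is not established here.

```python
def euc_jp_rows(data: bytes):
    """Yield (row, cell) for each two-byte EUC-JP style sequence in data."""
    i = 0
    while i < len(data):
        b = data[i]
        two_byte = (
            0xA1 <= b <= 0xFE
            and i + 1 < len(data)
            and 0xA1 <= data[i + 1] <= 0xFE
        )
        if two_byte:
            # Row and cell are the byte values minus 0xA0.
            yield (b - 0xA0, data[i + 1] - 0xA0)
            i += 2
        elif b == 0x8E:
            # SS2 prefix: half-width katakana, two bytes in total.
            i += 2
        elif b == 0x8F:
            # SS3 prefix: JIS X 0212, three bytes in total.
            i += 3
        else:
            # ASCII or an unexpected byte: skip it.
            i += 1


def has_row13(data: bytes) -> bool:
    """True if any two-byte sequence falls in row 13 (unassigned in JIS X 0208)."""
    return any(row == 13 for row, _cell in euc_jp_rows(data))


# Hypothetical usage: pass the raw bytes of the file saved from vi, e.g.
#   has_row13(open("saved_from_vi.txt", "rb").read())
```

If row 13 bytes show up, the characters were stored as a vendor extension 
rather than as part of the JIS X 0208 repertoire.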

Any clue?

Regards,
Sourav

-----Original Message-----
From: Steve Billings [mailto:billings@global360.com]
Sent: Thursday, October 23, 2003 2:48 AM
To: souravm; www-international@w3.org
Subject: RE: Query on Encoding supporting Roman numbers, Circled numbers in 
Japanese strings

Those characters are non-JIS-standard characters (therefore not in
ISO-2022-JP or EUC-JP) that exist in Microsoft CP932 (the Japanese Windows
codepage). In other words: yes, you are correct.
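
A minimal sketch for checking this with Python's bundled codecs follows. 
The helper name and the sample table are illustrative assumptions. Python's 
codec tables are one implementation of these charsets, and other converters 
(for example a platform's iconv) may map vendor extensions differently.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character in text is encodable in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Sample characters from the question: a circled number and a Roman numeral.
SAMPLES = {
    "circled digit one": "\u2460",
    "roman numeral one": "\u2160",
}

# Python codec names for the encodings discussed in this thread.
CODECS = ["utf-8", "cp932", "shift_jis", "euc_jp", "iso2022_jp"]

if __name__ == "__main__":
    for name, ch in SAMPLES.items():
        for codec in CODECS:
            status = "ok" if can_encode(ch, codec) else "not encodable"
            print(f"{name:18} {codec:11} {status}")
```

Note that Python treats shift_jis and cp932 as separate codecs, so a page 
labelled Shift_JIS but produced on Windows may in practice be CP932.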

Steve


Steve Billings
Global 360
Software Internationalization & Localization
http://www.global360.com/
Office: 978-266-1604
Cell:    978-697-8201

-----Original Message-----
From: www-international-request@w3.org
[mailto:www-international-request@w3.org]On Behalf Of souravm (by way of
Martin Duerst <duerst@w3.org>)
Sent: Wednesday, October 22, 2003 12:17 PM
To: www-international@w3.org
Subject: Query on Encoding supporting Roman numbers, Circled numbers in
Japanese strings




Hi All,

I have a simple application which accepts a Japanese string from an HTML form
and then shows the same string in the response page.

Now if I enter Roman numerals like I, II, etc. and circled numbers like
①、② etc. as part of a Japanese string, the string is properly shown back
in the response page when the encoding used is UTF-8. However, the same thing
does not work when the encoding is EUC-JP, Shift_JIS or ISO-2022-JP.

I believe these characters are not supported in EUC-JP, Shift_JIS and
ISO-2022-JP. Can anyone please confirm this?

Regards,
Sourav

Received on Monday, 10 November 2003 06:42:53 UTC