Re: List of Japanese Shift_JIS characters which are not supported in Unicode

Souravm,

> Is there anywhere an exhaustive list of Japanese characters (especially 
> Shift_JIS characters) which are not supported in Unicode ?

I'm not 100% sure, but I think all the characters in Shift_JIS,
being part of the national standard character set, are supported
by Unicode.  There are, however, things you need to be careful about.


The mapping between Shift_JIS and Unicode differs from platform to
platform.  This causes an interoperability problem.  See:

http://www.ingrid.org/java/i18n/unicode-utf8.html
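
To make the mapping difference concrete, here is a minimal sketch in
Python.  It assumes that Python's built-in "shift_jis" and "cp932"
codecs can stand in for two platforms' conversion tables; real
platforms (Java runtimes, iconv, vendor libraries) each have their own
tables, so treat this as an illustration only.  The helper names
decode_both and describe are mine, not part of any library.

```python
# Sketch: decode the same Shift_JIS bytes with two of Python's codecs.
# "shift_jis" and "cp932" stand in here for two platforms' mapping tables.

def decode_both(raw):
    """Return the text produced by the shift_jis and the cp932 codecs."""
    return raw.decode("shift_jis"), raw.decode("cp932")


def describe(text):
    """Format each character as U+XXXX so the results are easy to compare."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    raw = b"\x81\x60"  # the "wave dash" position in JIS X 0208
    jis_text, ms_text = decode_both(raw)
    print("shift_jis:", describe(jis_text))
    print("cp932:    ", describe(ms_text))
    print("same mapping?", jis_text == ms_text)
```

With the codecs bundled in CPython, the two results for these bytes
should be U+301C (WAVE DASH) and U+FF5E (FULLWIDTH TILDE), so text
converted on one side may not convert back on the other.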


If you mean the Microsoft extension to Shift_JIS, code page 932,
rather than Shift_JIS proper as defined in JIS X 0208,
then there is a round-trip conversion issue, because
code page 932 includes many duplicated characters in its
extension areas.  This paper summarizes the issue:

http://www.opengroup.or.jp/jvc/cde/ucs-conv-e.html
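
The round-trip issue can be observed with a short script.  This is
only a sketch, assuming Python's "cp932" codec behaves like the
Microsoft conversion tables; the function name find_lossy_round_trips
is hypothetical.  It scans the double-byte code points, converts each
one to Unicode and back, and collects those that come back as
different bytes.

```python
# Sketch: find cp932 double-byte codes that do not survive a
# bytes -> Unicode -> bytes round trip.  Python's "cp932" codec is
# assumed to approximate the Microsoft conversion tables.

LEAD_BYTES = list(range(0x81, 0xA0)) + list(range(0xE0, 0xFD))
TRAIL_BYTES = list(range(0x40, 0x7F)) + list(range(0x80, 0xFD))


def find_lossy_round_trips(codec="cp932"):
    """Return (original, round_tripped) byte pairs that differ."""
    lossy = []
    for lead in LEAD_BYTES:
        for trail in TRAIL_BYTES:
            original = bytes([lead, trail])
            try:
                text = original.decode(codec)
            except UnicodeDecodeError:
                continue  # unassigned code point in this codec
            if len(text) != 1:
                continue  # only look at codes that map to one character
            try:
                round_tripped = text.encode(codec)
            except UnicodeEncodeError:
                continue  # decodable but not encodable; skip in this sketch
            if round_tripped != original:
                lossy.append((original, round_tripped))
    return lossy


if __name__ == "__main__":
    pairs = find_lossy_round_trips()
    print(len(pairs), "codes change during a round trip")
    for original, round_tripped in pairs[:5]:
        char = original.decode("cp932")
        print(original.hex(), "->", f"U+{ord(char):04X}", "->", round_tripped.hex())
```

Each pair reported is a duplicate: both byte sequences decode to the
same Unicode character, but only one of them is chosen when converting
back.  An application that compares or stores the original bytes needs
to account for that.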


There isn't a perfect solution for either of these two issues.
You'd have to decide what to do depending on the
needs of the specific application.

-- 
KUROSAKA ("Kuro") Teruhiko, San Francisco, California, USA
Internationalization Consultant
http://www.bhlab.com/

Received on Monday, 11 October 2004 20:31:25 UTC