- From: MURATA Makoto <murata@apsdc.ksp.fujixerox.co.jp>
- Date: Mon, 07 Sep 1998 10:38:56 +0900
- To: ietf-charsets@iana.org
1. Summary

I propose to clarify the registration of "Shift_JIS" and "Windows-31J".

The coded character sets of "Shift_JIS" should be JIS X 0201:1997 and JIS X 0208:1997, and Appendix 1 of JIS X 0208:1997 should be explicitly referenced.

The coded character sets of "Windows-31J" further contain NEC special characters (Row 13), NEC selection of IBM extensions (Rows 89 to 92), and IBM extensions (Rows 115 to 119). Code Page 932 should be explicitly referenced.

Charset name(s):
    Shift_JIS (MS_Kanji, csShiftJIS)
    Windows-31J (csWindows31J)

Published specification(s):
    Shift_JIS: JIS X 0208:1997
    Windows-31J: Microsoft Code Page 932
    (ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT)

Person & email address to contact for further information:
    MURATA Makoto (murata@fxis.fujixerox.co.jp)

2. Background

In 1982, a Japanese company, ASCII, invented Shift JIS. It was first used for MBASICplus, which was a variation of MS-BASIC. The software platform of MBASICplus was CP/M-86 and the hardware platform was the Mitsubishi MULTI-16. The coded character set of Shift JIS was JIS X 0201 + JIS X 0208:1978 (formerly called JIS C 6226:1978).

In 1983, ASCII, Mitsubishi, Japan IBM, and Microsoft agreed to use Shift JIS for the internal representation of Japanese text on personal computers. Later, many companies (e.g., NEC, Apple, DEC, and IBM) adopted Shift JIS as a basis, but developed their own variations by introducing additional characters.

In 1997, JIS X 0208 standardized Shift JIS in its Appendix 1, where it is clearly stated that the coded character set is JIS X 0201 + JIS X 0208:1997.

The Unicode Consortium publishes a mapping table between Shift JIS and Unicode 1.1. The URL is:
ftp://ftp.unicode.org/Public/MAPPINGS/EASTASIA/JIS/SHIFTJIS.TXT
Again, the coded character set of Shift JIS in this mapping table is JIS X 0201 + JIS X 0208.

Meanwhile, Microsoft developed a variation of Shift JIS (CP932).
This variation contains NEC special characters (Row 13), NEC selection of IBM extensions (Rows 89 to 92), and IBM extensions (Rows 115 to 119). A mapping table between CP932 and Unicode is also available from the Unicode Consortium. The URL is:
ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT

Many other companies have their own variations of Shift JIS. For example, Apple's variation is available at:
ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/JAPANESE.TXT
This variation has more than 300 extended characters, which are not compatible with the Microsoft variation. For information about other variations, see http://www.opengroup.or.jp/jvc/cde/sjis-e.html.

3. Current registration

Charset "Shift_JIS" is registered as follows:

>Name: Shift_JIS (preferred MIME name)
>MIBenum: 17
>Source: A Microsoft code that extends csHalfWidthKatakana to include
>        kanji by adding a second byte when the value of the first
>        byte is in the ranges 81-9F or E0-EF.
>Alias: MS_Kanji
>Alias: csShiftJIS

where csHalfWidthKatakana is registered as follows:

>Name: JIS_X0201 [RFC1345,KXS2]
>MIBenum: 15
>Source: JIS X 0201-1976. One byte only, this is equivalent to
>        JIS/Roman (similar to ASCII) plus eight-bit half-width
>        Katakana
>Alias: X0201
>Alias: csHalfWidthKatakana

Observe that "kanji" in the registration of "Shift_JIS" is unclear. (It could be JIS X 0208, JIS X 0212, Big 5, or whatever.)

However, another charset, "Windows-31J", is registered as follows:

>Name: Windows-31J
>MIBenum: 2024
>Source: Windows Japanese. A further extension of csShiftJIS
>        to include several OEM-specific kanji extensions.
>        Like csShiftJIS, it adds a second byte when the value
>        of the first byte is in the ranges 81-9F or E0-EF.
>        PCL Symbol Set id: 19K
>Alias: csWindows31J

Clearly, "Windows-31J" is different from "Shift_JIS", and the difference is the "OEM-specific kanji extensions".
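As a rough illustration only (it is not part of either registration), the row numbers mentioned above can be turned into byte values with the usual Shift JIS row/cell arithmetic, and the two charsets can then be compared on the result. The sketch below is in Python. The names kuten_to_sjis and try_decode are hypothetical helpers introduced here, and the sketch assumes that Python's built-in "shift_jis" and "cp932" codecs are reasonable stand-ins for the JIS X 0208-based charset and Microsoft Code Page 932, respectively.

```python
from typing import Optional


def kuten_to_sjis(row: int, cell: int) -> bytes:
    """Map a row/cell position to a two-byte Shift JIS code.

    Rows 1-94 are the JIS X 0208 range. Rows 95-120 continue the same
    arithmetic and cover the extension rows (115 to 119) named above.
    """
    if not (1 <= row <= 120 and 1 <= cell <= 94):
        raise ValueError("row must be in 1..120 and cell in 1..94")

    # Two consecutive rows share one lead byte. Rows above 62 jump past
    # the byte values used for single-byte half-width katakana.
    lead = (row + 1) // 2 + (0x80 if row <= 62 else 0xC0)

    if row % 2 == 1:
        # Odd row: trail bytes 40-7E and 80-9E (7F is skipped).
        trail = cell + 0x3F + (1 if cell >= 64 else 0)
    else:
        # Even row: trail bytes 9F-FC.
        trail = cell + 0x9E
    return bytes([lead, trail])


def try_decode(code: bytes, charset: str) -> Optional[str]:
    """Return the decoded text, or None if the charset rejects the bytes."""
    try:
        return code.decode(charset)
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    # Row 16 is ordinary JIS X 0208 kanji; rows 13, 89, and 115 are the
    # first rows of the three extension areas described in the text.
    for row in (16, 13, 89, 115):
        code = kuten_to_sjis(row, 1)
        print(
            f"row {row:3d} cell 1 -> {code.hex().upper()}",
            "shift_jis:", ascii(try_decode(code, "shift_jis")),
            "cp932:", ascii(try_decode(code, "cp932")),
        )
```

Under these assumptions, a position in a JIS X 0208 row decodes under both codecs, while a position in Row 13 decodes only under "cp932". That is the kind of difference the separate "Windows-31J" registration is meant to capture.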
To me, the only reasonable interpretation of "OEM-specific kanji extensions" is NEC special characters, NEC selection of IBM extensions, and IBM extensions. Thus, "kanji" in the registration of "Shift_JIS" should read "JIS X 0208 graphic characters".

4. Proposed revision

I believe that the CCS's of the MIME charset "Shift_JIS" should be JIS X 0201 and JIS X 0208. Since every vendor has its own variation of Shift JIS, we cannot adopt any one such variation as the definition of "Shift_JIS". Rather, vendor-specific extensions should be registered as separate charsets, if necessary.

Here is my revision proposal.

Name: Shift_JIS (preferred MIME name)
MIBenum: 17
Source: This charset is an extension of csHalfWidthKatakana by adding
        graphic characters in JIS X 0208. The CCS's are
        JIS X 0201:1997 and JIS X 0208:1997. The complete
        definition is shown in Appendix 1 of JIS X 0208:1997.
        This charset can be used for the top-level media type "text".
Alias: MS_Kanji
Alias: csShiftJIS

Name: Windows-31J
MIBenum: 2024
Source: Windows Japanese. A further extension of csShiftJIS to include
        NEC special characters (Row 13), NEC selection of IBM
        extensions (Rows 89 to 92), and IBM extensions (Rows 115
        to 119). The CCS's are JIS X 0201:1997, JIS X 0208:1997,
        and these extensions. This charset can be used for the
        top-level media type "text".
        PCL Symbol Set id: 19K
Alias: csWindows31J

Makoto

Fuji Xerox Information Systems
Tel: +81-44-812-7230
Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp
Received on Sunday, 6 September 1998 18:37:05 UTC