- From: Erik van der Poel <erik@netscape.com>
- Date: Fri, 12 Jul 1996 11:04:48 -0700
- To: Olle Jarnefors <ojarnef@admin.kth.se>
- Cc: iesg@ietf.org, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Olle Jarnefors wrote: > > b) It's unclear what charset registration the preferred > MIME name "GB2312" shall designate: the ISO-registered > character set GB_2312-80 (MIBenum: 57) or the only > incompletely described GB2312 (MIBenum: 2025), which > of course already has the proposed preferred MIME name > as it's principal name. It is also possible that these > two registrations actually refer to exactly the same > character set, and should be merged in the IANA registry. No. While I agree that the name "gb2312" is lousy ("euc-cn" would have been better), the descriptions provided with the IANA registrations for gb2312 and gb_2312-80 are: Name: GB_2312-80 [RFC1345,KXS2] MIBenum: 57 Source: ECMA registry Alias: iso-ir-58 Alias: chinese Alias: csISO58GB231280 Name: GB2312 MIBenum: 2025 Source: Chinese for People's Republic of China (PRC) mixed one byte, two byte set: 20-7E = one byte ASCII A1-FE = two byte PRC Kanji See GB 2312-80 PCL Symbol Set Id: 18C Alias: csGB2312 Since the GB_2312-80 entry mentions iso-ir-58, it is clear that this is a single character set in the ISO 2022 sense. I.e. this charset does not include the single-byte ASCII characters. The gb2312 entry explicitly mentions single-byte ASCII, so it is clear that this is referring to the charset actually used on the net, whose name might more properly be "euc-cn" (to follow euc-kr's and euc-jp's lead). By the way, there is an RFC (1922) that proposes even more names for some of the same things: ftp://ds.internic.net/rfc/rfc1922.txt This document mentions "cn-gb" and "cn-big5", which appear to be the same as "gb2312" and "big5" respectively, but they have not been added to the IANA registry yet. Anarchy and chaos rule. Erik
Received on Friday, 12 July 1996 11:12:45 UTC