
Re: Serious bug on www.microsoft.com -> is GB_2312-80 widely accepted?

From: Sam Sun <ssun@CNRI.Reston.Va.US>
Date: Fri, 21 Nov 1997 14:27:24 -0500
To: <erik@netscape.com>
Cc: "www international" <www-international@w3.org>, "Unicode Discussion" <unicode@unicode.org>
Message-ID: <01bcf6b3$7edef0d0$29019784@ssun.CNRI.Reston.Va.US>
Thanks for clearing this up for me! I got confused because I had my
browser's default encoding set to GB2312.

So "gb_2312-80" is a correct charset name, and Front Page did it right.

One more problem, though. It seems that Netscape Communicator doesn't
recognize "gb_2312-80", only "gb2312". IE 4.0 supports both.

I just created a web page which can be used to test against these tags, and
it's at: http://ssun.cnri.reston.va.us/gb2312/index.html

So, is GB_2312-80 a widely accepted name?

I'm new to the list, so please let me know if this question shouldn't be
raised here.


-----Original Message-----
From: Erik van der Poel <erik@netscape.com>
To: Sam Sun <ssun@CNRI.Reston.Va.US>
Cc: www international <www-international@w3.org>; Unicode Discussion <unicode@unicode.org>
Date: Friday, November 21, 1997 12:25 PM
Subject: Re: Serious bug on www.microsoft.com

>The Internet charset registry is at:
>The GB 2312-related entries are as follows:
>Name: GB_2312-80                                        [RFC1345,KXS2]
>MIBenum: 57
>Source: ECMA registry
>Alias: iso-ir-58
>Alias: chinese
>Alias: csISO58GB231280
>
>Name: GB2312  (preferred MIME name)
>MIBenum: 2025
>Source: Chinese for People's Republic of China (PRC) mixed one byte,
>        two byte set:
>          20-7E = one byte ASCII
>          A1-FE = two byte PRC Kanji
>        See GB 2312-80
>        PCL Symbol Set Id: 18C
>Alias: csGB2312
>
>Name: HZ-GB-2312
>MIBenum: 2085
>Source: RFC 1842, RFC 1843                              [RFC1842, RFC1843]
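The byte ranges listed in the "GB2312" entry can be illustrated with a minimal Python sketch. It assumes that both bytes of a two-byte character fall in A1-FE (the entry only says "A1-FE = two byte") and it rejects everything else, including control characters such as newlines. The function name split_gb2312 is hypothetical.

```python
def split_gb2312(data: bytes):
    """Split a byte string into one-byte and two-byte units.

    Ranges follow the quoted registry entry:
      20-7E = one-byte ASCII
      A1-FE = two-byte character (assumed here to apply to both bytes)
    """
    units = []
    i = 0
    while i < len(data):
        b = data[i]
        if 0x20 <= b <= 0x7E:
            # Single-byte ASCII character.
            units.append(data[i:i + 1])
            i += 1
        elif (0xA1 <= b <= 0xFE and i + 1 < len(data)
              and 0xA1 <= data[i + 1] <= 0xFE):
            # Lead byte followed by a trail byte in the same range.
            units.append(data[i:i + 2])
            i += 2
        else:
            raise ValueError(
                f"byte 0x{b:02X} at offset {i} is outside the listed ranges")
    return units


# Three ASCII bytes followed by two two-byte units.
print(split_gb2312(b"GB \xd6\xd0\xce\xc4"))
```

A charset without the single-byte range, such as the one registered as GB_2312-80, would have no way to represent the "GB " prefix in this example.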
>As you can see, "GB2312" is the name of the charset that also contains
>single-byte ASCII characters. This is the charset that is used in many
>places, including Web pages. GB_2312-80 has an alias "iso-ir-58", which
>means that it is registration number 58 in ISO's registry, and this
>character set does not include single-byte ASCII characters, so this is
>not the charset that is used on the Internet. The "HZ-GB-2312" charset is
>a 7-bit encoding of GB 2312, used in some places such as Usenet newsgroups.
>Summary: "GB2312" is the correct name.
>(It is case-insensitive, so "gb2312" is also correct.)
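The case-insensitive name matching described above can be sketched in Python as follows. The names, aliases and MIBenum values are copied from the quoted registry entries; the dictionary layout and the function name registered_name are illustrative assumptions, not any browser's actual implementation.

```python
# Registry entries copied from the quoted listing. The data layout is an
# assumption made for this example only.
REGISTRY = {
    "GB_2312-80": {"mibenum": 57,
                   "aliases": ["iso-ir-58", "chinese", "csISO58GB231280"]},
    "GB2312": {"mibenum": 2025, "aliases": ["csGB2312"]},
    "HZ-GB-2312": {"mibenum": 2085, "aliases": []},
}

# Build a lowercase index so that lookups ignore case.
_INDEX = {}
for name, entry in REGISTRY.items():
    for label in [name, *entry["aliases"]]:
        _INDEX[label.lower()] = name


def registered_name(label):
    """Return the registered charset name for a label, or None if unknown."""
    return _INDEX.get(label.strip().lower())


for label in ("gb2312", "GB_2312-80", "iso-ir-58", "gb-2312-80"):
    print(label, "->", registered_name(label))
```

Note that the lookup keeps "GB_2312-80" and "GB2312" as separate charsets, since the registry lists them separately, and that "gb-2312-80" (with a hyphen after "gb") matches no entry.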
>Sam Sun wrote:
>> There is a similar bug in Front Page 97, Microsoft's web authoring
>> tool.
>> When used to generate HTML documents using Simplified Chinese character
>> encoding, it uses the illegal charset name "gb_2312-80".
>> I believe the right charset name should be "gb-2312-80". Note that it's
>> not an underscore character between "gb" and "2312", but a hyphen
>> character.
Received on Friday, 21 November 1997 14:24:42 UTC
