Re: Registering GBK and GB18030 in the IANA charset registry - draft nov09

From: Markus Scherer <markus.scherer@jtcsv.com>
Date: Mon, 12 Nov 2001 17:44:28 -0800
To: Anthony Fok <anthony@thizlinux.com>
Cc: ietf-charsets@iana.org, Kevin Lau <kevin@thizlinux.com>, Fai <fai@thizlinux.com>, Bruno Haible <haible@ilog.fr>, James Su <suzhe@turbolinux.com.cn>, Shouhua Wang <shwang@sonata.iscas.ac.cn>, Jian Wu <jwu@sonata.iscas.ac.cn>, Leon Zhang <leon@xteamlinux.com.cn>, Yu Guanghui <ygh@dlut.edu.cn>, Roger So <roger.so@sw-linux.com>, Pablo Saratxaga <pablo@mandrakesoft.com>, zhaoway <zw@debian.org>, Yu Mingjian <yumingjian@china.com>, Chen Xiangyang <chenxy@sun.ihep.ac.cn>, Dirk Meyer <dmeyer@adobe.com>, Ken Lunde <lunde@adobe.com>, li18nux2000@li18nux.org, bsd-locale@haun.org
Message-id: <3BF07AFC.8080909@jtcsv.com>

Anthony Fok wrote:

> Here is a first draft for the GB18030 registration.  Please edit it
> to make it look good.  :-)


You got comments... ;-)


> ...
> 3. A brief summary of the GB18030 codepoints is listed below:
> 
> 	1-byte:  {00-7E}		Same as ASCII (ISO 646)


Please change this to
	1-byte:  {00-7F}		Same as US-ASCII (ISO 646 IRV (1991))
                       ^                         ^^^               ^^^

The DEL control at 7F is part of the one-byte range, and the reference to ASCII should be more specific.

ISO 646 defines not only the International Reference Version (IRV, which is the US version) but also several national variants.
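
The corrected range can be checked against an existing converter. The following is only a minimal sketch: it assumes Python's built-in "gb18030" codec as a stand-in implementation, and the two helper names are made up for illustration. It checks that every byte in 00-7F (DEL included) decodes to the same character as in US-ASCII, and that no lone byte above 7F is accepted as a complete one-byte code.

```python
# Sketch only: uses Python's built-in "gb18030" codec as a stand-in
# implementation. The helper names are hypothetical.

def single_byte_matches_ascii() -> bool:
    """True if bytes 00-7F decode identically as GB18030 and US-ASCII."""
    for b in range(0x80):
        raw = bytes([b])
        if raw.decode("gb18030") != raw.decode("ascii"):
            return False
    return True


def is_single_byte_code(b: int) -> bool:
    """True if the lone byte b is accepted as a complete GB18030 code."""
    try:
        return len(bytes([b]).decode("gb18030")) == 1
    except UnicodeDecodeError:
        # Bytes above 7F only occur inside 2-byte or 4-byte codes,
        # so on their own they form an incomplete sequence.
        return False


if __name__ == "__main__":
    print("00-7F same as US-ASCII:", single_byte_matches_ascii())
    print("7F (DEL) is a 1-byte code:", is_single_byte_code(0x7F))
    print("80 is a 1-byte code:", is_single_byte_code(0x80))
```

If the codec behaves as assumed, exactly 128 byte values pass is_single_byte_code, which is the {00-7F} range rather than {00-7E}.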


> ...
> 4. This registration request is specifically made since proper
>    registration will facilitate development of Operating Systems and
>    Internet-related software supporting GB18030.
> 
> 
> Registration Requirements (as defined in RFC 2278)
> ==================================================


--> Registration Requirements (as defined in RFC 2978)

The RFC for IANA charset registration was updated from RFC 2278 to RFC 2978 (ftp://ftp.isi.edu/in-notes/rfc2978.txt).

> ...
> 9. Usage and implementation requirements
> 
>    The GB 18030-2000 standard wasIt has been created by the Chinese


This line needs some editing... ;-)


>    government with inputs from major vendors and institutions to
>    ensure full ISO 10646 / Unicode 3.x compatibility while
>    providing a smooth migration path from GB2312 and GBK encodings.
> 
>    The proposed "GB18030" charset has been a mandatory standard in
>    Mainland China since August 31, 2001.  All operating systems
>    sold on or after that date must support GB18030.  Several major


As far as I know, the requirement takes effect "on or after September 1, 2001", that is, "after August 31, 2001", and not on August 31 itself.


>    operating systems already come with full GB18030 support, and
>    other vendors will finish the transition in coming months.
>    User-end applications will also fully support GB18030 in coming years.
>    Full GB18030 fonts have been available for some time now.
> 
>    GB18030 is also China's solution to support minority ethnic languages
>    such as Mongolian, Tibetan and Uyghur, as well as all other languages
>    of the world.  It also fulfills the needs of ancient Chinese literature
>    research, libraries, geography, names, and more.
> 
> 


Your draft is missing one item required for the registration: you need to state that this charset's "Intended Usage" is "COMMON", for example at the beginning of your section 9. See "Intended Usage" in section 5, "Charset Registration Template", of RFC 2978.

> 10. Publication requirements
> 
>     The proposed charset has been published by the Chinese Government in
>     March 2000.  The standard was revised in November 2000.  Newer revisions
>     will be published to follow updates in Unicode and ISO 10646 standards.
> 
>     Dirk Meyer <dmeyer@adobe.com> has kindly translated and provided
>     comments of the GB 18030-2000 standard in English.  While unofficial,
>     it is indeed the authorative document on GB18030.  An on-line copy
>     is available at:
> 
> 	ftp://ftp.oreilly.com/pub/examples/nutshell/cjkv/pdf/GB18030_Summary.pdf
> 
>     Markus Scherer (IBM) has also written some technical documentation:
> 
> 	http://oss.software.ibm.com/cvs/icu/~checkout~/charset/source/gb18030/gb18030.html
> 
>     The full XML definition of the GB18030 charset is defined here:
> 
> 	http://oss.software.ibm.com/cvs/icu/~checkout~/charset/data/xml/gb-18030-2000.xml


(Thanks for mentioning material from Dirk and me :-)
You might also want to point to my less technical overview article at http://oss.software.ibm.com/icu/docs/papers/gb18030.html

markus
Received on Monday, 12 November 2001 20:46:48 GMT