Re: q about gb 2312/gbk from Paul Hoffman / IMC on 2001-08-23 (ietf-charsets@w3.org from July to September 2001)

From: Paul Hoffman / IMC <phoffman@imc.org>
Date: Thu, 23 Aug 2001 11:52:56 -0700
To: Harald Tveit Alvestrand <harald@alvestrand.no>, "Craig R. Cummings" <Craig.Cummings@oracle.com>, Markus Scherer <markus.scherer@jtcsv.com>, charsets <ietf-charsets@iana.org>
Message-id: <p0510030eb7ab002ce9f1@[165.227.249.20]>

At 3:28 PM +0200 8/23/01, Harald Tveit Alvestrand wrote:
>query:
>are all these charset names you have seen used "in the wild" where 
>MIME charset names should be used, or are they charsets that you 
>know about which are used in some context, and you think there 
>should be registered names for them?
>
>this will have most influence on the "intended usage" section....

One really can't prove a negative about what is found in the 
wild, but here is a data point. Across all the mail archives that IMC 
and VPNC keep, these are the charsets given explicitly in Content-Type 
lines, each with a count of how many times it appears:

big5:   20
default:   3
euc-kr:   40
gb2312:   61
iso-2022-jp:   303
iso-2022-kr:   9
iso-8859-1:   4817
iso-8859-15:   3
iso-8859-2:   39
iso-8859-7:   8
iso-8859-8:   4
iso-8859-9:   3
koi8-r:   142
ks_c_5601-1987:   1
standard:   2
unknown-8bit:   62
us-acsii:   4
us-ascii:   51491
utf-16be:   1
utf-7:   2
utf-8:   73
windows-1251:   12
windows-1252:   11
windows-1255:   2
windows-1257:   4
x-roman8:   1
x-unicode-2-0-utf-7:   1
x-unknown:   25
x-user-defined:   5
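
For anyone who wants to produce a comparable tally from their own archives, here is a minimal sketch in Python. It is not the script used for the counts above; the function name `count_charsets` and the sample lines are hypothetical. It assumes each Content-Type header sits on a single line, so a charset parameter on a folded continuation line would be missed.

```python
import re
from collections import Counter

# Matches charset=value or charset="value" anywhere in a header line.
CHARSET_RE = re.compile(r'charset\s*=\s*"?([^\s";]+)"?', re.IGNORECASE)


def count_charsets(lines):
    """Tally explicit charset parameters found on Content-Type lines.

    Names are lowercased but otherwise counted as written, so a
    misspelled label is tallied separately from the correct one.
    """
    counts = Counter()
    for line in lines:
        # Only look at Content-Type headers (assumed unfolded).
        if not line.lower().startswith("content-type:"):
            continue
        match = CHARSET_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


# Hypothetical sample input, not taken from any real archive.
sample = [
    "Content-Type: text/plain; charset=us-ascii",
    'Content-Type: text/plain; charset="ISO-8859-1"',
    "Content-Type: text/plain; charset=US-ASCII; format=flowed",
    "Content-Type: text/html",
    "Subject: charset=utf-8 is not a header parameter here",
]

for name, n in sorted(count_charsets(sample).items()):
    print(f"{name}:   {n}")
```

Because labels are only lowercased and never normalized, unregistered or misspelled names show up as their own entries, which is what makes this kind of tally useful as a view of what is actually in the wild.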

--Paul Hoffman, Director
--Internet Mail Consortium

Received on Thursday, 23 August 2001 15:44:54 UTC