Re: Accept-Charset support from Larry Masinter on 1996-12-06 (www-international@w3.org from October to December 1996)

From: Larry Masinter <masinter@parc.xerox.com>
Date: Thu, 5 Dec 1996 21:59:11 PST
To: christw@microsoft.com
CC: garym@softshore.com.au, www-international@w3.org, Alan_Barrett/DUB/Lotus.LOTUSINT@crd.lotus.com
Message-Id: <96Dec5.225911pdt."135"@palimpsest.parc.xerox.com>

# A "good" browser would need to send all or most of the charsets listed
# in ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets

Oh, certainly not. I suppose we should not have avoided the political
difficulties in the HTTP/1.1 spec, but clearly most of the charset names
are completely inappropriate. For a while, there was a list of
'charset' tokens in HTTP, but it seemed like it was a more general
IANA/Charset issue than an HTTP one.

HTTP/1.0 gave a list:

     charset = "US-ASCII"
             | "ISO-8859-1" | "ISO-8859-2" | "ISO-8859-3"
             | "ISO-8859-4" | "ISO-8859-5" | "ISO-8859-6"
             | "ISO-8859-7" | "ISO-8859-8" | "ISO-8859-9"
             | "ISO-2022-JP" | "ISO-2022-JP-2" | "ISO-2022-KR"
             | "UNICODE-1-1" | "UNICODE-1-1-UTF-7" | "UNICODE-1-1-UTF-8"
             | token
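
To make the shape of that production concrete, here is a minimal sketch
of how a receiver might sort a charset name into "explicitly listed",
"some other token", or "not a legal token at all". The function names
(is_token, classify_charset) and the three result labels are
hypothetical, chosen only for illustration; the token rule assumed here
is the HTTP/1.0 one (any US-ASCII character except controls and
tspecials), and charset names are compared case-insensitively.

```python
# Names spelled out in the HTTP/1.0 charset production quoted above.
HTTP10_CHARSETS = frozenset({
    "US-ASCII",
    "ISO-8859-1", "ISO-8859-2", "ISO-8859-3",
    "ISO-8859-4", "ISO-8859-5", "ISO-8859-6",
    "ISO-8859-7", "ISO-8859-8", "ISO-8859-9",
    "ISO-2022-JP", "ISO-2022-JP-2", "ISO-2022-KR",
    "UNICODE-1-1", "UNICODE-1-1-UTF-7", "UNICODE-1-1-UTF-8",
})

# tspecials from the HTTP/1.0 grammar, including SP and HT.
TSPECIALS = frozenset('()<>@,;:\\"/[]?={} \t')


def is_token(text):
    """True if text is one or more visible US-ASCII chars, none a tspecial."""
    return bool(text) and all(
        32 < ord(ch) < 127 and ch not in TSPECIALS for ch in text
    )


def classify_charset(name):
    """Return 'listed', 'token', or 'invalid' (hypothetical labels)."""
    if not is_token(name):
        return "invalid"
    if name.upper() in HTTP10_CHARSETS:
        return "listed"
    return "token"
```

Under these assumptions, classify_charset("iso-8859-1") gives "listed",
while a name such as "KOI8-R" falls through to the catch-all "token"
alternative.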

and the appendix of HTTP/1.1 includes a list of 'preferred names':

       "US-ASCII"
       | "ISO-8859-1" | "ISO-8859-2" | "ISO-8859-3"
       | "ISO-8859-4" | "ISO-8859-5" | "ISO-8859-6"
       | "ISO-8859-7" | "ISO-8859-8" | "ISO-8859-9"
       | "ISO-2022-JP" | "ISO-2022-JP-2" | "ISO-2022-KR"
       | "SHIFT_JIS" | "EUC-KR" | "GB2312" | "BIG5" | "KOI8-R"

       "EUC-JP" for "EXTENDED_UNIX_CODE_PACKED_FORMAT_FOR_JAPANESE"
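
As a rough illustration of the point that a browser need not send the
whole IANA list, the following sketch builds an Accept-Charset value
from a handful of user preferences, mapping the long registered name to
its preferred form first. Everything named here (PREFERRED_NAMES,
ALIASES, preferred_name, build_accept_charset) is hypothetical and not
taken from any spec or implementation; the only assumptions are that
names are case-insensitive and that a quality value is written as
";q=" followed by a number between 0 and 1.

```python
# Preferred names from the HTTP/1.1 appendix list quoted above.
PREFERRED_NAMES = frozenset({
    "US-ASCII",
    "ISO-8859-1", "ISO-8859-2", "ISO-8859-3",
    "ISO-8859-4", "ISO-8859-5", "ISO-8859-6",
    "ISO-8859-7", "ISO-8859-8", "ISO-8859-9",
    "ISO-2022-JP", "ISO-2022-JP-2", "ISO-2022-KR",
    "SHIFT_JIS", "EUC-KR", "GB2312", "BIG5", "KOI8-R",
    "EUC-JP",
})

# The one alias given in the text: long registered name -> preferred name.
ALIASES = {"EXTENDED_UNIX_CODE_PACKED_FORMAT_FOR_JAPANESE": "EUC-JP"}


def preferred_name(name):
    """Upper-case the name and replace a known long form with its alias."""
    upper = name.upper()
    return ALIASES.get(upper, upper)


def format_q(q):
    """Render a quality value with at most three decimals, trimmed."""
    return f"{q:.3f}".rstrip("0").rstrip(".")


def build_accept_charset(prefs):
    """Build a header value from (name, q) pairs, in the order given."""
    parts = []
    for name, q in prefs:
        if not 0.0 <= q <= 1.0:
            raise ValueError(f"q out of range for {name!r}: {q}")
        label = preferred_name(name)
        # q=1 is the default, so it is left off.
        parts.append(label if q == 1.0 else f"{label};q={format_q(q)}")
    return ", ".join(parts)
```

For example, build_accept_charset([("iso-8859-1", 1.0),
("Extended_UNIX_Code_Packed_Format_for_Japanese", 0.8)]) yields
"ISO-8859-1, EUC-JP;q=0.8": two entries rather than the full registry.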

and I'm guessing the right place to fix this up for good is in the
final edition of:

ftp://ftp.isi.edu/internet-drafts/draft-freed-charset-reg-01.txt

Received on Friday, 6 December 1996 02:00:08 UTC