Re: Issue: CHARSET, character sets and charset

> > The following tokens are the initial preferred names to be used within
> > HTTP. This list includes those registered by RFC 1521 [7] -- the
> > US-ASCII [21] and ISO-8859 [22] character sets -- and other names
> > specifically recommended for use within MIME charset parameters.
> >   charset = "US-ASCII"
> >	| "ISO-8859-1" | "ISO-8859-2" | "ISO-8859-3"
> >	| "ISO-8859-4" | "ISO-8859-5" | "ISO-8859-6"
> >	| "ISO-8859-7" | "ISO-8859-8" | "ISO-8859-9"	
> >       | "ISO-2022-JP" | "ISO-2022-JP-2" | "ISO-2022-KR"
> >       | "UNICODE-1-1-UTF-8"
> [** These tokens don't seem to correspond to the tokens used by most
> of the multilingual servers in the world today!!! Why don't we fix
> them to match? EUC, Shift-JIS, Big5, GB...  It was my impression that
> we might get IANA to assign a canonical and get out of this buisness.
> Why can't we do this? **]

It looks like this is a list of charset values used in other RFCs.

I'd also note that nearly all of them are defined by ISO standards.

It seems like a adding pragmatic list of national and/or corporate encodings
in wide use, would be a good thing for interoperability, though
less nice for standardiztion purists.

I'm not sure who has the expertise to tackle this for a wide range
of languages though.

What's involved in the IANA registration now?

It might be better to split this off into other internationalization
document(s) so as to not get bogged down.

Received on Monday, 8 April 1996 18:08:02 UTC