
Re: charset list

From: Bob Jung <bobj@netscape.com>
Date: Tue, 21 Aug 2001 13:48:39 -0700
Message-ID: <3B82C927.8020306@netscape.com>
To: "A. Vine" <avine@eng.sun.com>
CC: Michael Gorelik <mgorelik@Novarra.com>, www-international@w3.org
A. Vine wrote:

>...
>

>Mozilla (may also apply to Netscape 6)
>http://www.mozilla.org/projects/intl/chardet.html
>
A better list comes straight from the source (isn't open source nice):
  
 http://lxr.mozilla.org/seamonkey/source/intl/uconv/src/charsetalias.properties#33

The above file has a long list of name-value pairs.  The names are the 
recognized charset values used by Mozilla and Netscape 6 -- these are 
what you are allowed to use in your HTML or XML charset specifications. 
 (The values are the internal names used by our Unicode converter.)
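As a rough illustration of how a name-value file like this can be used, here is a minimal Python sketch that parses properties-style lines and resolves a charset name case-insensitively. The function names and the sample entries are assumptions made for illustration; they are not copied from charsetalias.properties, and this is not a description of how Mozilla's converter performs the lookup.

```python
def parse_aliases(text):
    """Parse 'name=value' lines into a dict keyed by lowercased name."""
    aliases = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and '#' comments.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a name=value pair
        aliases[name.strip().lower()] = value.strip()
    return aliases


def resolve(aliases, charset):
    """Return the internal converter name for a charset, or None."""
    return aliases.get(charset.strip().lower())


# Illustrative entries only (assumed, not taken from the Mozilla file).
SAMPLE = """\
# alias=internal converter name
x-sjis=Shift_JIS
shift_jis=Shift_JIS
x-euc-jp=EUC-JP
"""

if __name__ == "__main__":
    table = parse_aliases(SAMPLE)
    print(resolve(table, "X-SJIS"))
    print(resolve(table, "no-such-charset"))
```

Lowercasing the key on both sides reflects the fact that charset names are matched without regard to case.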

But to decide which charset name to use in your server or in your 
content, you should refer to the IANA list for the preferred internet 
name, which is sometimes one of the aliases and not necessarily the 
officially registered name.

>Mail client generated names and mail server recognized names can also be useful,
>but there are way too many of them to list, and this info is usually not readily
>available.
>
Mail charsets are trickier.  While the core mail RFCs allow any charset 
encoding, there are other RFCs (e.g., for Japanese) and other internet 
conventions which encourage the use of certain charsets for certain 
languages to enhance interoperability.  When creating new email, 
Mozilla/Netscape restricts the user to sending in charsets that are in 
accordance with these conventions.  For the charsets that 
Mozilla/Netscape allows in sending email, look here:
  
http://lxr.mozilla.org/seamonkey/source/xpfe/browser/resources/locale/en-US/navigator.properties#29

 intl.charsetmenu.mailedit=ISO-8859-1, ISO-8859-15, armscii-8, ISO-8859-4, ISO-8859-14, ISO-8859-2, GB2312, Big5, KOI8-R, windows-1251, KOI8-U, ISO-8859-7, ISO-2022-JP, EUC-KR, ISO-8859-10, ISO-8859-3, TIS-620, ISO-8859-9, UTF-8, VISCII

You can also see this in the View menu of the mail compose window.

However, when reading email, Mozilla/Netscape is lenient and handles 
additional charsets.
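The "strict when sending" half of this can be sketched in a few lines of Python. The charset list is the intl.charsetmenu.mailedit value quoted above; the function names are hypothetical, and this only illustrates the idea of checking against the list, not Mozilla's actual code.

```python
# Value of intl.charsetmenu.mailedit as quoted in this message.
MAILEDIT_PREF = (
    "ISO-8859-1, ISO-8859-15, armscii-8, ISO-8859-4, ISO-8859-14, "
    "ISO-8859-2, GB2312, Big5, KOI8-R, windows-1251, KOI8-U, ISO-8859-7, "
    "ISO-2022-JP, EUC-KR, ISO-8859-10, ISO-8859-3, TIS-620, ISO-8859-9, "
    "UTF-8, VISCII"
)


def parse_charset_menu(value):
    """Split a comma-separated preference value into charset names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def can_send_in(charset, menu):
    """True if the charset is in the send list (case-insensitive)."""
    allowed = {name.lower() for name in menu}
    return charset.strip().lower() in allowed


if __name__ == "__main__":
    menu = parse_charset_menu(MAILEDIT_PREF)
    for candidate in ("utf-8", "ISO-2022-JP", "Shift_JIS"):
        print(candidate, can_send_in(candidate, menu))
```

Note that for Japanese the send list offers ISO-2022-JP but not Shift_JIS, which is the kind of per-language convention mentioned above.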

>...
>Michael Gorelik wrote:
>
>>I can see that lots of japanese pages use x-sjis, x-jis, x-euc-jp charset.
>>However, I don't see those defined in IANA registry???
>>
When Netscape first implemented support for many of these charsets over 
6 years ago, they had not been registered with IANA.  The "x-" prefix is 
the standard MIME mechanism for "experimental" values.
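If you need to cope with such names in your own code, one possible approach is to detect the "x-" prefix and map the names you know about to their registered equivalents. The Python sketch below is only that: the function names are hypothetical, and the two-entry table is an illustrative assumption covering x-sjis and x-euc-jp, not a complete or authoritative mapping (x-jis is deliberately left unmapped here).

```python
# Illustrative, assumed mapping from experimental names to registered ones.
EXPERIMENTAL_TO_REGISTERED = {
    "x-sjis": "Shift_JIS",
    "x-euc-jp": "EUC-JP",
}


def is_experimental(charset):
    """True if the charset name uses the MIME 'x-' experimental prefix."""
    return charset.strip().lower().startswith("x-")


def registered_name(charset):
    """Return the registered name for a known 'x-' charset, else None."""
    return EXPERIMENTAL_TO_REGISTERED.get(charset.strip().lower())


if __name__ == "__main__":
    for name in ("x-sjis", "x-euc-jp", "x-jis", "Shift_JIS"):
        print(name, is_experimental(name), registered_name(name))
```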

-bob
Received on Tuesday, 21 August 2001 16:46:28 GMT
