I18N-ISSUE-438 (BUG24336): Encoding names should match what people actually call them [encoding] from Internationalization Working Group Issue Tracker on 2015-03-30 (public-i18n-core@w3.org from January to March 2015)

From: Internationalization Working Group Issue Tracker <sysbot+tracker@w3.org>
Date: Mon, 30 Mar 2015 13:25:35 +0000
To: public-i18n-core@w3.org
Message-Id: <E1YcZh9-000F2l-Oe@deneb.w3.org>

I18N-ISSUE-438 (BUG24336): Encoding names should match what people actually call them [encoding]

http://www.w3.org/International/track/issues/438

Raised by: Richard Ishida
On product: encoding

https://www.w3.org/Bugs/Public/show_bug.cgi?id=24336

This issue tracks the bug listed above and was created as part of the WG CR process.

---

http://gsnedders.html5.org/web-encoding-names/results.html shows what
document.characterSet returns in current versions of browsers. Notably, Firefox
and Chrome both return the uppercased names for many of these. (IE returns them
all lowercase except "GB18030"; ZombieOpera returns them all lowercase.)

Googling these encoding names makes it clear that almost everyone writes
"UTF-8", "ISO-8859-n", etc. (uppercased). As there is currently no interop
here, and the proposed behaviour matches Firefox/Chrome, it would seem better
to just give the encodings the names that are in common usage.

As such, I propose to change the names to the following (thereby changing case
only):

 - UTF-8
 - IBM866
 - ISO-8859-n
 - ISO-8859-8-I
 - KOI8-R
 - KOI8-U
 - HZ-GB-2312
 - Big5
 - EUC-JP
 - ISO-2022-JP
 - Shift_JIS
 - EUC-KR
 - UTF-16BE
 - UTF-16LE
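
As an illustration only, the case-only change proposed above could be sketched
as a case-insensitive lookup. The function name proposed_name and the table
names are hypothetical and not part of any specification. Treating
"ISO-8859-n" as "ISO-8859-" followed by digits is an assumption made for this
sketch; the issue does not list the individual values of n.

```python
import re

# Proposed names, copied from the list above.
# "ISO-8859-n" is a placeholder and is handled by a pattern below.
PROPOSED_NAMES = [
    "UTF-8", "IBM866", "ISO-8859-8-I", "KOI8-R", "KOI8-U", "HZ-GB-2312",
    "Big5", "EUC-JP", "ISO-2022-JP", "Shift_JIS", "EUC-KR",
    "UTF-16BE", "UTF-16LE",
]

# Lowercased name -> proposed casing.
_BY_LOWERCASE = {name.lower(): name for name in PROPOSED_NAMES}

# Assumption: "n" in "ISO-8859-n" is one or more decimal digits.
_ISO_8859_N = re.compile(r"iso-8859-\d+")


def proposed_name(name):
    """Return the proposed casing for an encoding name.

    Names that are not in the proposed list are returned unchanged.
    """
    key = name.lower()
    if key in _BY_LOWERCASE:
        return _BY_LOWERCASE[key]
    if _ISO_8859_N.fullmatch(key):
        return key.upper()
    return name
```

With this sketch, an all-lowercase name such as "shift_jis" maps to
"Shift_JIS", while a name outside the list above is left as it was given.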

Received on Monday, 30 March 2015 13:25:36 UTC