RE: Locale/default encoding table

From: Phillips, Addison <addison@amazon.com>
Date: Wed, 14 Oct 2009 10:18:49 -0400
To: Henri Sivonen <hsivonen@iki.fi>, Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
CC: Ian Hickson <ian@hixie.ch>, Geoffrey Sneddon <gsneddon@opera.com>, HTML WG <public-html@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Message-ID: <C7A5719F1E562149BA9171F58BEE2CA41298281928@EX-IAD6-B.ant.amazon.com>
> I rather suspect that UTF-8 isn't the best default for any locale,
> since real UTF-8 content is unlikely to rely on the last defaulting
> step for decoding. I don't know why some Firefox localizations
> default to UTF-8.

Why do you assume that UTF-8 pages are better labeled than pages in other encodings? Experience suggests otherwise :-).

UTF-8 is positively detectable, and several of us (Mark Davis and I, at least) have suggested making UTF-8 auto-detection a requirement. Even so, unless chardet is used, nothing makes unannounced UTF-8 work any better than any other unannounced encoding.
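
To illustrate what "positively detectable" means here, the following is a minimal Python sketch, not the HTML5 sniffing algorithm and not what any browser's chardet does. The function name `sniff_encoding` and the `locale_default` parameter are hypothetical, introduced only for this example. It relies on the fact that non-ASCII text in a legacy single-byte or multi-byte encoding very rarely happens to form valid UTF-8 byte sequences, so a strict decode that succeeds on non-ASCII input is strong evidence of UTF-8.

```python
def sniff_encoding(data: bytes, locale_default: str = "windows-1252") -> str:
    """Guess the encoding of unlabeled content.

    Returns "utf-8" only when the bytes contain at least one non-ASCII
    byte and the whole input decodes as strict UTF-8. Otherwise returns
    the locale default (an assumption supplied by the caller).
    """
    # Pure ASCII decodes the same way under UTF-8 and under most legacy
    # encodings, so it gives no evidence either way.
    if data.isascii():
        return locale_default

    try:
        # Strict decoding rejects malformed sequences instead of
        # silently replacing them.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return locale_default

    return "utf-8"
```

For example, `sniff_encoding("Привет".encode("utf-8"))` yields `"utf-8"`, while the same word encoded as windows-1251 fails the strict decode and falls through to whatever locale default the caller passes in. A real implementation would also have to deal with truncated input and with very short samples, which this sketch ignores.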

The I18N WG pointed out that for many developing languages and locales, the legacy encodings are fragmented and frequently font-based, making UTF-8 a better default choice. This is not the case for a relatively well-known language such as Belarusian or Welsh, but it is the case for many minority and developing world languages.


Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.

Received on Wednesday, 14 October 2009 14:19:22 UTC