- From: Andrew Cunningham <andrewc@vicnet.net.au>
- Date: Tue, 13 Oct 2009 00:12:39 +1000
- To: "Henri Sivonen" <hsivonen@iki.fi>
- Cc: "Andrew Cunningham" <andrewc@vicnet.net.au>, "Maciej Stachowiak" <mjs@apple.com>, "Ian Hickson" <ian@hixie.ch>, "Leif Halvard Silli" <xn--mlform-iua@xn--mlform-iua.no>, Mark Davis ☕ <mark@macchiato.com>, "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, "Phillips, Addison" <addison@amazon.com>, "Richard Ishida" <ishida@w3.org>, "public-html@w3.org" <public-html@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>, "Larry Masinter" <masinter@adobe.com>
Thanks Henri, greatly appreciated. Useful data.

It will be interesting to see what the trend will be in the future as the localisation effort builds up steam. It does raise the question of what happens with legacy-encoded data in those languages, though. With Vietnamese, I'm still seeing bloggers using VNI, so some content is still being produced in that encoding even today.

I'm not surprised about Russian, Japanese and Ukrainian: legacy data may be in a few different encodings, so heuristics make sense.

I'm also not surprised by the Indian localisations; the default had to be either UTF-8 or Windows-1252. I guess Windows-1252 is a logical choice, since Firefox doesn't really support legacy encodings for Indian languages, and a good percentage of legacy content in Indian languages misidentifies itself as ISO-8859-1 or Windows-1252 and relies on styling.

On Mon, October 12, 2009 23:49, Henri Sivonen wrote:
>
> The Vietnamese localization of Firefox defaults to UTF-8 and no
> heuristic detector:
> http://mxr.mozilla.org/l10n-mozilla1.9.1/source/vi/toolkit/chrome/global/intl.properties
>
> For comparison, Japanese, Russian and Ukrainian have a heuristic
> detector turned on by default:
> http://mxr.mozilla.org/l10n-mozilla1.9.1/source/ja/toolkit/chrome/global/intl.properties
> http://mxr.mozilla.org/l10n-mozilla1.9.1/source/ru/toolkit/chrome/global/intl.properties
> http://mxr.mozilla.org/l10n-mozilla1.9.1/source/uk/toolkit/chrome/global/intl.properties
>
> (Korean, Simplified Chinese and Traditional Chinese don't, BTW.)
>
> Query of interest:
> http://mxr.mozilla.org/l10n-mozilla1.9.1/find?string=global%2Fintl.properties&tree=l10n-mozilla1.9.1&hint=
>
> In various Indian locales, the language itself does not use the Latin
> alphabet but the default is still Windows-1252:
> http://mxr.mozilla.org/l10n-mozilla1.9.1/source/hi-IN/toolkit/chrome/global/intl.properties
> http://mxr.mozilla.org/l10n-mozilla1.9.1/source/bn-IN/toolkit/chrome/global/intl.properties
> http://mxr.mozilla.org/l10n-mozilla1.9.1/source/gu-IN/toolkit/chrome/global/intl.properties
> http://mxr.mozilla.org/l10n-mozilla1.9.1/source/pa-IN/toolkit/chrome/global/intl.properties

--
Andrew Cunningham
Research and Development Coordinator
Vicnet
State Library of Victoria
Australia

andrewc@vicnet.net.au
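For anyone wanting to repeat this comparison across locales, here is a rough sketch in Python of how the per-locale settings could be tabulated. It assumes the relevant keys in each intl.properties file are named intl.charset.default and intl.charset.detector; that naming is an assumption, not something confirmed in this thread. The sample file contents are hypothetical stand-ins that only mirror what is described above (UTF-8 with no detector for vi, a detector switched on for ru, Windows-1252 for hi-IN), and "example-detector" is a made-up placeholder value.

```python
# Assumed key names; check them against the real intl.properties files.
DEFAULT_KEY = "intl.charset.default"
DETECTOR_KEY = "intl.charset.detector"


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def summarize(locale, text):
    """Report the default charset and detector, using None for unset or empty."""
    settings = parse_properties(text)
    return {
        "locale": locale,
        "default": settings.get(DEFAULT_KEY) or None,
        "detector": settings.get(DETECTOR_KEY) or None,
    }


# Hypothetical sample contents, not copied from the real files.
SAMPLES = {
    "vi": "intl.charset.default=UTF-8\nintl.charset.detector=\n",
    "ru": "# detector on by default\nintl.charset.detector=example-detector\n",
    "hi-IN": "intl.charset.default=windows-1252\n",
}

if __name__ == "__main__":
    for locale, text in SAMPLES.items():
        print(summarize(locale, text))
```

In practice the text for each locale would be fetched from the URLs quoted above rather than hard-coded.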
Received on Monday, 12 October 2009 14:13:19 UTC