
a few problems in O-charset-lang.html

From: Jungshik Shin <jshin@i18nl10n.com>
Date: Fri, 30 Jul 2004 11:48:17 +0900
Message-ID: <4109B6F1.1060103@i18nl10n.com>
To: www-international@w3.org, public-i18n-geo@w3.org

A recent email on Windows-31J led me to take a look at

http://www.w3.org/International/O-charset-lang.html

There are a few problems with the document.

It lists a 7-year-old statistic (probably taken from a not-so-good 
sample even then) on the frequency of character encodings used on the 
web. The web and the internet have changed a lot since 1997, and I'm 
afraid the statistic gives some people the misleading impression that 
Windows-1252 can cover the vast majority of web pages. It'd be nice to 
replace that stat with a recent one. If it's not easy to find a new 
statistic, I think either that part should be removed or a prominent 
disclaimer should be added.

Another problem is that it uses 'kr' (the country code for Republic of 
Korea/South Korea) in place of 'ko' (the language code for Korean). 
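
To illustrate the kind of mix-up, here is a minimal Python sketch that 
maps a few country codes to the language codes that were probably 
intended. The table COUNTRY_TO_LANGUAGE and the function 
suggest_language_code are hypothetical names made up for this example, 
and the table is deliberately tiny, not a complete registry.

```python
# Hypothetical lookup table: ISO 3166-1 country codes that are sometimes
# used by mistake where an ISO 639-1 language code was intended.
COUNTRY_TO_LANGUAGE = {
    "kr": "ko",  # Republic of Korea -> Korean
    "jp": "ja",  # Japan -> Japanese
    "cn": "zh",  # China -> Chinese
}


def suggest_language_code(tag: str) -> str:
    """Return the likely intended tag when the primary subtag is a known country code."""
    # Split off the primary subtag; anything after the first hyphen is kept as-is.
    primary, sep, rest = tag.partition("-")
    fixed = COUNTRY_TO_LANGUAGE.get(primary.lower(), primary)
    return fixed + sep + rest


print(suggest_language_code("kr"))     # ko
print(suggest_language_code("zh-TW"))  # zh-TW
```

Tags whose primary subtag is not in the table, such as zh-TW, are 
returned unchanged.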

I also found that Chinese (both zh-TW and zh-CN) is not listed. It's a 
partial list, but omitting Chinese still seems a bit strange.

Jungshik
Received on Thursday, 29 July 2004 22:48:36 GMT
