You can find Mark's presentation "Unicode at Google" on the right side of
http://macchiato.com/

I gathered the numbers for Mark. Mid-2006, the top 5 HTML META charsets in
Google's index were:

42.8% iso-8859-1
20.4% utf-8
8.12% gb2312
3.97% windows-1252
3.76% windows-1251

In 2001, that distribution was:

47.9% iso-8859-1
13.5% windows-1252
7.30% gb2312
5.47% shift_jis
4.65% utf-8

In 2001, 43.5% of the HTML documents had a META charset, while in 2006, that
percentage was 72%.

Erik

On 1/5/07, John O'Conner <John.Oconner@sun.com> wrote:
>
> iris garden wrote:
> > Hi
> >
> > I want to ask please about the Unicode (utf-8) distribution on the
> > Internet, i.e. any statistics that shows the percentage of websites
> > world-wide that uses Unicode compared to other types of encoding?
> >
> > Thanks
> > Iris
>
> I believe that Mark Davis (working with both the Unicode Consortium and
> Google) may have provided some of that information in his recent Unicode
> Conference session. You might want to find and look at that session's
> slide material.
>
> Regards,
> John O'Conner

Received on Saturday, 6 January 2007 04:53:15 GMT