Re: Unicode distribution?

You can find Mark's presentation "Unicode at Google" on the right side
of http://macchiato.com/

I gathered the numbers for Mark. In mid-2006, the top 5 HTML META
charsets in Google's index were:

42.8%  iso-8859-1
20.4%  utf-8
8.12%  gb2312
3.97%  windows-1252
3.76%  windows-1251

In 2001, that distribution was:

47.9%  iso-8859-1
13.5%  windows-1252
7.30%  gb2312
5.47%  shift_jis
4.65%  utf-8
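
To make the shift between the two lists easier to see, here is a small
Python sketch that computes the change in percentage points for the
charsets that appear in both top-5 lists. The figures are copied from
the lists above; the names top5_2001, top5_2006 and point_changes are
only illustrative. shift_jis and windows-1251 each appear in just one
list, so no change can be computed for them from these numbers.

```python
# Percent of documents, copied from the two top-5 lists above.
top5_2001 = {
    "iso-8859-1": 47.9,
    "windows-1252": 13.5,
    "gb2312": 7.30,
    "shift_jis": 5.47,
    "utf-8": 4.65,
}
top5_2006 = {
    "iso-8859-1": 42.8,
    "utf-8": 20.4,
    "gb2312": 8.12,
    "windows-1252": 3.97,
    "windows-1251": 3.76,
}


def point_changes(before, after):
    """Percentage-point change for charsets present in both lists."""
    return {name: after[name] - before[name] for name in before if name in after}


delta = point_changes(top5_2001, top5_2006)

# Print the largest gain first.
for name, change in sorted(delta.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:14s} {change:+.2f} points")
```

By this arithmetic utf-8 gained about 15.75 points, while windows-1252
lost about 9.53.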

In 2001, 43.5% of the HTML documents had a META charset; by 2006,
that figure had risen to 72%.
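
For anyone who wants to take a similar measurement on their own set of
pages, here is a minimal Python sketch of tallying META charsets. It is
not the code used to produce the numbers above, just an illustration
using the standard library. The names meta_charset and charset_shares
are hypothetical, and the choice to compute shares relative to the
documents that declare a charset (rather than all documents) is an
assumption of this sketch.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Matches "charset=..." inside a Content-Type value.
CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([\w.:-]+)", re.IGNORECASE)


class MetaCharsetParser(HTMLParser):
    """Records the first charset declared in a META tag, if any."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attr = dict(attrs)
        # Short form: <meta charset="utf-8">
        if attr.get("charset"):
            self.charset = attr["charset"].strip().lower()
            return
        # Long form: <meta http-equiv="Content-Type" content="...; charset=...">
        if (attr.get("http-equiv") or "").lower() == "content-type":
            match = CHARSET_RE.search(attr.get("content") or "")
            if match:
                self.charset = match.group(1).lower()


def meta_charset(html):
    """Return the lowercased META charset of one document, or None."""
    parser = MetaCharsetParser()
    parser.feed(html)
    return parser.charset


def charset_shares(docs):
    """Return (percent of docs with a META charset, percent share per charset)."""
    found = [c for c in (meta_charset(d) for d in docs) if c]
    if not docs or not found:
        return 0.0, {}
    coverage = 100.0 * len(found) / len(docs)
    counts = Counter(found)
    shares = {name: 100.0 * n / len(found) for name, n in counts.items()}
    return coverage, shares


docs = [
    '<html><head><meta http-equiv="Content-Type" '
    'content="text/html; charset=ISO-8859-1"></head></html>',
    '<meta charset="utf-8">',
    "<html><body>no meta here</body></html>",
    '<META HTTP-EQUIV="content-type" CONTENT="text/html; charset=UTF-8">',
]
coverage, shares = charset_shares(docs)
print(coverage, shares)
```

Note that a META charset is only what the page declares; the sketch
makes no attempt to check it against the actual byte encoding.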

Erik

On 1/5/07, John O'Conner <John.Oconner@sun.com> wrote:
>
> iris garden wrote:
> > Hi
> >
> > I would like to ask about the distribution of Unicode (UTF-8) on the
> > Internet, i.e. are there any statistics that show the percentage of
> > websites world-wide that use Unicode compared to other encodings?
> >
> > Thanks
> > Iris
>
> I believe that Mark Davis (working with both the Unicode Consortium and
> Google) may have provided some of that information in his recent Unicode
> Conference session. You might want to find and look at that session's
> slide material.
>
> Regards,
> John O'Conner

Received on Saturday, 6 January 2007 04:53:15 UTC