- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Thu, 20 Aug 2009 10:27:54 +0300
- To: "Phillips, Addison" <addison@amazon.com>
- Cc: Maciej Stachowiak <mjs@apple.com>, "public-html@w3.org" <public-html@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
On Aug 20, 2009, at 10:22, Phillips, Addison wrote:

>>> I think the world has changed significantly. In the past, setting a
>>> default of UTF-8 in your browser produced mainly bad results. But,
>>> at least according to some measures [1], UTF-8 is rapidly becoming
>>> the most reasonable default encoding on the Web.
>> [...]
>>> [1] http://googleblog.blogspot.com/2008/05/moving-to-unicode-51.html
>>
>> This shows an uptake in UTF-8, but it proves nothing without data on
>> how much is labeled and how much unlabeled. Uptake in labeled UTF-8 is
>> awesome but doesn't affect what makes sense as the default processing
>> for unlabeled data.
>
> Ah.... but this data, I'm told, is based on the encoding *after
> detection* by Google's crawler, not on the declaration.

But does it also exclude pages that have encoding labels? Data about
the frequency of users hitting unlabeled pages in particular encodings
is the interesting thing here.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Thursday, 20 August 2009 07:28:39 UTC