Re: Charset usage data

From: Philip Taylor <pjt47@cam.ac.uk>
Date: Fri, 07 Mar 2008 15:33:19 +0000
Message-ID: <47D1603F.1060401@cam.ac.uk>
To: HTML WG <public-html@w3.org>

Philip Taylor wrote:
> http://philip.html5.org/data/charsets.html
> 
> [...] 
> 
> The encoding sniffing algorithm works significantly better with 1024 
> bytes (finds 92% of charsets) than with 512 (finds 82%).

I've now updated that page to use a version of the encoding sniffing 
algorithm that should match the current spec, and added 
http://philip.html5.org/data/encoding-detection.svg to show how its 
effectiveness varies with the number of bytes of content the 
algorithm examines.
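
As a rough illustration of what "number of bytes used" means here, the 
sketch below looks for a charset declaration in only the first N bytes 
of a page. It is a simplified regex stand-in, not the spec's prescan 
algorithm and not the code behind the page above; the names 
sniff_charset and detection_rate are hypothetical.

```python
import re

# Assumption: a simplified pattern for a charset declaration inside a <meta> tag.
# The spec's real prescan is a byte-level state machine; this only approximates it.
META_CHARSET = re.compile(
    rb"<meta[^>]*?charset\s*=\s*[\"']?([A-Za-z0-9_:.-]+)",
    re.IGNORECASE,
)


def sniff_charset(content: bytes, limit: int = 1024):
    """Return the charset declared within the first `limit` bytes, or None."""
    match = META_CHARSET.search(content[:limit])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()


def detection_rate(pages, limit: int) -> float:
    """Fraction of pages whose declaration is found within `limit` bytes."""
    if not pages:
        return 0.0
    found = sum(1 for page in pages if sniff_charset(page, limit) is not None)
    return found / len(pages)
```

Calling detection_rate over the same set of pages with limit=512 and 
then limit=1024 gives the kind of comparison quoted above: a 
declaration that starts after the cutoff is simply not seen.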

-- 
Philip Taylor
pjt47@cam.ac.uk
Received on Friday, 7 March 2008 15:33:31 GMT
