Re: Encoding in the HTML/HTTP header

From: Jon Hanna <jon@hackcraft.net>
Date: Wed, 07 Sep 2005 11:05:48 +0100
Message-ID: <431EBB7C.6000508@hackcraft.net>
To: www-international@w3.org

Martin Duerst wrote:
> 
> Two possibilities I can imagine:
> 
> - The document is XML-based, the browser recognizes this, and
>   then uses the UTF-8 default for XML documents.
> - The browser analyses the byte sequences in the document and
>   heuristically detects that the document looks like UTF-8.
>   The chances for detecting UTF-8 correctly go up very quickly
>   even with only very few non-ASCII characters.

And they go up massively if the stream begins with a BOM (though using a 
BOM with UTF-8 has other issues).
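
To make the two signals above concrete, here is a minimal Python sketch of such a check. It is only an illustration under stated assumptions, not what any particular browser does: the function name sniff_utf8 and its return labels are hypothetical, and a strict decode stands in for whatever heuristic a real user agent applies.

```python
import codecs


def sniff_utf8(data: bytes) -> str:
    """Hypothetical helper: classify a byte stream as 'bom', 'utf-8',
    'ascii', or 'other'. The labels are illustrative assumptions."""
    # A leading EF BB BF is the UTF-8 encoded BOM, a very strong signal.
    if data.startswith(codecs.BOM_UTF8):
        return "bom"
    try:
        # Strict decoding rejects malformed sequences, so text in a legacy
        # 8-bit encoding rarely passes once it has non-ASCII bytes.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "other"
    # Pure ASCII is valid UTF-8 but gives no evidence either way.
    if all(b < 0x80 for b in data):
        return "ascii"
    return "utf-8"
```

For example, "café" encoded as UTF-8 is reported as "utf-8", while the same word encoded as ISO-8859-1 fails the strict decode and is reported as "other", which is why even a few non-ASCII characters are enough to tell the two apart.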
Received on Wednesday, 7 September 2005 10:02:52 GMT
