[whatwg] Charset sniffing from XML prolog from Boris Zbarsky on 2009-10-08 (public-whatwg-archive@w3.org from October 2009)

From: Boris Zbarsky <bzbarsky@MIT.EDU>
Date: Wed, 07 Oct 2009 22:10:33 -0400
Message-ID: <4ACD4A19.4040504@mit.edu>

On 10/7/09 9:52 PM, Kartikaya Gupta wrote:
> Anything else that might be affecting this?

In general, yes.  Charset info can also come from the HTTP cache, from
user bookmarks, and so on.

In this case, though, it's totally my fault: I just forgot that I had
the HTML5 parser turned on locally.  With that turned off, I do get
UTF-8, because of the code at
http://hg.mozilla.org/mozilla-central/file/603759afc77a/parser/htmlparser/src/nsParser.cpp#l2553
and following.  That code is just bogus, in my somewhat biased
opinion... ;)

-Boris

Received on Wednesday, 7 October 2009 19:10:33 UTC