Should the UTF-8 BOM trump overriding via HTTP or by users?

I'm interested in www-international's input on bug 12897, which has 
been filed against HTML5. [1]

That bug says that the UTF-8 BOM should trump "attempts" to override 
the document encoding, whether via HTTP or by the user. Failing to 
ignore such "attempts" will bring the page into quirks mode (for HTML) 
or trigger the yellow screen of death (for XML).
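
To make the proposed precedence concrete, here is a minimal sketch in 
Python. The function name, the parameter names and the windows-1252 
fallback are my own assumptions for illustration; they are not taken 
from the bug or from any browser's code.

```python
import codecs


def pick_encoding(body, http_charset=None, user_override=None):
    """Hypothetical helper: choose an encoding, letting a UTF-8 BOM win."""
    # A UTF-8 BOM (bytes EF BB BF) at the very start trumps everything else.
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8"
    # Without a BOM, a user override and then the HTTP charset apply.
    if user_override:
        return user_override
    if http_charset:
        return http_charset
    # Assumed default for this sketch only.
    return "windows-1252"
```

With this ordering, a page that starts with the BOM is decoded as 
UTF-8 even if HTTP says "iso-8859-1" or the user picks something else 
from the encoding menu.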

Per my reading of XML 1.0, in the presence of external encoding info 
which conflicts with the internal encoding info, as provided by either 
the BOM or the encoding declaration, the parser should, quote, "In the 
interests of interoperability", adhere to the BOM or the XML encoding 
declaration. [2]

I have explained my reading of XML 1.0 in a Mozilla bug. [3] (I don't 
know what the XML working group meant, so it is "just" my reading.)

The crux of the matter is that IE and WebKit behave as bug 12897 
describes, whereas Firefox and Opera do not. IE and WebKit even 
*prevent* the user from changing the encoding whenever the page is 
UTF-8 and has a BOM. If they did not, they would allow users to put 
the page into quirks mode and/or trigger the yellow screen of death. 
The IE/WebKit behaviour thus makes sense.

I also have a test page where the different behaviours can be tested, 
and which I recommend that you try, even if you disagree with my 
reading of the spec and/or of the facts. [4]

[1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=12897

[2] http://www.w3.org/TR/xml/#sec-guessing-with-ext-info

[3] https://bugzilla.mozilla.org/show_bug.cgi?id=238694#c9

[4] http://malform.no/testing/html5/bom/

-- 
Leif Halvard Sillil

Received on Tuesday, 7 June 2011 03:26:47 UTC