This HTML-compatible XHTML document is encoded as UTF-8 and also carries a character encoding signature in the form of a Byte Order Mark (BOM). In contrast, the HTTP Content-Type: header sent by the Web server claims (such is at least the plan ...) that the encoding of this document is ISO-8859-1.
For situations where two layers specify different encodings, XML 1.0 appendix F.2 recommends:
In the interests of interoperability, however, the following rule is recommended.
- If an XML entity is in a file, the Byte-Order Mark and encoding declaration are used (if present) to determine the character encoding.
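The precedence that this page argues for (BOM first, HTTP label second) can be sketched roughly as follows. This is a minimal Python illustration, not any browser's actual algorithm: the function name `pick_encoding`, its parameters and the fallback value are hypothetical, and the encoding declaration mentioned in the rule is left out for brevity.

```python
import codecs

# Known BOM signatures. UTF-32 comes first because the UTF-32 LE BOM
# starts with the same two bytes as the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def pick_encoding(body, http_charset=None, fallback="utf-8"):
    """Return the encoding to use for `body` (hypothetical helper).

    A BOM, when present, wins over the charset from the HTTP
    Content-Type: header. The fallback is an assumption of this sketch.
    """
    for bom, name in BOMS:
        if body.startswith(bom):
            return name
    return http_charset or fallback


# A document like this one: UTF-8 with BOM, but labelled ISO-8859-1 over HTTP.
page = codecs.BOM_UTF8 + b"<!DOCTYPE html>"
chosen = pick_encoding(page, http_charset="iso-8859-1")
```

Under these assumptions `chosen` is "utf-8" for this page, while the same bytes without the BOM would be decoded using the HTTP label.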
For HTML, at least Internet Explorer 8 and WebKit (Safari, Chrome) behave as recommended for XML 1.0: they respect the BOM more than they respect the HTTP Content-Type: header. They also respect the BOM more than a user's possible attempt to override the encoding, and for WebKit this goes for both XML and HTML. (I have not tested Internet Explorer version 9.)
For XML, Opera and Firefox do not respect the BOM as much as the XML specification recommends. As a consequence, when faced with an XML document with erroneous encoding info in the HTTP Content-Type: header, Firefox and Opera display a draconian error message. For instance, this document has an HTTP Content-Type: header which says "ISO-8859-1", which - when this label is respected - leads the parser to see some illegal characters before the DOCTYPE. In contrast, WebKit browsers, which follow the XML recommendation, do not display any draconian error message.
For HTML, again, the misinterpretation by Opera and Firefox leads them to see 3 illegal characters before the DOCTYPE, which in turn sends them into quirks mode - this is an important reason why user interaction and HTTP should be ignored whenever there is a BOM.
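Where the 3 characters come from can be shown with a small Python sketch. The sample document bytes and the variable names are made up for illustration; the point is that the three bytes of the UTF-8 BOM turn into three separate characters when they are decoded as ISO-8859-1.

```python
import codecs

# Hypothetical stand-in for this page: UTF-8 BOM followed by the DOCTYPE.
doc = codecs.BOM_UTF8 + b"<!DOCTYPE html>"

# Trusting the HTTP label: each BOM byte (EF BB BF) becomes its own character.
as_latin1 = doc.decode("iso-8859-1")
stray = as_latin1[: as_latin1.index("<")]

# Trusting the BOM: the "utf-8-sig" codec consumes the signature.
as_utf8 = doc.decode("utf-8-sig")

print(len(stray))                       # number of characters before the DOCTYPE
print(as_utf8.startswith("<!DOCTYPE"))  # the DOCTYPE comes first when the BOM is honoured
```

Here `stray` holds the characters U+00EF, U+00BB and U+00BF, which is the text a parser sees before the DOCTYPE when it follows the ISO-8859-1 label.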