- From: <bugzilla@jessica.w3.org>
- Date: Tue, 07 Jun 2011 11:53:07 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=12897

--- Comment #8 from Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> 2011-06-07 11:53:05 UTC ---
(In reply to comment #7)
> I believe you are misreading the XML 1.0 spec. It says that in the HTTP case,
> RFC 3023 applies but for anyone specifying a new case, they recommend giving
> XML itself precedence. However, since the RFC applies in the HTTP case, in the
> HTTP case, the charset parameter on the HTTP level is authoritative.

(1) It is already great if we agree about the interpretation whenever HTTP is *not* used!

(2) In that regard, HTML5 tends to talk about "the higher protocol" and not specifically about HTTP.

(3) It is in the power of the HTML5 spec to specify how XHTML5 and HTML5 documents should be interpreted. Because:

    a) the HTML5 effort (including "sister projects") appears to be redefining/refining the HTTP specs as well as HTML itself.
    b) XML 1.0 defers it: "the preferred method of handling conflict should be specified as part of the higher-level protocol used to deliver XML"
    c) XML 1.0 defines a recommended rule (which it probably would like to see in HTTP as well): "If an XML entity is in a file, the Byte-Order Mark and encoding declaration are used (if present) to determine the character encoding."

But apart from what XML says, we must also look at interoperability, and at the effects of Opera's and Mozilla's reading of the specifications.

    I) In Mozilla's bugzilla there are several reports about how to handle the BOM gibberish letters whenever the BOM is ignored in favor of an external protocol.
    II) Opera has implemented a very strange behaviour where it sometimes eats the BOM gibberish, so that the page does not go into quirks mode, whereas sometimes it does not eat the BOM gibberish, leading to quirks mode. See my tests: http://malform.no/testing/html5/bom/

Et cetera: Yellow Screen of Death, IE/Webkit, wrong resulting encoding.
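To make the two readings concrete, here is a minimal Python sketch of the precedence question. It is only an illustration under stated assumptions, not what any browser or spec actually implements: the names `sniff_bom`, `pick_encoding`, `decode` and the `bom_wins` flag are hypothetical, the declaration regex only handles ASCII-compatible input, and the final UTF-8 fallback is the XML default. With `bom_wins=True` the BOM overrides the HTTP charset; with `bom_wins=False` the HTTP charset is treated as authoritative and the BOM bytes are left in the stream, which is where the "BOM gibberish" in (I) and (II) comes from.

```python
import codecs
import re

# BOM signatures. UTF-32 LE must be tested before UTF-16 LE,
# because its BOM (FF FE 00 00) starts with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

# Simplified match for encoding="..." in an XML declaration (ASCII-compatible input only).
XML_DECL = re.compile(
    rb'^<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_bom(data):
    """Return (encoding, bom_length), or (None, 0) if there is no BOM."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def pick_encoding(data, http_charset=None, bom_wins=True):
    """Return (encoding, number_of_leading_bytes_to_skip).

    bom_wins=True  : hypothetical "BOM overrides the higher protocol" rule.
    bom_wins=False : the HTTP charset parameter is authoritative.
    """
    bom_enc, bom_len = sniff_bom(data)
    if bom_wins and bom_enc:
        return bom_enc, bom_len
    if http_charset:
        # The BOM, if any, is not stripped: it will be decoded as text.
        return http_charset.lower(), 0
    if bom_enc:
        return bom_enc, bom_len
    match = XML_DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower(), 0
    return "utf-8", 0  # XML default when nothing else is declared


def decode(data, http_charset=None, bom_wins=True):
    enc, skip = pick_encoding(data, http_charset, bom_wins)
    return data[skip:].decode(enc)
```

For example, take a UTF-8 page with a BOM that is served with `charset=iso-8859-1`. Calling `decode(page, "iso-8859-1", bom_wins=False)` yields text that begins with the three characters U+00EF U+00BB U+00BF before `<!DOCTYPE html>`, so the doctype is no longer the first thing in the document. Calling `decode(page, "iso-8859-1")` strips the BOM and the text begins with the doctype.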
I don't know if I misread Julian, but I'll also quote a message to Adam in 2009: [*]

]]
> The algorithm tolerates leading white space, but not leading BOMs.

Is there a particular reason why the BOM is not tolerated, given
<http://www.w3.org/TR/REC-xml/#sec-guessing>?

[ snipping in Julian's message ]

Let's ignore "correctly" for a second --

[ snipping ]
]]

[*] http://lists.w3.org/Archives/Public/public-html/2009Nov/0579

-- 
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
Received on Tuesday, 7 June 2011 11:53:09 UTC