- From: <bugzilla@jessica.w3.org>
- Date: Fri, 06 Jul 2012 04:32:37 +0000
- To: public-html-bugzilla@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=15359

--- Comment #13 from Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> 2012-07-06 04:32:35 UTC ---
(In reply to comment #11)
> http://203.59.75.251/Bug15359
>
> A simple testcase has been done, and [latest release versions of] all major
> browsers currently fail XML compliance due to this proposed handling of the BOM
> (some non-browser XML processors get this right, though).
Their behaviour *would have been* correct if we changed XML to say this:

]]
In the absence of information provided by an external transport protocol
(e.g. HTTP or MIME) <INS>or a byte order mark</INS>, it is a fatal error for an
entity including an encoding declaration to be presented to the XML processor
in an encoding other than that named in the declaration,
[[

As both HTTP and the BOM are "external" to the markup, such a change would make sense.

> Now, my position is unreservedly that, for compatibility with XML, the BOM must
> not be specified as overriding all other considerations in all cases. [The
> proposal for this Bug]
There are many aspects of "compatibility with XML". The most important aspect is UTF-8 itself. The problem is that HTML defaults to Windows-1252, whereas XML defaults to UTF-8. This means that, occasionally, an HTML page can end up with an encoding - via the default or by manual overriding - that differs from the encoding the author intended.

The second important aspect of compatibility with XML is the fact that it is impossible to override the encoding of an XML document.

We can have both of these benefits in HTML too, if only one uses the BOM. This benefit, however, comes at the expense of the HTTP charset: the BOM must be allowed to override the HTTP charset. This is a price worth paying. Encodings are an evil. We should try to reduce their importance as much as possible.

> As for overriding the HTTP Content-Type parameter specifically, or user
> selection generally, my position is unchanged, for the reasons already given.
I don't understand your reasons. You are CONTRA the BOM overriding the HTTP charset, but PRO the user being able to override the BOM. I see no benefit in that standpoint. I only see pessimism about the need for users to override encodings.

NOTE: One reason that the BOM should override HTTP is that the BOM is likely to be more correct. (Plus, WebKit and IE already behave like that.) If all browsers implement IE's and WebKit's behaviour, the encoding errors should not occur, and thus the user will have no need to override the encoding.

> In particular, the spec. should remain silent on the subject of users
> configuring their user agent to apply certain encodings to certain documents.
> How this may impact XML (in terms of whether this would be valid in a
> particular case) is unrelated to how it should impact (X)HTML5; to whatever
> degree that it is already specified in the XML spec., leave it at that.
>
> Sometimes, users simply have to debug misdetected/misspecified encodings; the
> fact that I've just demonstrated a new encoding-related misbehavior is proof of
> that.
You have documented a discrepancy between what browsers do and what XML specifies. You have not documented that what the browsers do leads to any problems. For instance, the test page you created above works just fine. You have not even expressed any wish to override their encoding. So, I'm sorry, but the page you made does not demonstrate what you claim it demonstrates.
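
The precedence argued for in this comment (a BOM, when present, wins over the HTTP charset) can be sketched roughly as below. This is only an illustration, not text from any spec or browser: the function names `sniff_bom` and `choose_encoding` are hypothetical, the ordering of HTTP charset ahead of the in-document declaration is an assumption, and the sketch recognises only UTF-8 and UTF-16 BOMs (UTF-32 is ignored).

```python
import codecs

# BOM signatures recognised by this sketch (UTF-8 and UTF-16 only;
# UTF-32 is deliberately left out to keep the example short).
BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def choose_encoding(data: bytes, http_charset=None, declared=None,
                    default="windows-1252"):
    """Hypothetical precedence: BOM, then HTTP charset, then the
    in-document declaration, then the default.

    The default shown is the HTML one mentioned in the comment;
    an XML processor would pass default="utf-8".
    """
    return sniff_bom(data) or http_charset or declared or default


# A UTF-8 BOM overrides a conflicting HTTP charset.
print(choose_encoding(b"\xef\xbb\xbf<p>hi</p>", http_charset="iso-8859-1"))
# Without a BOM, the HTTP charset is used as before.
print(choose_encoding(b"<p>hi</p>", http_charset="iso-8859-1"))
```

Under this ordering a page that starts with a BOM always gets the encoding its bytes announce, which is the "impossible to override" property the comment wants HTML to share with XML.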
Received on Friday, 6 July 2012 04:32:39 UTC