- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Mon, 22 Oct 2012 14:42:54 +0300
- To: www-style@w3.org
I started looking at changing Gecko to give precedence to the BOM for text/css. I noticed further problems.

First of all, it appears that Gecko supports reading @charset that is encoded as BOMless UTF-16. In that case, it makes no sense for the stylesheet to declare an encoding other than the UTF-16 variant that matches the endianness of the 0x00 bytes intertwined in the @charset rule. However, Gecko seems to obey the declared encoding regardless of what is declared.

Shockingly, this behavior seems to be what CSS 2.1 calls for, even though the behavior doesn't really make sense. Looking at http://www.w3.org/TR/CSS21/syndata.html#charset , it supports UTF-32 (weird endianness permutations even), EBCDIC and GSM 03.38 byte patterns. (Have all those *really* been tested to have two interoperable implementations for CSS 2.1?)

Additionally, CSS3 Syntax doesn't appear to mention the inheritance of the encoding from the referring document in the absence of other encoding information.

Please make the following changes to text/css (in addition to making the BOM take the highest precedence):

* Please prohibit authors from using and implementations from supporting encodings that are not in the Encoding Standard. (http://encoding.spec.whatwg.org/) If normatively referencing the Encoding Standard is politically or procedurally infeasible, please at least prohibit implementations from supporting non-ASCII-compatible encodings other than variants of UTF-16. (See http://www.w3.org/TR/html5/infrastructure.html#ascii-compatible-character-encoding for a definition in the W3C space.) UTF-32, UTF-7, BOCU-1, SCSU, variants of EBCDIC and GSM 03.38 should all be banned from being supported by CSS implementations and from being used by CSS authors.

* If there is no BOM, no @charset, no HTTP-level charset and no charset attribute on the linking element, and the encoding of the referring document or style sheet is ASCII-compatible, please define that the encoding is inherited from the referrer. If the encoding of the referrer is UTF-16, please define that the inherited encoding is UTF-8.

* Please make the encoding declared using @charset have no effect unless the string "@charset" is represented as its ASCII bytes.

* If it is determined that supporting BOMless UTF-16 that has @charset is needed for Web compatibility, please base the sniffing on the 0x00 bytes intertwined in "@charset" and not on whatever follows "@charset". (Even better if support for BOMless UTF-16 can be dropped.)

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
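For illustration only, the precedence requested in the bullets above could be sketched roughly as follows in Python. This is a minimal sketch and not Gecko's code or spec text: the function name sniff_css_encoding and its parameters are hypothetical, the ordering of the HTTP-level charset ahead of @charset follows CSS 2.1 rather than anything stated in this message, the final UTF-8 default is likewise an assumption, and checking labels against the Encoding Standard is left out.

```python
import re

# BOMs take the highest precedence; only UTF-8 and UTF-16 are recognised.
BOMS = [
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
]

# @charset only counts when spelled in plain ASCII bytes at the very start.
CHARSET_RULE = re.compile(rb'^@charset "([\x20\x21\x23-\x7e]+)";')


def sniff_css_encoding(data, http_charset=None, link_charset=None,
                       referrer_encoding=None):
    """Return a lowercase encoding label for the style sheet bytes `data`."""
    # 1. A BOM wins over everything else.
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    # 2. HTTP-level charset (assumed ordering, taken from CSS 2.1).
    if http_charset:
        return http_charset.lower()
    # 3. @charset, ignored unless "@charset" is represented as ASCII bytes.
    match = CHARSET_RULE.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # 4. charset attribute on the linking element.
    if link_charset:
        return link_charset.lower()
    # 5. Inherit from the referrer; a UTF-16 referrer maps to UTF-8.
    if referrer_encoding:
        ref = referrer_encoding.lower()
        return "utf-8" if ref.startswith("utf-16") else ref
    # 6. Assumed default when nothing else is known.
    return "utf-8"
```

With this sketch, a BOMless UTF-16 style sheet that starts with @charset fails the ASCII-bytes check in step 3, so the declared encoding has no effect and the result comes from the later steps.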
Received on Monday, 22 October 2012 11:43:23 UTC