- From: Mark Rogers via GitHub <sysbot+gh@w3.org>
- Date: Thu, 18 Jul 2019 21:03:44 +0000
- To: public-css-archive@w3.org
dd8 has just created a new issue for https://github.com/w3c/csswg-drafts:

== [css-syntax-3] Input stream processing can calculate wrong encoding ==

There's a difference between the encoding calculated by the css-syntax-3 spec and the one calculated by the CSS 2.1/2.2 spec, demonstrated by this file:

http://test.csswg.org/suites/css2.1/20110323/html4/support/at-charset-001.css

It's served as Content-Type: text/css; charset=shift_jis. It also starts with a Shift_JIS byte sequence that happens to match the UTF-8 BOM (great test case):

ef bb bf 2e e5 b9 b3 e5 92 8c 0d 0a 7b 0d 0a 20 |............{.. |

CSS 2.1/2.2 specifies that Content-Type wins over any BOM:

https://drafts.csswg.org/css2/syndata.html#charset

css-syntax-3 uses the 'Decode' algorithm and says the decode algorithm gives precedence to a byte order mark (BOM), and only uses the fallback when none is found:

https://drafts.csswg.org/css-syntax/#input-byte-stream

The CSS 2.x algorithm gets the correct encoding for the test file (Shift_JIS), but the CSS 3 algorithm gets the wrong encoding (UTF-8). Chrome and Firefox both seem to use the 2.x method of calculating the encoding for CSS.

FWIW I think the 'Decode' algorithm works well with HTML (and XML) because those documents are most likely to begin with `<!DOCTYPE`, `<!--comment`, `<html` or whitespace, so their first bytes can't accidentally match the BOM in an ASCII-compatible encoding. I think 'Decode' works less well for CSS, which can start with any non-ASCII code points as part of a CSS selector.

Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/4126 using your GitHub account
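
The difference in precedence described in this issue can be sketched in a few lines of Python. This is only an illustration, not the text of either spec: the function names `css2_encoding` and `syntax3_encoding` are hypothetical, the sketch considers only the Content-Type charset and the BOM, and it leaves out `@charset` and every other fallback step. The sample bytes are the 16 bytes from the hex dump in the issue.

```python
import codecs

# First 16 bytes of at-charset-001.css, copied from the hex dump in the issue.
SAMPLE = bytes.fromhex("efbbbf2ee5b9b3e5928c0d0a7b0d0a20")

# Byte order marks checked by BOM sniffing (UTF-8, UTF-16BE, UTF-16LE).
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16be"),
    (codecs.BOM_UTF16_LE, "utf-16le"),
]


def sniff_bom(data):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def css2_encoding(http_charset, data, default="utf-8"):
    """Simplified CSS 2.x ordering (assumption): Content-Type charset first, then BOM."""
    if http_charset:
        return http_charset.lower()
    return sniff_bom(data) or default


def syntax3_encoding(http_charset, data, default="utf-8"):
    """Simplified css-syntax-3 ordering (assumption): BOM first, charset only as fallback."""
    bom = sniff_bom(data)
    if bom:
        return bom
    return http_charset.lower() if http_charset else default


print(css2_encoding("shift_jis", SAMPLE))     # Content-Type wins
print(syntax3_encoding("shift_jis", SAMPLE))  # BOM-like prefix wins
```

Under these assumptions the first call reports `shift_jis` and the second reports `utf-8`, which is the mismatch the issue describes: the leading `ef bb bf` is treated as a UTF-8 BOM even though the server declared Shift_JIS.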
Received on Thursday, 18 July 2019 21:03:51 UTC