[css-syntax] Changes from CSS 2.1 and Selectors 3


The current ED has some changes that are not noted in a Changes section. 
I agree with most of these changes, but they should still be noted, and 
might need a WG resolution.

I believe this should make the Changes sections complete for the 
2013-05-24 ED.

Section 3.2: Character encoding detection has changed a lot since 2.1. 
Changes include:

* Don't try to detect @charset with ASCII-incompatible patterns of bytes
* Ignore @charset if it specifies an ASCII-incompatible encoding (which 
would make the @charset rule itself decode as garbage.)
* Don't "ignore style sheets in unknown encodings." (whatever that 
means, since even 2.1 specifies UTF-8 as a fallback.)
* Refer to the WHATWG Encoding standard rather than IANA, and as a 
consequence: (These might not need to be listed explicitly.)
   - A BOM takes precedence over anything else.
   - Drop support for UTF-32, EBCDIC, IBM1026 and GSM 03.38.
   - Disallow supporting anything beyond the specified finite list of 
encodings and labels.
   - Specify decoding error handling. (The default, which css-syntax 
does not override, is to insert U+FFFD REPLACEMENT CHARACTER and recover.)
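
To illustrate the order of precedence, here is a rough Python sketch, not 
the spec's actual algorithm. The function name, the protocol_label 
parameter, the fallback to UTF-8 and the small set of "ASCII-incompatible" 
labels are my assumptions for illustration, and Python codec names stand 
in for Encoding standard labels.

```python
from typing import Optional

# Assumption: a tiny stand-in for "ASCII-incompatible" labels.
ASCII_INCOMPATIBLE = {"utf-16", "utf-16be", "utf-16le"}

def sniff_encoding(data: bytes, protocol_label: Optional[str] = None) -> str:
    """Pick an encoding name for a style sheet's bytes (illustrative only)."""
    # A BOM takes precedence over anything else.
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be"
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le"
    # Assumption: a label from the protocol (e.g. HTTP) comes next.
    if protocol_label:
        return protocol_label
    # Only look for @charset as literal ASCII bytes, never as
    # ASCII-incompatible byte patterns.
    prefix = b'@charset "'
    if data.startswith(prefix):
        end = data.find(b'";', len(prefix))
        if end != -1:
            label = data[len(prefix):end].decode("ascii", "replace").lower()
            # An ASCII-incompatible label cannot be right, since the rule
            # itself was just read as ASCII: ignore it.
            if label not in ASCII_INCOMPATIBLE:
                return label
    # Assumption: UTF-8 as the final fallback.
    return "utf-8"

def decode_stylesheet(data: bytes, protocol_label: Optional[str] = None) -> str:
    # errors="replace" inserts U+FFFD for invalid byte sequences and recovers.
    return data.decode(sniff_encoding(data, protocol_label), errors="replace")
```

For example, sniff_encoding(b'@charset "utf-16";') falls through to the 
fallback instead of trying to decode the sheet as UTF-16.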

Any U+0000 character in the CSS source is replaced by U+FFFD. A 
hexadecimal escape that would decode as U+0000 (e.g. \00) instead decodes 
as U+FFFD. CSS 2.1 makes one or both of these two cases explicitly 
undefined, although it is unclear which.
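
A minimal Python sketch of both cases, assuming hypothetical helper names 
and a simplified escape pattern (one to six hex digits, optionally 
followed by a single space, tab or newline). Mapping out-of-range values 
to U+FFFD is my assumption so that chr() cannot raise; it is not part of 
the point above.

```python
import re

REPLACEMENT = "\ufffd"

def preprocess(source: str) -> str:
    """Replace every literal U+0000 in the source with U+FFFD."""
    return source.replace("\x00", REPLACEMENT)

# Simplified: 1-6 hex digits, then at most one whitespace character.
_HEX_ESCAPE = re.compile(r"\\([0-9a-fA-F]{1,6})[ \t\n]?")

def decode_hex_escapes(text: str) -> str:
    """Decode hexadecimal escapes; an escape for U+0000 yields U+FFFD."""
    def repl(match):
        code_point = int(match.group(1), 16)
        # Assumption: values above U+10FFFF are treated like zero here.
        if code_point == 0 or code_point > 0x10FFFF:
            return REPLACEMENT
        return chr(code_point)
    return _HEX_ESCAPE.sub(repl, text)
```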

I think this covers the same security concerns that led Mozilla to 
decode such escapes to U+0030 (the digit zero).

The definition of "non-ASCII" was changed from "U+00A0 and up" to the 
same as everyone outside of CSS, which is "U+0080 and up".
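
In code, the difference is just the lower bound. The function names here 
are made up for illustration:

```python
def is_non_ascii_css21(ch: str) -> bool:
    # CSS 2.1 definition: U+00A0 and up.
    return ord(ch) >= 0xA0

def is_non_ascii_syntax3(ch: str) -> bool:
    # css-syntax definition: U+0080 and up.
    return ord(ch) >= 0x80
```

The two only disagree on U+0080 to U+009F, e.g. U+0085.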

BAD_COMMENT tokens are now considered the same as normal comments, and 
neither is actually emitted by the tokenizer.

The <unicode-range> token is now more restrictive. Maybe it doesn't need 
to be, now that css3-fonts considers any "empty range" as invalid and 
drops the declaration. (Although I think we also still need a resolution 
on *that* change.)

EOF in the middle of a quoted string or url() is not an error anymore, 
and produces a <string> or <url> token rather than BAD_STRING or BAD_URI.

However such "bad" tokens were not actually errors in 2.1, according to 
the EOF error handling rule:


IMO this inconsistency in 2.1 is a bug that should be fixed, the way 
Syntax 3 does.
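
A rough Python sketch of the EOF handling for quoted strings described 
above. The function name and the tuple shape are hypothetical, and 
escapes are left out to keep it short. The unescaped-newline branch is 
only there for contrast with the EOF case.

```python
def consume_string(source: str, start: int):
    """Consume a quoted string whose opening quote is at source[start].

    Returns (token_type, value, next_index). Escapes are not handled.
    """
    quote = source[start]
    chars = []
    i = start + 1
    while i < len(source):
        ch = source[i]
        if ch == quote:
            # Properly closed string.
            return ("string", "".join(chars), i + 1)
        if ch == "\n":
            # An unescaped newline still gives a "bad" token.
            return ("bad-string", "", i)
        chars.append(ch)
        i += 1
    # EOF before the closing quote: a normal string token, not a bad one.
    return ("string", "".join(chars), i)
```

So consume_string('"abc', 0) yields a string token with the value "abc", 
just as if the closing quote had been there.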

Lists of declarations now also accept at-rules. I support this change; 
see http://lists.w3.org/Archives/Public/www-style/2013Apr/0506.html

<an+b> is less restrictive with whitespace than in Selectors 3.

Simon Sapin

Received on Monday, 27 May 2013 08:26:05 UTC