- From: Etan Wexler <ewexler@stickdog.com>
- Date: Thu, 14 Nov 2002 06:40:12 -0500
- To: www-style@w3.org, Bert Bos <bert@w3.org>, Tantek Çelik <tantekc@microsoft.com>, Ian Hickson <ian@hixie.ch>, Håkon Wium Lie <howcome@opera.com>
Following are substantive comments on section 4, "CSS 2.1 syntax and basic data types" (<http://www.w3.org/TR/2002/WD-CSS21-20020802/selector.html>), of the Cascading Style Sheets level 2.1 draft (<http://www.w3.org/TR/2002/WD-CSS21-20020802>).

4.1.1 Tokenization

"All levels of CSS -- level 1, level 2, and any future levels -- use the same core syntax."

Well, level 1 restricts itself to the ISO-8859-1 character repertoire. That's a significant difference in syntax. And then there are issues of newline in strings and so on.

"nmstart [_a-zA-Z]|{nonascii}|{escape}"

To return to a favorite subject, I would still like to repeal the so-called correction that allowed unescaped underscore in identifiers. This change, which had some notable opposition, breaks backward compatibility with CSS1 and with any implementation of CSS2 predating the change. Unescaped underscores, promoted on the basis that XML names may contain underscores, are no more necessary than unescaped full stops, which XML names may likewise contain. We're doing fine without the latter; why do we need the former? Is "\_" truly an unbearable burden over and beyond "_"? I assert that CSS2.1 can make a break from this change and can return to the requirement of escaping underscores in identifiers. And I'm pleading for such a break. We can pretend that the erratum was an April Fools' prank.

"unicode \\[0-9a-f]{1,6}[ \n\r\t\f]?"

To accommodate CRLF line breaks, change to "unicode \\[0-9a-f]{1,6}(\r\n|[ \n\r\t\f])?".

"COMMENT tokens do not occur in the grammar (to keep it readable)"

Readability is a canard. Observe how painlessly COMMENT tokens are explicitly added:

  stylesheet  : [ CDO | CDC | b | statement ]+;
  b           : [ S | COMMENT ]*;
  statement   : ruleset | at-rule;
  at-rule     : ATKEYWORD b any* [ block | ';' b ];
  block       : '{' b [ any | block | ATKEYWORD b | ';' b ]* '}' b;
  ruleset     : selector? '{' b declaration? [ ';' b declaration? ]* '}' b;
  selector    : any+;
  declaration : property ':' b value;
  property    : IDENT b;
  value       : [ any | block | ATKEYWORD b ]+;
  any         : [ IDENT | NUMBER | PERCENTAGE | DIMENSION | STRING
                | DELIM | URI | HASH | UNICODE-RANGE | INCLUDES
                | FUNCTION any* ')' | DASHMATCH | '(' any* ')' | '[' any* ']' ] b;

I added a single short production and replaced "S*" with "b". Is that truly difficult to read?

In fact, adding comments explicitly lets us leave them out of certain places in a level-specific grammar, meaning in turn that the core grammar can be even smaller. So if we were to have the following productions in the CSS2.1 grammar ...

  percentage : [ '+' | '-' ]? NUMBER '%';
  dimension  : [ '+' | '-' ]? NUMBER IDENT;
  includes   : '~' '=';
  function   : IDENT '(' any* ')';
  dashmatch  : '|' '=';

... we could shorten the 'any' production as follows:

  any : [ IDENT | NUMBER | STRING | URI | DELIM | HASH | UNICODE-RANGE | '(' any* ')' | '[' any* ']' ] b;

4.1.2 Keywords

"Other illegal examples:" ... "font-family: "serif";"

But that's not illegal. Rather, that assigns the font family named "serif". It's probably not what misguided authors intend, but it's not illegal.

4.1.3 Characters and case

"In CSS 2.1, identifiers (including element names, classes, and IDs in selectors) can contain only the characters [A-Za-z0-9] and ISO 10646 characters 161 and higher, plus the hyphen (-) and the underscore (_); they cannot start with a hyphen or a digit."

Change to "In CSS 2.1, identifiers (including element names, classes, and IDs in selectors) can contain, unescaped, only the characters [A-Za-z0-9] and ISO 10646 characters 161 and higher, plus the hyphen-minus (-) and the underscore (_); they cannot start with an unescaped hyphen-minus or an unescaped digit [0-9]."

"They can also contain" ... "any ISO 10646 character as a numeric code"

This is false. ISO 10646 may have characters assigned to codepoints up to U+7FFFFFFF, whereas CSS2.1 deals in codepoints U+FFFFFF and below.

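As a rough illustration, and not part of the draft, the escape pattern proposed under 4.1.1 can be tried out in Python. The names UNICODE_ESCAPE, decode_escapes, and MAX_ESCAPABLE are invented for this sketch. Accepting upper-case hexadecimal digits is an assumption (the draft's pattern lists only [0-9a-f]), and leaving values above U+10FFFF untouched is a limitation of Python strings rather than anything the draft says.

```python
import re

# Proposed pattern: backslash, 1-6 hex digits, then optionally either a CRLF
# pair or one single whitespace character. CRLF is listed first so that the
# pair is swallowed as a unit. Upper-case hex digits are an assumption.
UNICODE_ESCAPE = re.compile(r"\\([0-9a-fA-F]{1,6})(?:\r\n|[ \n\r\t\f])?")

# Largest code point that six hexadecimal digits can express.
MAX_ESCAPABLE = int("f" * 6, 16)


def decode_escapes(text: str) -> str:
    """Replace each hexadecimal escape in `text` with the character it names."""

    def replace(match: re.Match) -> str:
        code_point = int(match.group(1), 16)
        # Python strings end at U+10FFFF, so larger values are left as written.
        if code_point > 0x10FFFF:
            return match.group(0)
        return chr(code_point)

    return UNICODE_ESCAPE.sub(replace, text)
```

With this sketch, decode_escapes("\\26 B") and decode_escapes("\\26\r\nB") both give "&B", while decode_escapes("\\26  B") keeps the second space. MAX_ESCAPABLE is 0xFFFFFF, well short of U+7FFFFFFF.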
"Note that Unicode is code-by-code equivalent to ISO 10646" This is true for now. When ISO 10646 ventures beyond Plane 16, however, this will cease to be true. "If a digit or letter follows the hexadecimal number, the end of the number needs to be made clear." Considering "digit" as any character in Unicode general category "Nd" and "letter" as any character in Unicode general categories "L*", this is false. Even restricting the terms to their meaning within the ASCII repertoire, this is false. The identifier \53top unambiguously corresponds to "Stop" because "t", while a letter, is not a hexadecimal digit. Change the wording to "If a character in the range [0-9a-zA-Z] follows the hexadecimal number, the end of the number needs to be made clear." "with a space (or other whitespace character): "\26 B" ("&B"). In this case, user agents should treat a "CR/LF" pair (13/10) as a single whitespace character." Change the first part to "with a space, with another whitespace character, or with the sequence of 'Carriage Return' (13) followed by 'Line Feed' (10):". Eliminate the second sentence. "Only one whitespace character is ignored after a hexadecimal escape." This is false according to the preceding passage. Change to "Only one whitespace character or the sequence of 'Carriage Return' (13) followed by 'Line Feed' (10) is ignored after a hexadecimal escape." 4.1.4 Statements "In this specification, the expressions "immediately before" or "immediately after" mean with no intervening whitespace or comments." The "Statements" section is an odd place to put this explanation. 4.1.8 Declarations and properties "A declaration is either empty or consists of a property, followed by a colon (:), followed by a value." Add "name" after "property". "A property is an identifier." Add "name" after "property". I must militate against the conflation of "property" and "property name". The name is a CSS identifier, a series of characters. 
The property is an object attached to an element or to a pseudo-element and consists of a name and a value. "The second declaration on the second line contains an undefined property 'font-vendor'." Change "contains" to "is of". 4.1.9 Comments 'Comments begin with the characters "/*" and end with the characters "*/".' Add 'Comments may contain any characters but must not contain the sequence "*/".' 4.2 Rules for handling parsing errors Missing here are rules for handling entities that do not match even the core syntax. Should a CSS processor ignore such entities? Should a CSS processor accept the part of the entity before the first core error? Answers are necessary for interoperability. Also missing are rules for handling an at-rule where the keyword is recognized but the following structure is not. What, for example, should or must a CSS1 processor do with media-specific '@import' at-rules? "User agents must ignore a declaration with an unknown property." Change "with" to "of". "keywords cannot be quoted in CSS 2.1" The addition of "2.1" is, frankly, frightening. The implication is that, in some future level of CSS, keywords may be quoted. If the implication is not desired, eliminate "2.1". If the implication is desired, then we have a topic deserving an entire and separate thread of discussion. 4.3.1 Integers and real numbers 'Both integers and real numbers may be preceded by a "-" or "+" to indicate the sign.' I have always assumed that whitespace may not intervene. Is my assumption correct? 4.3.2 Lengths "After the '0' length, the unit identifier is optional." Reading this strictly, I conclude that '0.0' is not a valid <length>. Change to "After the '0' length or equal, the unit identifier is optional." "Pixel units are relative to the resolution" Change "Pixel" to "'Px'". "the user agent should rescale pixel values" Change "pixel" to "'px'". "in: inches -- 1 inch is equal to 2.54 centimeters" Does the Working Group really wish to limit precision? 
"In cases where the specified length cannot be supported, user agents must approximate it in the actual value." Change "specified" to "computed". 4.3.5 Colors "Values outside the device gamut should be clipped" Add "when assigning actual values". 4.3.6 Strings "the following two selectors are exactly the same" Change "exactly the same" to "entirely equivalent". 4.4 CSS document representation "An HTTP "charset" parameter in a "Content-Type" field." What about a MIME parameter for use in mail? 4.4.1 Referring to characters not represented in a character encoding "If most of a document requires escaping" Change "document" to "style sheet".
Received on Thursday, 14 November 2002 07:11:41 UTC