- From: Andrei Polushin <polushin@gmail.com>
- Date: Tue, 18 Mar 2008 09:00:29 +0600
- To: www-style@w3.org
Bert Bos wrote:
> On Wednesday 12 March 2008 13:41, Andrei Polushin wrote:
>> 2008/3/12, fantasai wrote:
>>>
>>> I'll note that
>>>   3cm-2cm
>>> will be parsed as a single DIMENSION token
>>>
>> I propose changing the grammar around the {ident} as follows:
>> [...]
>> That is, the unit name cannot contain '-', unless that unit name
>> starts with either '-' or '_', as described by [1]
>
> That will indeed make "3n-1" parse as DIMEN(3n) + DELIM(-) + NUMBER(1),
> but parsers still have to check that the DIMEN is a whole number with
> an "n" or "N"; and "n-1" remains an IDENT. (Of course, "n-1" won't
> occur very often in practice. :-) )

Yes, I also realized that while reading the implementation notes at
https://bugzilla.mozilla.org/show_bug.cgi?id=75375#c35

> So I think there is no benefit for nth-child.

Yes and no. There is no direct benefit, but CSS expression syntax may
still evolve that way.

It looks like WebKit goes further, see [the excerpt of its grammar][2]:

  nth  (-?[0-9]*n[\+-][0-9]+)|(-?[0-9]*n)

That means that IDENT in the form (n|n-1|-n|-n-1) is not an IDENT, and
DIMENSION in the form (3n|3n-1|-3n|-3n-1) is not a DIMENSION. They are
NTH tokens there. It looks like a hack, and it may require attention in
other parts of the grammar (esp. class selectors like .-n), but it works.

> It also doesn't remove the need for a space after 'mod' in 'calc(10%
> mod-2em)', although it avoids many other possible user errors:
> calc(10em-2px).
>
> But it's a change to the core grammar, a very dangerous thing to do:

That's right. I agree, and now suggest making those changes *locally*,
to the expression grammar only.

The complete proposal is as follows:

1. The BASIC TOKENIZER remains the same, as defined by CSS21.

2. In CSS3, certain parts of the grammar may be locally parsed as
   expressions. Those parts must continue to be parseable by a
   CSS21-conformant parser. A CSS21 parser should be able to skip over
   expressions, treating them as the "any" production per CSS core
   syntax.

3. To parse expressions, a CSS3 expression-aware parser should behave
   as if it creates an EXPR-TOKENIZER on top of its BASIC TOKENIZER.
   The CSS3 parser uses that EXPR-TOKENIZER to pull expression tokens.

4. The EXPR-TOKENIZER consumes tokens provided by the BASIC TOKENIZER,
   splits them according to certain rules, and yields them to the CSS3
   expression-aware parser. The splitting rules are:

   4.1. {ident} may not contain a substring in which a MINUS SIGN is
        followed by a DECIMAL DIGIT. Such an {ident} is split around
        that MINUS SIGN symbol.

        Example:
          IDENT(abc-1em) => IDENT(abc) '-' DIMENSION(1em)

   4.2. {ident} may not contain a trailing MINUS SIGN. Such an {ident}
        is split just before that trailing MINUS SIGN.

        Example:
          IDENT(abc-) => IDENT(abc) '-'

   4.3. {ident} may not start with a MINUS SIGN, unless that {ident}
        contains a non-trailing MINUS SIGN, as described by [1]. Such
        an {ident} is split just after that starting MINUS SIGN.

        Example:
          IDENT(-abc)     => '-' IDENT(abc)
          IDENT(-abc-)    => '-' IDENT(abc) '-'
          IDENT(-abc-def) => IDENT(-abc-def)

   4.4. The splitting rules above apply equally to any {ident} that is
        part of an IDENT, DIMENSION or FUNCTION token.

   4.5. In addition, the {ident} that designates the measurement unit
        of a DIMENSION token may not contain a MINUS SIGN, unless that
        {ident} itself starts with a MINUS SIGN, as described by [1].
        Such an {ident} is split around that MINUS SIGN symbol.

        Example:
          DIMENSION(3cm-2cm)    => DIMENSION(3cm) '-' DIMENSION(2cm)
          DIMENSION(3-x-parsec) => DIMENSION(3-x-parsec)

The formal grammar of the EXPR-TOKENIZER is as follows (it cannot be
used directly, though):

  %
  alpha       [_a-z]|{nonascii}|{escape}
  alnum       [_a-z0-9]|{nonascii}|{escape}
  word        {alpha}{alnum}*
  phrase      {word}([-]{word})*
  prefixed    [-]{word}[-]{phrase}
  ident       {phrase}|{prefixed}
  unit        {word}|{prefixed}
  %
  {num}{ident}    {return DIMENSION;}
  {ident}         {return IDENT;}
  {ident}"("      {return FUNCTION;}
  %

Now I expect it to cover both the calc() and nth-child() syntax issues,
while being fully backward compatible with the CSS21 core syntax.

[1]: http://www.w3.org/TR/2007/CR-CSS21-20070719/syndata.html#vendor-keywords
[2]: http://trac.webkit.org/projects/webkit/browser/trunk/WebCore/css/tokenizer.flex?rev=30069#L26

-- 
Andrei Polushin
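The splitting rules 4.1 to 4.5 above can be tried out with a small
sketch. This is only an illustration, not part of the proposal as
posted: it re-scans the text of a single basic token with Python
regular expressions built from the word/phrase/prefixed definitions.
The names (expr_tokens, TOKEN, and so on) are made up for the example.
It assumes ASCII-only input ({nonascii} and {escape} are left out), a
simplified {num}, and it reads the DIMENSION rule as {num}{unit}, which
is what rule 4.5 describes.

```python
import re

# Building blocks from the grammar above (ASCII only; an assumption).
WORD = r"[_a-z][_a-z0-9]*"
PHRASE = rf"{WORD}(?:-{WORD})*"
PREFIXED = rf"-{WORD}-{PHRASE}"
IDENT = rf"(?:{PREFIXED}|{PHRASE})"
UNIT = rf"(?:{PREFIXED}|{WORD})"      # rule 4.5: no '-' unless prefixed
NUM = r"(?:[0-9]*\.[0-9]+|[0-9]+)"    # simplified {num}

# Alternatives are tried in order; anything else is a one-character DELIM.
TOKEN = re.compile(
    rf"(?P<DIMENSION>{NUM}{UNIT})"
    rf"|(?P<NUMBER>{NUM})"
    rf"|(?P<FUNCTION>{IDENT}\()"
    rf"|(?P<IDENT>{IDENT})"
    rf"|(?P<DELIM>.)",
    re.IGNORECASE | re.DOTALL,
)


def expr_tokens(text):
    """Split the text of one basic token into (kind, value) pairs."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(text)]


if __name__ == "__main__":
    for sample in ("abc-1em", "abc-", "-abc", "-abc-", "-abc-def",
                   "3cm-2cm", "3-x-parsec", "3n-1"):
        print(sample, "=>", expr_tokens(sample))
```

With these assumptions, "3cm-2cm" comes out as DIMENSION(3cm) '-'
DIMENSION(2cm), and "-abc-def" stays a single IDENT, matching the
examples in 4.3 and 4.5.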
Received on Tuesday, 18 March 2008 03:01:04 UTC