Re: [selectors4][css3-syntax] an+b corner case

From: Boris Zbarsky <bzbarsky@MIT.EDU>
Date: Mon, 30 Apr 2012 13:23:53 -0400
Message-ID: <4F9ECAA9.9050603@mit.edu>
To: www-style@w3.org

On 4/30/12 12:56 PM, Tab Atkins Jr. wrote:
> Or, wait.  Actually, this grammar is busted.  It is impossible to scan
> the above, since "1n" will already have been tokenized as a DIMENSION.
> Possibly browsers handle that implicitly, and *actually* match a
> grammar something like:
>
>    <dimension> [ [ + | - ] <integer> ]?
> | <keyword> [ [ + | - ] <integer> ]?
> | <integer>
> | odd
> | even
>
> ...with checks that the DIMENSION and KEYWORD are just 'n'.

Hey, our mails crossed.  ;)

And for what it's worth, it's even worse.  To quote the Gecko source 
comments from parsing this stuff:

  // The CSS tokenization doesn't handle :nth-child() containing - well:
  //   2n-1 is a dimension
  //   n-1 is an identifier

(And on a personal note, I'd like to point out that "-n-5" is likewise 
an IDENT token in the tokenizer.  Yay overloading of hyphen and minus!)
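To make the tokenization problem concrete, here is a minimal Python sketch of a CSS-like tokenizer. It is an illustration only, not Gecko's scanner: the patterns are simplified assumptions (ASCII only, no escapes, no decimals or signed numbers), and the names TOKEN_RE and tokenize are hypothetical.

```python
import re

# Simplified, assumed token patterns (ASCII only, no escapes).
# An identifier may start with '-' and may contain '-' and digits.
IDENT = r"-?[A-Za-z_][A-Za-z0-9_-]*"
NUM = r"[0-9]+"

# Order matters: DIMENSION (number + identifier) is tried before NUMBER.
TOKEN_RE = re.compile(
    rf"(?P<DIMENSION>{NUM}{IDENT})"
    rf"|(?P<NUMBER>{NUM})"
    rf"|(?P<IDENT>{IDENT})"
    r"|(?P<S>\s+)"
    r"|(?P<DELIM>.)"
)


def tokenize(text):
    """Return a list of (token type, token text) pairs for `text`."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


print(tokenize("2n-1"))  # one DIMENSION token: the '-1' is part of the unit
print(tokenize("n-1"))   # one IDENT token
print(tokenize("-n-5"))  # one IDENT token
print(tokenize("2n+1"))  # DIMENSION, DELIM, NUMBER: '+' cannot be in an ident
```

The contrast between "2n-1" and "2n+1" is the whole problem: '+' always ends the identifier, while '-' is swallowed by it.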

So in practice, what Gecko's parser does here, in pseudocode, is:

  t = nextToken();
  if (t.isIdent() or t.isDimension()) {
    if (t.ident starts with "n-") {
      push '-' and everything after it back into the scanner
      set t.ident to "n"
    } else if (t.ident starts with "-n-") {
      push second '-' and everything after it back into the scanner
      set t.ident to "-n"
    }
  }
  if (t.isIdent()) {
    // handle even and odd
    if (t.ident is "n") {
      set a = 1;
    } else if (t.ident is "-n") {
      set a = -1;
    }
  } else if (t.isDimension()) {
    // assert t.ident is "n"
    set a = t.integer;
  } else if (t.isInteger()) {
    set a = 0;
    set b = t.integer;
    return;
  } else {
    fail();
  }
  // Now ask the tokenizer for the next token and handle it
  // being a '+' or '-' symbol or whatnot to compute the 'b'

The end result is that Gecko allows comments or whitespace _after_ the 
'n', and when the 'n' is immediately followed by a '-' it behaves as if 
there had in fact been a comment or whitespace between them.  But no 
comments or whitespace are allowed before the 'n': that input ends up 
falling into the t.isInteger() case above.  This could probably be fixed 
if it were really necessary, but I question how necessary it is, since 
no UA actually does that.  Fixing the spec would be preferable.
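For anyone who wants to experiment, the pseudocode above can be turned into a runnable Python sketch roughly as follows. This is not the Gecko code: the Scanner class, the token patterns and the handling of the 'b' part (an optional '+' or '-' followed by an integer) are my assumptions, and a leading sign on a numeric 'a' is left out because the pseudocode does not cover it.

```python
import re

# Assumed, simplified token patterns (ASCII only, no escapes).
_IDENT = r"-?[A-Za-z_][A-Za-z0-9_-]*"
_TOKEN_RE = re.compile(
    rf"(?P<DIMENSION>[0-9]+{_IDENT})|(?P<NUMBER>[0-9]+)|(?P<IDENT>{_IDENT})"
    r"|(?P<S>\s+|/\*.*?\*/)|(?P<DELIM>.)",
    re.DOTALL,
)


class Scanner:
    """Hypothetical scanner that supports pushing text back into the input."""

    def __init__(self, text):
        self.rest = text

    def push_back(self, text):
        self.rest = text + self.rest

    def next_token(self):
        """Return (type, text), skipping whitespace/comments; EOF at the end."""
        while self.rest:
            m = _TOKEN_RE.match(self.rest)  # DELIM matches any character
            self.rest = self.rest[m.end():]
            if m.lastgroup != "S":
                return m.lastgroup, m.group()
        return "EOF", ""


def _finish(sc, a, b):
    if sc.next_token()[0] != "EOF":
        raise ValueError("unexpected trailing input")
    return a, b


def parse_an_plus_b(text):
    """Parse an :nth-child() argument into (a, b); ValueError on failure."""
    sc = Scanner(text)
    kind, tok = sc.next_token()
    if kind == "NUMBER":                      # the t.isInteger() case
        return _finish(sc, 0, int(tok))
    if kind == "IDENT":
        unit, coeff = tok.lower(), None
        if unit in ("odd", "even"):
            return _finish(sc, 2, 1 if unit == "odd" else 0)
    elif kind == "DIMENSION":
        digits = re.match(r"[0-9]+", tok).group()
        coeff, unit = int(digits), tok[len(digits):].lower()
    else:
        raise ValueError("expected ident, dimension or integer")
    # Split "n-..." / "-n-..." and push the tail back into the scanner.
    if unit.startswith("n-"):
        sc.push_back(unit[1:])
        unit = "n"
    elif unit.startswith("-n-"):
        sc.push_back(unit[2:])
        unit = "-n"
    if kind == "IDENT":
        if unit not in ("n", "-n"):
            raise ValueError("unexpected identifier")
        a = 1 if unit == "n" else -1
    else:
        if unit != "n":                       # the "assert" in the pseudocode
            raise ValueError("unexpected unit")
        a = coeff
    # Assumed 'b' handling: optional sign followed by an integer.
    kind, tok = sc.next_token()
    if kind == "EOF":
        return a, 0
    if kind != "DELIM" or tok not in "+-":
        raise ValueError("expected '+' or '-'")
    sign = -1 if tok == "-" else 1
    kind, tok = sc.next_token()
    if kind != "NUMBER":
        raise ValueError("expected integer after sign")
    return _finish(sc, a, sign * int(tok))
```

With this sketch, "2n-1", "2n - 1" and "2n/**/-1" all parse to the same pair, while "2 n+1" is rejected: the "2" lands in the integer branch and the remaining "n+1" is trailing input.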

> So, this is a Selectors bug, not a browser bug.  We can't blame
> browsers for failing to do an impossible thing.  ^_^

Thanks.  ;)

-Boris
Received on Monday, 30 April 2012 17:24:24 GMT