[selectors4][css3-syntax] should document.querySelector("html /* ") throw? (and some editorial comments)

(Bcc public-webapps as this case is only observable via the Selectors API)

I found an edge case where the relevant specs don't provide a consistent
answer. Consider,

    document.querySelector("html /* ")

should it throw or not?
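
To compare engines on this case, a minimal probe could look like the sketch below. The helper name `selectorThrows` is hypothetical, and `root` is assumed to be any object that exposes a `querySelector` method (for example `document` in a browser).

```javascript
// Hypothetical helper: reports whether root.querySelector(selector) throws.
// `root` is assumed to be a Document, Element, or DocumentFragment.
function selectorThrows(root, selector) {
  try {
    root.querySelector(selector);
    return false; // the engine accepted the selector string
  } catch (e) {
    return true; // the engine rejected the selector string
  }
}

// In a browser console:
//   selectorThrows(document, "html /* ")
```

The result depends on the engine, which is exactly the inconsistency described in this message.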

The Selectors API calls the following production "selector string"[1]:

    dom_selectors_group
    : S* [ selectors_group ] S*
    ;

and uses this term almost everywhere except in this paragraph:

  # Selectors are evaluated against a given element in the context of
  # the entire DOM tree in which the element is located. If the given
  # group of selectors is invalid ([SELECTORS4], section 3.7), the
  # implementation must raise a SYNTAX_ERR exception
  # ([DOM-LEVEL-3-CORE], section 1.4).

which references Selectors 4 (by the way, the link still points to
Selectors 3 and should be updated).

If we only look at the formal syntax, then an open comment, whether or
not it leads to a BAD_COMMENT token, is not part of dom_selectors_group,
and hence the example should throw. But it's not clear whether the
following error-handling rule from CSS2.1[2]:

  # Unexpected end of style sheet. User agents must close all open
  # constructs (for example: blocks, parentheses, brackets, rules,
  # strings, and comments) at the end of the style sheet. For example:

applies there, and I have read the claim that if the spec prose and the
formal grammar disagree, the prose trumps the grammar. Also, the
definition of invalid selectors[3] in Selectors 4 has a line:

  # a selector containing an invalid simple selector, an invalid
  # combinator or an invalid token is invalid.

and it's also not clear if BAD_COMMENT counts as an invalid token.
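
The condition in question, a selector string that ends inside an open comment, can be detected with a short scan. This is only a sketch under simplifying assumptions: the helper name `endsInOpenComment` is hypothetical, and the scan only distinguishes quoted strings from comments rather than implementing the full tokenizer.

```javascript
// Hypothetical helper: returns true if `text` ends inside an open /* comment.
// Simplification: quoted strings are skipped so that "/*" inside a string
// is not mistaken for a comment; backslash escapes in strings are honored.
function endsInOpenComment(text) {
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    if (ch === '"' || ch === "'") {
      // Skip to the matching quote (or to the end of input).
      i++;
      while (i < text.length && text[i] !== ch) {
        if (text[i] === "\\") i++; // skip the escaped character
        i++;
      }
      i++; // step past the closing quote
    } else if (ch === "/" && text[i + 1] === "*") {
      const end = text.indexOf("*/", i + 2);
      if (end === -1) return true; // comment never closed
      i = end + 2;
    } else {
      i++;
    }
  }
  return false;
}
```

Under the "close all open constructs" reading, an input for which this returns true would be treated as if the comment had been closed at the end of the string; under the formal-grammar reading, it would be rejected.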

IE9, FF9.0.1 and Opera12alpha don't throw in this case, but Chromium 18
and Safari 5.1 do throw. I don't have an opinion as to which is more
correct, as this example is contrived anyway, but I think the spec should
say something about it, even if it only says the behavior is "undefined".

== Other editorial comments on Selectors 4 (all syntax related) ==

In [3],

  # User agents must observe the rules for handling parsing errors:
  # * a simple selector containing an undeclared namespace prefix is invalid
  # * a selector containing an invalid simple selector, an invalid
  #   combinator or an invalid token is invalid.
  # * a selector list containing an invalid selector is invalid.

This doesn't seem to have anything to do with error handling; these are
definitions of "invalid X". Some comments here:

1. If this section depends on the formal grammar then I think it should
say so. For example, "invalid combinator" isn't defined anywhere in the
spec except in the "combinator" production in the formal grammar.
2. There seems to be no definition of "invalid token". Is this term equivalent
to "unexpected token" or "a token which is not allowed at the current
parsing point."?

In [4]

  # The argument to :nth-child() must match the grammar below, where
  # INTEGER matches the token [0-9]+ and the rest of the tokenization
  # is given by the Lexical scanner in section 10.2:

I am not an expert in parsers, but shouldn't this be

  | The argument to :nth-child() must match the grammar below, where
  | INTEGER is a NUMBER token that also matches [0-9]+ and the rest
  | of the tokenization is given by the Lexical scanner in section 10.2:

?
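
To illustrate the distinction, here is a small sketch. It assumes the `num` macro from the Selectors lexical scanner is `[0-9]+|[0-9]*\.[0-9]+`; the helper name `isIntegerToken` is hypothetical.

```javascript
// Assumed from the Selectors lexical scanner: num is [0-9]+|[0-9]*\.[0-9]+
const NUMBER = /^(?:[0-9]+|[0-9]*\.[0-9]+)$/;
const DIGITS = /^[0-9]+$/;

// Hypothetical helper: an INTEGER is a NUMBER token whose text is all digits.
function isIntegerToken(tokenText) {
  return NUMBER.test(tokenText) && DIGITS.test(tokenText);
}
```

With this reading, the text `2` is both a NUMBER and an INTEGER, while `1.5` is a NUMBER but not an INTEGER.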

The formal syntax section[5] says it needs to be updated, but I'll just
point out some problems now:

1. "compound_selector" is not defined. If I understand this correctly,
it should read "simple_selector_sequence".
2. "selector [ COMMA S* selector ]*" should read "selector [ S* COMMA
S* selector ]*"
3. If selectors_group is allowed to have S* on both sides, I think the
Selectors API spec no longer needs dom_selectors_group.
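
The effect of item 2 can be shown with a toy model. This is only a sketch: token sequences are written as lists of token names, "selector" stands in for an already parsed selector, and the two regular expressions and the helper `matchesGroup` are illustrative assumptions rather than spec text.

```javascript
// Toy model of the two productions over space-separated token names.
// CURRENT_GROUP:  selector [ COMMA S* selector ]*
// PROPOSED_GROUP: selector [ S* COMMA S* selector ]*
const CURRENT_GROUP = /^selector(?: COMMA(?: S)* selector)*$/;
const PROPOSED_GROUP = /^selector(?:(?: S)* COMMA(?: S)* selector)*$/;

// Hypothetical helper: tests a token sequence against one of the grammars.
function matchesGroup(grammar, tokens) {
  return grammar.test(tokens.join(" "));
}

// "a , b" is modeled here as: selector S COMMA S selector
const spaced = ["selector", "S", "COMMA", "S", "selector"];
// "a, b" is modeled here as: selector COMMA S selector
const tight = ["selector", "COMMA", "S", "selector"];
```

In this model only the proposed production accepts `spaced`, while both accept `tight`.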

[1] http://dev.w3.org/2006/webapi/selectors-api2/#selector-string
[2] http://www.w3.org/TR/CSS2/syndata.html#parsing-errors
[3] http://dev.w3.org/csswg/selectors4/#invalid
[4] http://dev.w3.org/csswg/selectors4/#nth-child-pseudo
[5] http://dev.w3.org/csswg/selectors4/#formal-syntax


Cheers,
Kenny

Received on Saturday, 4 February 2012 01:52:29 UTC