Re: [csswg-drafts] [css-selectors-4] Include whitespace in non-optional `<combinator>` (#7027)

Thanks, I'm closing this issue then.

Just to be sure I understood correctly, and to explain the issue a bit better than in my initial comment (3 days later I had a hard time figuring it out myself, sorry):

```
Input: `svg|*`

 1. Match `svg|*` vs. `<complex-selector> = <compound-selector> [<combinator>? <compound-selector>]*`
 2. Match `svg|*` vs. ...sub-productions that finally include `<type-selector>`
 3. Match `svg|*` vs. `<type-selector> = <wq-name> | <ns-prefix>? '*'`
 4. Match `svg|*` vs. `<wq-name> = <ns-prefix>? <ident-token>`
 5. Match `svg|*` vs. `<ns-prefix> = [ <ident-token> | '*' ]? '|'`
    - result: `svg|`
    - resume 4
 6. Match `*` vs. `<ident-token>`
    - result: fails
    - backtrack to 4 and discard the match for `<ns-prefix>?` (omitted)
 7. Match `svg|*` vs. `<ident-token>`
    - result: `svg`
    - resume 4 (end) then 3, then 2, then 1
 8. Match `|*` vs. `<combinator>?`
    - result: fails but optional, resume 1
 9. Match `|*` vs. ...sub-productions that finally include `<type-selector>`
10. Match `|*` vs. `<type-selector> = <wq-name> | <ns-prefix>? '*'`
11. Match `|*` vs. `<wq-name> = <ns-prefix>? <ident-token>`
12. Match `|*` vs. `<ns-prefix> = [ <ident-token> | '*' ]? '|'`
    - result: `|` (omitted namespace prefix)
    - resume 11
13. Match `*` vs. `<ident-token>`
    - result: fails
    - backtrack to 10 and discard the match for `<ns-prefix>?`
14. Match `|*` vs. `<ns-prefix> = [ <ident-token> | '*' ]? '|'`
    - result: `|` (omitted namespace prefix)
    - resume 10
15. Match `*` vs. `'*'`
    - result: `*`

Results: 
  - `<compound-selector>` matches `svg`
  - `<combinator>?` is omitted
  - `<compound-selector>` matches `|*`
```

But because CSS matching against a grammar is implicitly defined to obey longest-match, at step 7 the parser should instead go on to try the second alternative of `<type-selector>` (the `<ns-prefix>? '*'` in `<wq-name> | <ns-prefix>? '*'`), even though a match for `<wq-name>` was already found. If `<ns-prefix>? '*'` fails, it falls back to its initial match for `<wq-name>`, right?
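To make the difference concrete, here is a minimal sketch of the two strategies applied to `<type-selector>`. It is not the spec's algorithm: the combinator helpers (`seq`, `alt`, `opt`, `lit`, `ident`) are hypothetical names of mine, the input is assumed to be pre-split into the three tokens `svg`, `|`, `*`, and an ident is approximated with Python's `str.isidentifier()`. Each matcher yields every end position it can reach, in the order the alternatives are written.

```python
def ident(tokens, i):
    # Approximation of <ident-token>: any token that looks like an identifier.
    if i < len(tokens) and tokens[i].isidentifier():
        yield i + 1

def lit(s):
    def m(tokens, i):
        if i < len(tokens) and tokens[i] == s:
            yield i + 1
    return m

def seq(*ms):
    def m(tokens, i):
        if not ms:
            yield i
            return
        rest = seq(*ms[1:])
        for j in ms[0](tokens, i):
            yield from rest(tokens, j)
    return m

def alt(*ms):
    def m(tokens, i):
        for x in ms:              # alternatives tried in written order
            yield from x(tokens, i)
    return m

def opt(x):
    def m(tokens, i):
        yield from x(tokens, i)   # try to match first...
        yield i                   # ...then the "omitted" case
    return m

# <ns-prefix> = [ <ident-token> | '*' ]? '|'
ns_prefix = seq(opt(alt(ident, lit("*"))), lit("|"))
# <wq-name> = <ns-prefix>? <ident-token>
wq_name = seq(opt(ns_prefix), ident)
# <type-selector> = <wq-name> | <ns-prefix>? '*'
type_selector = alt(wq_name, seq(opt(ns_prefix), lit("*")))

tokens = ["svg", "|", "*"]
ends = list(type_selector(tokens, 0))

first_match = "".join(tokens[:ends[0]])      # stop at the first success
longest_match = "".join(tokens[:max(ends)])  # consider every alternative

print(first_match)    # svg
print(longest_match)  # svg|*
```

Stopping at the first success reproduces the trace above (`<wq-name>` matches `svg` and `|*` is left for a second compound selector), whereas collecting every alternative and keeping the longest one gives the single `svg|*` type selector.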

I feel like this is fundamental and probably missing from the spec, as is what [these](https://github.com/w3c/csswg-drafts/issues/2921#issuecomment-902187106) [comments](https://github.com/w3c/csswg-drafts/issues/2921#issuecomment-902975958) say:

> parsing is non-greedy; if the first branch that starts to match eventually fails, you just move on to the second branch and try again

> Right, that's a backtracking parser vs a greedy/first-match parser. CSS grammars are intended for use with a backtracking parser.

-- 
GitHub Notification of comment by cdoublev
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/7027#issuecomment-1043924220 using your GitHub account



Received on Friday, 18 February 2022 05:30:00 UTC