Re: [csswg-drafts] [css-syntax] question: about ident-like URL consumption (#5416)

The slightly awkward algorithm ensures that, if the tokenizer ends up needing to emit a function-token and there was whitespace between the `(` and the `"`, it leaves exactly *one* whitespace character unconsumed, so the next pass can pick it up and emit a whitespace token.

The tokenizer already collapses runs of adjacent whitespace into a single whitespace token, so the fact that I consumed a bunch of whitespace characters as part of producing the preceding token isn't observable. The benefit of this is that I don't need to do arbitrary lookahead from the `(` to discover if, after an arbitrary number of whitespace characters, I eventually run into a `"`; instead I only need to look two characters ahead.
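As a rough illustration of the two-character lookahead described above, here is a minimal Python sketch. It is not the spec text or any browser's implementation: the names `after_url_paren`, `WHITESPACE`, and `QUOTES` are made up for this example, and it assumes the input has already been preprocessed so that whitespace is only space, tab, or newline. It handles `'` the same way as `"`.

```python
# Hypothetical helper names; a sketch of the lookahead logic only.
WHITESPACE = frozenset({" ", "\t", "\n"})
QUOTES = frozenset({'"', "'"})


def after_url_paren(src: str, pos: int) -> tuple[str, int]:
    """Decide what to emit after `url(`; `pos` is the index just past the `(`.

    Returns (kind, new_pos), where kind is "function" (emit a function-token
    and let the quoted string be tokenized separately) or "url" (go on to
    consume an unquoted url-token).
    """

    def peek(offset: int) -> str:
        # Returns "" past the end of input, which is in neither set.
        i = pos + offset
        return src[i] if i < len(src) else ""

    # While the next TWO characters are both whitespace, consume one.
    # This never looks further than two characters ahead, and it stops
    # with at most one whitespace character still unconsumed.
    while peek(0) in WHITESPACE and peek(1) in WHITESPACE:
        pos += 1

    # A quote is now either the next character, or the one right after
    # a single remaining whitespace character.
    if peek(0) in QUOTES or (peek(0) in WHITESPACE and peek(1) in QUOTES):
        return "function", pos
    return "url", pos


# Three spaces before the quote: two are swallowed, one is left behind
# for the next pass to turn into a whitespace token.
kind, pos = after_url_paren('url(   "a")', 4)
print(kind, repr('url(   "a")'[pos:]))
```

In the quoted case the returned position points at the single leftover whitespace character, which is what lets the next pass emit an ordinary whitespace token, so the result looks the same as if the whole run had been left in place and collapsed.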

(Overall, the tokenizer requires three characters of lookahead, and the parser requires one token of lookahead; keeping that minimal is good for the efficiency of implementations.)

-- 
GitHub Notification of comment by tabatkins
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/5416#issuecomment-672041799 using your GitHub account


Received on Tuesday, 11 August 2020 15:57:14 UTC