
Re: [csswg-drafts] [css-syntax] question: about ident-like URL consumption (#5416)

From: Tab Atkins Jr. via GitHub <sysbot+gh@w3.org>
Date: Tue, 11 Aug 2020 15:57:12 +0000
To: public-css-archive@w3.org
Message-ID: <issue_comment.created-672041799-1597161430-sysbot+gh@w3.org>
The slightly awkward algorithm ensures that, if the tokenizer turns out to need to emit a function-token and there was whitespace between the `(` and the `"`, it leaves *one* character of whitespace unconsumed. The tokenizer picks that character up on the next pass and emits a whitespace token.

The tokenizer already collapses runs of adjacent whitespace into a single whitespace token, so the fact that I consumed a bunch of whitespace characters as part of producing the preceding token isn't observable. The benefit of this is that I don't need to do arbitrary lookahead from the `(` to discover if, after an arbitrary number of whitespace characters, I eventually run into a `"`; instead I only need to look two characters ahead.
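
As a rough illustration of the two-character lookahead described above, here is a minimal Python sketch. It is not the spec text or any browser's implementation: the names `classify_after_url_paren`, `char_at`, `is_ws`, and `is_quote` are hypothetical, the whitespace set is simplified to space, tab, and newline, and the function only decides which token kind to produce instead of actually consuming a url-token.

```python
WHITESPACE = " \t\n"   # simplified whitespace set (assumption for this sketch)
QUOTES = "\"'"


def char_at(text, i):
    """Return the character at index i, or None past the end of input."""
    return text[i] if i < len(text) else None


def is_ws(c):
    return c is not None and c in WHITESPACE


def is_quote(c):
    return c is not None and c in QUOTES


def classify_after_url_paren(text, pos):
    """`pos` is the index just past the `(` of `url(`.

    Returns (kind, new_pos), where kind is "function-token" or "url-token"
    and new_pos is where tokenizing would resume.
    """
    # Consume whitespace only while the next TWO characters are both
    # whitespace, so at most one whitespace character is left behind.
    while is_ws(char_at(text, pos)) and is_ws(char_at(text, pos + 1)):
        pos += 1

    first = char_at(text, pos)
    second = char_at(text, pos + 1)

    # A quote, or one whitespace character followed by a quote, means this
    # is an ordinary function call: url("...") or url(   "...").
    if is_quote(first) or (is_ws(first) and is_quote(second)):
        return ("function-token", pos)

    # Otherwise the contents are an unquoted URL.
    return ("url-token", pos)


kind, resume = classify_after_url_paren('url(   "a.png")', 4)
print(kind, repr('url(   "a.png")'[resume]))
```

In the quoted case with leading whitespace, the position returned points at the single remaining whitespace character, which a subsequent tokenizer pass would turn into a whitespace token.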

(Overall, the tokenizer requires three characters of lookahead, and the parser requires one token of lookahead; keeping that minimal is good for the efficiency of implementations.)

-- 
GitHub Notification of comment by tabatkins
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/5416#issuecomment-672041799 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config
Received on Tuesday, 11 August 2020 15:57:14 UTC
