Re: [css3-syntax] The "transform function whitespace" flag eats too much whitespace

On Sun, Jan 20, 2013 at 12:22 AM, Simon Sapin <simon.sapin@kozea.fr> wrote:
> Hi,
>
> The /transform function whitespace/ flag changes the tokenizer so that
> `name (` is a single FUNCTION token instead of IDENT WS (.
>
> With `foo bar` however, the current ED’s state machine gives IDENT IDENT
> while it should give IDENT WS IDENT.
>
>
>> 3.3.14. Transform-function-whitespace state
>>
>> Consume the next input character.
>>
>> whitespace
>>     Remain in this state.
>> U+0028 LEFT PARENTHESIS (()
>>     Emit a function token with its value set to the identifier token's
>> value. Switch to the data state.
>> anything else
>>     Emit the ident token. Switch to the data state. Reconsume the current
>> input character.
>
>
> In the "anything else" case, the current input character (`b` in the `foo
> bar` example) is correctly reconsumed. But at this point all the whitespace
> is already consumed, so a WS token will be missing.
>
> Possible fixes:
>
> * Go back/reconsume one more character (which will be a whitespace
> character)
> * Emit a WS token after the ident.

Fixed it in a slightly different way, since neither of those options
is allowed under the self-imposed rules I've set for the tokenizer (you
can only reconsume the current character, and you can only emit one
token before returning to the data state).

Instead, I run the entire state in look-ahead mode, so I can see when
I'm about to hit something that's not a parenthesis, emit the pending
ident token, and still have the whitespace in the current input
character so it can be reconsumed by the data state.
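
For illustration, here is a minimal Python sketch of that look-ahead
idea. It is not the spec's state machine: the `tokenize` function, the
token names, and the simplified ident rule are all assumptions made up
for this example. The point is only that the whitespace after an ident
is peeked at rather than consumed, so it survives as a WS token when no
parenthesis follows.

```python
WHITESPACE = " \t\n\r\f"


def tokenize(text, transform_function_whitespace=True):
    """Tokenize a tiny, hypothetical subset: idents, whitespace, '(' and delims."""
    tokens = []
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch.isalpha() or ch == "_":
            # Consume the ident itself (simplified rule, not the real CSS one).
            start = i
            while i < n and (text[i].isalnum() or text[i] in "_-"):
                i += 1
            name = text[start:i]

            # Look ahead past any whitespace WITHOUT moving `i`.
            j = i
            if transform_function_whitespace:
                while j < n and text[j] in WHITESPACE:
                    j += 1

            if j < n and text[j] == "(":
                # Only now commit to consuming the whitespace and the paren.
                tokens.append(("FUNCTION", name))
                i = j + 1
            else:
                # Emit the pending ident; the whitespace is still unconsumed,
                # so the main loop turns it into a WS token.
                tokens.append(("IDENT", name))
        elif ch in WHITESPACE:
            while i < n and text[i] in WHITESPACE:
                i += 1
            tokens.append(("WS", " "))
        elif ch == "(":
            tokens.append(("(", "("))
            i += 1
        else:
            tokens.append(("DELIM", ch))
            i += 1
    return tokens
```

In this sketch `foo bar` comes out as IDENT WS IDENT, while `name (`
still collapses into a single FUNCTION token when the flag is on.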

~TJ

Received on Sunday, 20 January 2013 19:31:02 UTC