Re: [css3-syntax] Reviving the spec, starting with the parser

From: Tab Atkins Jr. <jackalmage@gmail.com>
Date: Wed, 11 Apr 2012 22:36:53 -0700
Message-ID: <CAAWBYDDOFp1O+dUSYsFSqmHBrBa5AYNSEudk3yOXdJKYEJhe4Q@mail.gmail.com>
To: www-style list <www-style@w3.org>

Some additional technical details about the tokenizer that may be of interest.

The tokenizer uses 3 characters of lookahead.

It is *almost* stateless - if you implement it as a scanner that emits
one token per invocation, the only state it has to keep track of is a
single "reconsume" character and its current index into the
bytestream.  It always returns to the "data state" after emitting a
token, so the parsing algorithm can begin anew.  (If you're okay with
it sometimes returning multiple tokens, you can even drop the
"reconsume" character.)
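
To make the "almost stateless" point concrete, here is a minimal sketch in
Python.  It is not the spec's tokenizer: the token kinds (ident, number,
whitespace, delim) and the names Scanner, peek, next_token and tokenize are
all made up for illustration.  What it tries to show is that the only state
carried between calls is an index and one "reconsume" character, and that
every call starts over from the top.

```python
from typing import List, Optional, Tuple

Token = Tuple[str, str]  # (kind, text); the kinds are illustrative only


class Scanner:
    """Toy scanner whose only cross-call state is `pos` and `reconsume`."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.pos = 0                           # current index into the input
        self.reconsume: Optional[str] = None   # one pushed-back character

    def _next(self) -> Optional[str]:
        # Hand back the pushed-back character first, if there is one.
        if self.reconsume is not None:
            ch, self.reconsume = self.reconsume, None
            return ch
        if self.pos >= len(self.text):
            return None
        ch = self.text[self.pos]
        self.pos += 1
        return ch

    def peek(self, n: int = 3) -> str:
        # Look at up to n upcoming characters without consuming any.
        pending = self.reconsume or ""
        return (pending + self.text[self.pos:self.pos + n])[:n]

    def next_token(self) -> Optional[Token]:
        # Each call begins afresh, like returning to the "data state".
        ch = self._next()
        if ch is None:
            return None
        if ch.isspace():
            kind, accept = "whitespace", str.isspace
        elif ch.isdigit():
            kind, accept = "number", str.isdigit
        elif ch.isalpha():
            kind, accept = "ident", str.isalnum
        else:
            return ("delim", ch)
        chars = [ch]
        while True:
            ch = self._next()
            if ch is None:
                break
            if not accept(ch):
                self.reconsume = ch  # save it for the next invocation
                break
            chars.append(ch)
        return (kind, "".join(chars))


def tokenize(text: str) -> List[Token]:
    # Drive the scanner one token per call until the input is exhausted.
    scanner = Scanner(text)
    tokens: List[Token] = []
    while (tok := scanner.next_token()) is not None:
        tokens.append(tok)
    return tokens
```

Calling tokenize("ab1 42;") walks the input one token per invocation.  In
this toy version, the character that ends a token is stashed in `reconsume`
and picked up again at the start of the next call; a variant that emitted
the following token right away would not need that slot.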

Received on Thursday, 12 April 2012 05:37:48 UTC
