[css-syntax] Moving away from a state machine (was: Ready for wide review, FPWD request coming soon)

On 20/05/2013 08:04, Zack Weinberg wrote:
> Attached is an edited version of the 17 May draft with the
> number/percentage/dimension tokenization rewritten as I had in mind.  It
> could probably stand a little polish, but I think this should give you
> an idea of what it would be like.  This reflects about four hours'
> effort and I think that was more than half of the work required to
> convert the whole scanner to this style.  (CSS numbers are messy.)
>
> I wound up partially converting identifier scanning as well, because it
> was convenient to have a "consume a sequence of name characters"
> subroutine to handle DIMENSIONs, and once you have that, using it for
> IDENT, HASH and AT-KEYWORD as well is trivial.

I am definitely in favor of switching the tokenizer to the same style as 
the parser, that is, a bunch of "functions" calling each other rather 
than a state machine. This is easier to think about, and can actually 
make the spec simpler by removing quite a bit of duplication:

* String and URL tokens can call the same "Consume a quoted string" 
function.

* Ident, hash, at-keyword and dimension tokens can call the same 
"Consume a 'name'" function, where a name is made of "name characters" 
and escapes. (The restrictions on identifiers are checked before this 
function is called.)
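
To make the two points above concrete, here is a rough sketch of the 
"functions calling each other" style. It is not spec text and not taken 
from the attached draft: the function names, the token tuples, the 
integer-only numbers and the simplified escape handling (a backslash 
just takes the next character literally) are all assumptions made for 
illustration.

```python
from typing import List, Tuple

DIGITS = "0123456789"


def is_name_char(ch: str) -> bool:
    # Simplified "name character": letters, digits, "_", "-", non-ASCII.
    return ch.isalnum() or ch in "_-" or ord(ch) >= 0x80


def consume_name(text: str, pos: int) -> Tuple[str, int]:
    """Shared by IDENT, HASH, AT-KEYWORD and DIMENSION units."""
    chars = []
    while pos < len(text):
        ch = text[pos]
        if is_name_char(ch):
            chars.append(ch)
            pos += 1
        elif ch == "\\" and pos + 1 < len(text) and text[pos + 1] != "\n":
            # Assumption: an escape is a backslash plus one literal character.
            chars.append(text[pos + 1])
            pos += 2
        else:
            break
    return "".join(chars), pos


def consume_quoted_string(text: str, pos: int) -> Tuple[str, int]:
    """Shared by STRING and URL; text[pos] is the opening quote."""
    quote = text[pos]
    pos += 1
    chars = []
    while pos < len(text) and text[pos] != quote:
        if text[pos] == "\\" and pos + 1 < len(text):
            pos += 1  # skip the backslash, keep the escaped character
        chars.append(text[pos])
        pos += 1
    # Step over the closing quote (or stop at end of input).
    return "".join(chars), min(pos + 1, len(text))


def tokenize(text: str) -> List[tuple]:
    tokens: List[tuple] = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch.isspace():
            pos += 1
        elif ch in ('"', "'"):
            value, pos = consume_quoted_string(text, pos)
            tokens.append(("STRING", value))
        elif ch in ("#", "@"):
            name, end = consume_name(text, pos + 1)
            if name:
                tokens.append(("HASH" if ch == "#" else "AT-KEYWORD", name))
                pos = end
            else:
                tokens.append(("DELIM", ch))
                pos += 1
        elif ch in DIGITS:
            start = pos
            while pos < len(text) and text[pos] in DIGITS:
                pos += 1
            number = text[start:pos]
            unit, pos = consume_name(text, pos)
            tokens.append(("DIMENSION", number, unit) if unit else ("NUMBER", number))
        elif is_name_char(ch):
            name, pos = consume_name(text, pos)
            if (name.lower() == "url" and text[pos:pos + 1] == "("
                    and text[pos + 1:pos + 2] in ('"', "'")):
                value, pos = consume_quoted_string(text, pos + 1)
                if text[pos:pos + 1] == ")":
                    pos += 1
                tokens.append(("URL", value))
            else:
                tokens.append(("IDENT", name))
        else:
            tokens.append(("DELIM", ch))
            pos += 1
    return tokens
```

The point is only the shape: each token kind is a branch that calls a 
shared subroutine, so the name and quoted-string rules are each written 
once. A real tokenizer would also need proper number syntax, hex 
escapes, unquoted URLs, comments and error recovery.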


> I suspect the table of contents and index are mangled.  Are they
> autogenerated somehow?

The sources of the editor’s drafts are version-controlled in Mercurial 
and have a Git mirror:

https://dvcs.w3.org/hg/csswg/
https://github.com/w3c/csswg-drafts

Unfortunately, the generator that builds the table of contents and index is a Member-only web service:
http://wiki.csswg.org/tools/spec-processor

-- 
Simon Sapin

Received on Thursday, 30 May 2013 09:57:57 UTC