Re: [css3-syntax] Preference for parser speccing?

From: Kang-Hao (Kenny) Lu <kennyluck@csail.mit.edu>
Date: Fri, 01 Jun 2012 03:43:24 +0800
Message-ID: <4FC7C9DC.8020200@csail.mit.edu>
To: "Tab Atkins Jr." <jackalmage@gmail.com>
CC: WWW Style <www-style@w3.org>

(12/05/26 6:08), Tab Atkins Jr. wrote:
> This is just a general preference question for implementors.
> When I pick up Syntax again, would it be better for me to write the
> parsing section as if the tokenizing was already completely done,
> or interleaved with the tokenizing like the HTML parser is?  What
> would be more useful?  I can do either, but I'd rather not have to
> switch partway through, or even after I'm totally done.

Would you mind sharing an example of the choices here? Or is this a
general survey about how we should write the parser section?

(12/05/26 6:23), Tab Atkins Jr. wrote:
> HTML, for example, is defined with the tokenizing interleaved with the
> parsing.  This is necessary for HTML, because some tags change the
> tokenizing rules - if you see a <script>, you stop parsing like HTML
> and instead just parse everything as text until you see the </script>,

Note that when you say "interleaved" here, the tokenizing section of the
HTML spec uses phrases like "Emit the XXX token" instead of "call parser
routine YYY", so it still reads quite cleanly to me (though whether that
wording leads people to assume that the parser never changes the state of
the tokenizer is another issue).
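
To make the "Emit the XXX token" style concrete, here is a minimal sketch
in Python. It is not taken from either spec: the token shapes, the toy
grammar, and the names tokenize and parse_declaration are all assumptions
made up for illustration. The point is only that the tokenizer yields
tokens without ever naming a parser routine.

```python
from typing import Iterator, List, Tuple

Token = Tuple[str, str]  # (kind, text); a hypothetical token shape


def tokenize(source: str) -> Iterator[Token]:
    """Emit tokens one at a time without knowing who consumes them."""
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1  # skip whitespace in this toy grammar
        elif ch.isalpha() or ch == "-":
            start = i
            while i < len(source) and (source[i].isalnum() or source[i] == "-"):
                i += 1
            yield ("ident", source[start:i])  # "Emit the ident token"
        else:
            i += 1
            yield ("delim", ch)  # "Emit the delim token"


def parse_declaration(tokens: Iterator[Token]) -> Tuple[str, List[str]]:
    """Consume emitted tokens; the tokenizer never calls into this code."""
    kind, name = next(tokens)
    if kind != "ident":
        raise ValueError("expected a property name")
    if next(tokens) != ("delim", ":"):
        raise ValueError("expected ':' after the property name")
    # Everything up to the end of input, minus the ';', is the value.
    values = [text for k, text in tokens if (k, text) != ("delim", ";")]
    return name, values
```

With this shape, parse_declaration(tokenize("color: red;")) pulls tokens
on demand, and the tokenizing rules can be read on their own.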

It is true that the HTML parser changes the state of the tokenizer from
time to time during tree construction, but given that the CSS parser
mostly (except :nth-*) doesn't change the state of the tokenizer, I
can't quite imagine how you could write this in an interleaved way
without a concrete example...
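
For comparison, the HTML-style feedback described above can be sketched
roughly as follows. This is a toy in Python, not the HTML spec's
algorithm: the Tokenizer class, its state and raw_end fields, and the
simplified tag regex are all hypothetical. It only shows the parser
switching the tokenizer's state between two emitted tokens.

```python
import re
from typing import Iterator, List, Tuple

Token = Tuple[str, str]  # (kind, text); a hypothetical token shape
TAG = re.compile(r"<(/?)([a-z]+)>")  # toy tag syntax, no attributes


class Tokenizer:
    def __init__(self, source: str) -> None:
        self.source = source
        self.pos = 0
        self.state = "data"  # the parser may overwrite this between tokens
        self.raw_end = ""    # end tag that terminates raw-text mode

    def tokens(self) -> Iterator[Token]:
        while self.pos < len(self.source):
            if self.state == "rawtext":
                # Treat everything up to the end tag as plain text.
                end = self.source.find(self.raw_end, self.pos)
                if end == -1:
                    end = len(self.source)
                text = self.source[self.pos:end]
                self.pos = end
                self.state = "data"
                if text:
                    yield ("text", text)
                continue
            match = TAG.match(self.source, self.pos)
            if match:
                self.pos = match.end()
                yield ("end" if match.group(1) else "start", match.group(2))
            else:
                nxt = self.source.find("<", self.pos + 1)
                if nxt == -1:
                    nxt = len(self.source)
                text = self.source[self.pos:nxt]
                self.pos = nxt
                yield ("text", text)


def parse(source: str) -> List[Token]:
    """Toy 'tree construction' that just records the tokens it receives."""
    tokenizer = Tokenizer(source)
    seen: List[Token] = []
    for token in tokenizer.tokens():
        seen.append(token)
        if token == ("start", "script"):
            # The parser, not the tokenizer, decides to switch modes.
            tokenizer.state = "rawtext"
            tokenizer.raw_end = "</script>"
    return seen
```

Here the "<b>" inside the script element comes out as text only because
parse changed the tokenizer's state after seeing the script start tag. A
CSS parser that rarely needs this kind of feedback could run over a
finished token list instead.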

Received on Thursday, 31 May 2012 19:44:11 UTC
