- From: Tab Atkins Jr. <jackalmage@gmail.com>
- Date: Fri, 25 May 2012 15:23:34 -0700
- To: Sylvain Galineau <sylvaing@microsoft.com>
- Cc: www-style list <www-style@w3.org>
On Fri, May 25, 2012 at 3:14 PM, Sylvain Galineau <sylvaing@microsoft.com> wrote:
> [Tab Atkins Jr.:]
>> This is just a general preference question for implementors.
>>
>> When I pick up Syntax again, would it be better for me to write the
>> parsing section as if the tokenizing was already completely done, or
>> interleaved with the tokenizing like the HTML parser is? What would be
>> more useful? I can do either, but I'd rather not have to switch partway
>> through, or even after I'm totally done.
>
> Not sure what you're asking. I think I'd rather implement a parsing algorithm
> based on a clear and unambiguous definition of what the tokens actually are.
> Recent threads suggest the latter may need some polishing?

Tokens are easy - the tokenizer is already done in Syntax. The question is what to do with the parsing step, which turns tokens into actual CSS structures and values.

HTML, for example, is defined with the tokenizing interleaved with the parsing. This is necessary for HTML, because some tags change the tokenizing rules - if you see a <script>, you stop tokenizing as HTML and instead just treat everything as text until you see the </script>, because what's between those tags simply isn't HTML, and you don't want to risk screwing it up by parsing it as HTML.

CSS *does* technically have this in one circumstance - the an+b syntax of all the :nth-*() pseudos doesn't directly correspond to the tokens of CSS. However, it's possible to solve this by "reversing" the tokens into something more meaningful, so you don't technically have to switch contexts.

So the question is simply, as someone implementing or maintaining a parser, which style is more useful to read?

~TJ
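
As a rough illustration of the an+b point above: one way to read "reversing" the tokens is to join their source text back into a string and parse that string with a small dedicated grammar, so the tokenizer never has to switch modes. The sketch below assumes that reading; it is not taken from the Syntax draft. The `Token` class, its `kind` and `text` fields, and `parse_an_plus_b` are hypothetical names, and the whitespace rules are simplified.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical token: a kind label plus the exact source text it came from."""
    kind: str
    text: str


# "an" with an optional signed "b", e.g. "2n+1", "-n + 3", "n-1".
_AN_PLUS_B = re.compile(r"^([+-]?\d*)n(?:\s*([+-])\s*(\d+))?$")
# A bare integer, e.g. "5" or "-2" (a is then 0).
_B_ONLY = re.compile(r"^[+-]?\d+$")


def parse_an_plus_b(tokens):
    """Return (a, b) for the argument of an :nth-*() pseudo-class.

    The tokens are first "reversed" into their source text, so the result
    does not depend on how the tokenizer happened to split the input
    (for instance DIMENSION "2n" + NUMBER "+1" versus IDENT "n-1").
    """
    source = "".join(t.text for t in tokens).strip().lower()

    if source == "odd":
        return (2, 1)
    if source == "even":
        return (2, 0)
    if _B_ONLY.match(source):
        return (0, int(source))

    m = _AN_PLUS_B.match(source)
    if m is None:
        raise ValueError(f"not a valid an+b expression: {source!r}")

    a_text, sign, digits = m.groups()
    # "", "+" and "-" mean a coefficient of 1, +1 and -1 respectively.
    a = int(a_text + "1") if a_text in ("", "+", "-") else int(a_text)
    b = int(sign + digits) if sign else 0
    return (a, b)


if __name__ == "__main__":
    # "2n+1" as a dimension followed by a signed number.
    print(parse_an_plus_b([Token("dimension", "2n"), Token("number", "+1")]))
    # "n-1" as a single identifier.
    print(parse_an_plus_b([Token("ident", "n-1")]))
```

The design choice here is that the parser proper only ever sees ordinary CSS tokens; the an+b microsyntax is handled as a post-processing step on one function's argument.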
Received on Friday, 25 May 2012 22:24:23 UTC