Re: [css3-syntax] Critique of Feb 15 draft

On 24/02/2013 05:32, Tab Atkins Jr. wrote:
> In a string, a normal newline is invalid, but you can escape a newline
> to include it in the string.  I think it makes the spec a little
> simpler to collapse \r\n into a single newline char, so I don't have
> to push those details into the string states.  That's all.  I could
> simplify the preprocessing to only convert \r\n specifically into a
> \n, as it's not actually necessary to do anything with a lone \r.
> Thoughts?

Apparently the starting point was: "Why is \r converted but not \f?" How 
about doing one of these, for consistency?

* Not converting a lone \r. Newline pre-processing only affects \r\n.
* Or, also convert \f to \n.

(Of course, implementations are free not to have a separate 
pre-processing pass and to do the same work inline in the tokenizer.)
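
To make the two options concrete, here is a minimal sketch in Python. The
function name and its flags are hypothetical and not taken from the draft;
they only illustrate "convert \r\n alone" versus "also convert \r and \f".

```python
def preprocess_newlines(css, convert_lone_cr=False, convert_ff=False):
    """Newline pre-processing pass (illustrative sketch, not spec text).

    With both flags off, only the two-character sequence \\r\\n is
    collapsed into \\n (the first option). The flags are hypothetical
    knobs showing what additionally converting \\r and \\f would mean.
    """
    # \r\n must be handled first so that it becomes one newline, not two.
    css = css.replace("\r\n", "\n")
    if convert_lone_cr:
        css = css.replace("\r", "\n")
    if convert_ff:
        css = css.replace("\f", "\n")
    return css
```

With the defaults, the tokenizer still has to accept \r and \f as newlines
itself; with both flags on, it only ever sees \n.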


>> §4: The algorithm in this section might benefit from conversion to a
>> recursive-descent presentation as was already done for the parser (§5).
> I'm not certain.  Switching to a RC parser was great for the Parsing
> section because those types of parsers are good for nested structures,
> which is most of CSS.  Tokens only *barely* have a concept of nesting,
> in things like the fact that both types of strings have nearly the
> same internals, or url tokens contain strings.
>
> It would save some duplication, but I could probably also do that just
> by defining some more functions.

It’s clearly not recursive-descent since there is no recursion. Still, I 
think the "functions calling each other" style can be more readable than 
a state machine.

We don’t have to switch everything at once. Can we start with a "Consume a 
quoted string" function used for both string tokens and url tokens? 
Later we can discuss a "Consume an identifier" function. (Although hash 
tokens are a bit tricky.) Then some other part of the tokenizer. And 
before you know it, the state machine will not be needed anymore ;)
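
As a rough illustration of the "functions calling each other" style, here
is a sketch of such a function in Python. The name, the (value, position)
return convention and the simplifications are assumptions for the example,
not the draft's algorithm: hex escapes are left out, and an escaped newline
is dropped as a line continuation, as in CSS 2.1.

```python
NEWLINES = "\n\r\f"


def consume_quoted_string(css, pos):
    """Consume a quoted string whose opening quote is at css[pos].

    Returns (value, new_pos). value is None for a bad string, that is,
    one interrupted by an unescaped newline. Illustrative sketch only.
    """
    quote = css[pos]
    pos += 1
    chunks = []
    while pos < len(css):
        c = css[pos]
        if c == quote:
            # Closing quote: the string is complete.
            return "".join(chunks), pos + 1
        if c in NEWLINES:
            # Unescaped newline: bad string. The newline is not consumed.
            return None, pos
        if c == "\\":
            pos += 1
            if pos >= len(css):
                break  # Backslash at end of input: nothing to escape.
            if css[pos] in NEWLINES:
                # Escaped newline (assumption: dropped, as in CSS 2.1).
                # An un-preprocessed \r\n counts as a single newline.
                pos += 2 if css.startswith("\r\n", pos) else 1
                continue
            # Simplification: any other escaped character stands for
            # itself; hex escapes such as \41 are not decoded here.
            c = css[pos]
        chunks.append(c)
        pos += 1
    # End of input before the closing quote: return what was read.
    return "".join(chunks), pos
```

Both the string-token code and the url-token code could call this when
they see a quote character, instead of each carrying its own copy of the
double-quote and single-quote states.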

-- 
Simon Sapin

Received on Sunday, 24 February 2013 08:14:57 UTC