- From: Tab Atkins Jr. <jackalmage@gmail.com>
- Date: Sun, 24 Feb 2013 10:51:06 -0800
- To: Simon Sapin <simon.sapin@kozea.fr>
- Cc: Zack Weinberg <zackw@panix.com>, www-style list <www-style@w3.org>
On Sun, Feb 24, 2013 at 12:14 AM, Simon Sapin <simon.sapin@kozea.fr> wrote:
> On 24/02/2013 05:32, Tab Atkins Jr. wrote:
>> In a string, a normal newline is invalid, but you can escape a newline
>> to include it in the string. I think it makes the spec a little
>> simpler to collapse \r\n into a single newline char, so I don't have
>> to push those details into the string states. That's all. I could
>> simplify the preprocessing to only convert \r\n specifically into a
>> \n, as it's not actually necessary to do anything with a lone \r.
>> Thoughts?
>
> Apparently the starting point was: "Why is \r converted but not \f?" How
> about doing one of these, for consistency?
>
> * Not converting a lone \r. Newline pre-processing only affects \r\n.
> * Or, also convert \f to \n.
>
> (Of course, implementations are free not to have a separate pre-processing
> pass and do the same work inline in the tokenizer.)

I'm fine with doing either, with a preference toward the first bullet
if I change anything. Zack, opinions?

>>> §4: The algorithm in this section might benefit from conversion to a
>>> recursive-descent presentation as was already done for the parser (§5).
>>
>> I'm not certain. Switching to a recursive-descent parser was great for
>> the Parsing section because those types of parsers are good for nested
>> structures, which is most of CSS. Tokens only *barely* have a concept
>> of nesting, in things like the fact that both types of strings have
>> nearly the same internals, or url tokens contain strings.
>>
>> It would save some duplication, but I could probably also do that just
>> by defining some more functions.
>
> It’s clearly not recursive-descent since there is no recursion. Still, I
> think the "functions calling each other" style can be more readable than a
> state machine.
>
> We don’t have to switch everything at once. Can we start with a "Consume a
> quoted string" function used for both string tokens and url tokens? Later
> we’ll discuss a "Consume an identifier" function. (Although hash tokens
> are a bit tricky.) Then some other part of the tokenizer. And before you
> know it, the state machine will not be needed anymore ;)

Yeah, I've come around to this. I'm starting with the recognition
functions, but I'll add some consuming functions as I go along.

~TJ
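
For illustration, the two preprocessing options and the shared "Consume a
quoted string" helper discussed above could look roughly like the sketch
below. It is Python, not spec text, and rests on several assumptions: all
function names are hypothetical, hex escapes are left out, an escaped
newline is treated as a line continuation and dropped from the value, and
hitting the end of input simply returns what was collected so far.

```python
from typing import Optional, Tuple


def preprocess_crlf_only(text: str) -> str:
    """First bullet: only collapse \\r\\n into \\n; a lone \\r is untouched."""
    return text.replace("\r\n", "\n")


def preprocess_all_newlines(text: str) -> str:
    """Second bullet: also map a lone \\r and \\f to \\n, for consistency."""
    return text.replace("\r\n", "\n").replace("\r", "\n").replace("\f", "\n")


def consume_quoted_string(text: str, pos: int) -> Tuple[Optional[str], int]:
    """Consume a quoted string whose opening quote is at text[pos].

    Returns (value, new_pos). The value is None when an unescaped newline
    makes the string invalid; the newline itself is not consumed.
    A url tokenizer could call this same helper for a quoted url argument.
    """
    quote = text[pos]
    pos += 1
    out = []
    while pos < len(text):
        ch = text[pos]
        pos += 1
        if ch == quote:
            return "".join(out), pos
        if ch == "\n":
            # Unescaped newline: invalid string, leave the newline in place.
            return None, pos - 1
        if ch == "\\":
            if pos >= len(text):
                break
            nxt = text[pos]
            pos += 1
            # Assumption: backslash-newline is a continuation and adds nothing.
            # Any other escaped character is taken literally (no hex escapes).
            if nxt != "\n":
                out.append(nxt)
            continue
        out.append(ch)
    # End of input before the closing quote: return what was collected.
    return "".join(out), pos
```

Because the preprocessing has already turned \r\n into \n, the helper only
has to check for a single newline character after a backslash, which is the
simplification described in the quoted text. For example,
consume_quoted_string(preprocess_crlf_only('"a\\\r\nb"'), 0) goes through
the backslash-newline branch without any special case for \r.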
Received on Sunday, 24 February 2013 18:51:52 UTC