- From: Tab Atkins Jr. <jackalmage@gmail.com>
- Date: Fri, 1 Feb 2013 12:43:31 -0800
- To: Simon Sapin <simon.sapin@kozea.fr>
- Cc: www-style list <www-style@w3.org>
On Thu, Jan 31, 2013 at 11:44 PM, Simon Sapin <simon.sapin@kozea.fr> wrote:
> On 01/02/2013 03:30, Tab Atkins Jr. wrote:
>> I added a small algorithm for parsing an+b values at
>> <http://dev.w3.org/csswg/css3-syntax/#parse-anb-notation>. It's just
>> a "turn everything back into a string and reparse" algorithm.
>>
>> Does it look good? Alternately, I could do it by looking at the
>> original tokens, it's just a bit messier that way. There's 6
>> different ways a valid an+b can be tokenized, and one of them involves
>> a dimension token with a unit matching /n[+-]\d+/.
>
> It’s very good to have this in Syntax. Thanks!
>
> "Turn tokens back to a string and reparse" is not pretty, but as you say
> parsing from tokens is worse. And more likely to have some corner cases
> wrong.

Luckily the official grammar in Selectors, while wrong, is at least
very simple and limited. I think I've got all the token-based parsing
cases mapped out. But still, reparsing from a string is easier.

> Is the an+b notation used in anything other than :nth-child() and related
> Selectors?

So far, no. But we might use it elsewhere in the future.

> The various algorithms in §6. Parser Entry Points are assumed to parse from
> text. But for Selectors at least, an+b is in the arguments of an
> already-tokenized function so the input is a list of component values, not
> text. Although that does not make much difference as all the relevant tokens
> are preserved.

I tried to wordsmith so that they all really just assume a list of
tokens. If you have suggestions about how to phrase the intro better,
I'd appreciate it.

You're right that the an+b is likely already going to be done with a
list of component values, but as you point out, it doesn't matter for
the algorithm's purposes. I think I might keep it somewhat vague
here, so that it's valid to invoke it either normally or with a normal
token stream.

> On to the algorithm itself:
>
> Dimension tokens should also append their unit to the string. The
> "representation" is only that of the numeric part.

Ooh, thanks. Fixed. Also, idents have a value, not a representation.
I should maybe fix these, as I've made this mistake before. :/

> All whitespace tokens are ignored. This is not the case in the "nth" grammar
> from Selectors 3. In particular, no whitespace is allowed between a and its
> sign, nor between a and n. (Whitespace *is* allowed around the +/- sign
> after n, and around the whole an+b sequence.)
>
>   nth
>     : S* [ ['-'|'+']? INTEGER? {N} [ S* ['-'|'+'] S* INTEGER ]? |
>            ['-'|'+']? INTEGER | {O}{D}{D} | {E}{V}{E}{N} ] S*
>     ;

Damn, you're right. Hmm. I was *really* trying to avoid having to
deal with spaces between the signs and the numbers. I suppose I can
do whitespace-stripping late, so I can check the step and fail it if
there is any trailing whitespace. I'll look into how I want to do
this.

> For odd/even, the phrase used is "If repr contains an ASCII case-insensitive
> match for …" It should be "is" instead of "contains". :nth-child(Some
> oddities) should not match. ("Contains" is used that way later in the
> algorithm.)

Fixed, thanks.

~TJ
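
For reference, the whitespace rules in the quoted "nth" grammar from
Selectors 3 can be sketched as a single regular expression. The Python
below is only an illustration under stated assumptions, not the algorithm
in the Syntax draft: the names parse_nth, _WS, and _NTH are made up here,
INTEGER is taken to be [0-9]+, {N} is matched as a plain case-insensitive
"n" (CSS escapes are ignored), and it works on a string rather than on
tokens.

```python
import re

# Assumption: "S" in the grammar is CSS whitespace (space, tab, CR, LF, FF).
_WS = r"[ \t\r\n\f]*"

# Hypothetical transcription of the Selectors 3 "nth" production.
_NTH = re.compile(
    rf"{_WS}(?:"
    # ['-'|'+']? INTEGER? {N} [ S* ['-'|'+'] S* INTEGER ]?
    rf"(?P<a>[+-]?[0-9]*)n(?:{_WS}(?P<sign>[+-]){_WS}(?P<b>[0-9]+))?"
    # ['-'|'+']? INTEGER
    rf"|(?P<int>[+-]?[0-9]+)"
    # {O}{D}{D} | {E}{V}{E}{N}
    rf"|(?P<kw>odd|even)"
    rf"){_WS}",
    re.IGNORECASE,
)


def parse_nth(text):
    """Return (a, b) for a valid an+b string, or None if it is invalid."""
    m = _NTH.fullmatch(text)  # "is", not "contains": the whole input must match
    if m is None:
        return None
    if m.group("kw"):
        return (2, 1) if m.group("kw").lower() == "odd" else (2, 0)
    if m.group("int") is not None:
        return (0, int(m.group("int")))
    a_text = m.group("a")
    if a_text in ("", "+"):
        a = 1          # bare "n" or "+n"
    elif a_text == "-":
        a = -1         # "-n"
    else:
        a = int(a_text)
    b = 0
    if m.group("b") is not None:
        b = int(m.group("sign") + m.group("b"))
    return (a, b)
```

With this sketch, "2n + 1" and " odd " are accepted, while "- n", "2 n",
and "Some oddities" are rejected, which matches the points raised above.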
Received on Friday, 1 February 2013 20:44:17 UTC