- From: Kang-Hao (Kenny) Lu <kennyluck@csail.mit.edu>
- Date: Fri, 11 May 2012 20:52:21 +0800
- To: WWW Style <www-style@w3.org>
- CC: Andrei Polushin <polushin@gmail.com>
(12/05/10 17:37), L. David Baron wrote:
> I think if we want to change this, we should just change the
> dimension token throughout CSS

I didn't like this at first because it doesn't solve Nth parsing along the way, but after thinking about it for a while, I think it is quite nice because:

- An argument[1] for requiring whitespace around '+' and '-' was that we might want to add keywords to calc() in the future. This direction avoids that problem entirely.
- We shouldn't cater to edge cases in Nth parsing.

I was wondering whether this would break content on the Web, because something like 'background-position: 10em-2em;' would start to work after this proposal, so I ran a grep against dotnetdotcom's web200904[2] and found that none of the pages has anything like that. (Though admittedly, this collection consists mostly of HTML files, and it would have been better if we had a public .css collection.)

I'll list several options in this direction:

A1. Part of Andrei Polushin's proposal[3]

    nmstart   [_a-z]|{nonascii}|{escape}
    nmchar    [_a-z0-9-]|{nonascii}|{escape}
    alpha     [a-z]|{nonascii}|{escape}
    alnum     [_a-z0-9]|{nonascii}|{escape}
    restrict  {alpha}{alnum}*
    simple    {nmstart}{nmchar}*
    prefixed  [_-]{restrict}[-]{simple}
    unit      {restrict}|{prefixed}
    %%
    {num}{unit}   {return DIMENSION;}

In other words, a dash is allowed in the unit only when there's *another* dash (for the vendor prefix). I skipped the part for IDENT.

A2. Simply change to

    {num}{alnum}*   {return DIMENSION;}

Of these two, I would say I like A2 better.

B. Or we can be even more aggressive

    {num}{alpha}*   {return DIMENSION;}

This is obviously more dangerous in terms of breaking the Web. About 30 out of the 600k pages in web200904 have declarations like "padding: 0px0px;". I checked almost all of them. Most are no longer accessible or have been fixed (well, this collection was made three years ago). For some of them, it makes no difference whether the declaration is successfully parsed or not.
There is only one declaration that would be affected, but it is a 'MARGIN:1px3px' on a standalone element, and there is no visual difference whether it's parsed or not.

The advantage of B is that CSS minifiers would benefit significantly. It is also more consistent, because today you can write 'padding: 10%10%' but not 'padding: 10px10px'.

(12/05/10 17:37), L. David Baron wrote:
> (rather than making the tokenizer context-sensitive, which is a huge
> pain),

(12/05/10 23:51), Tab Atkins Jr. wrote:
> On Thu, May 10, 2012 at 11:27 AM, Kang-Hao (Kenny) Lu
> <kennyluck@csail.mit.edu> wrote:
>> 3. You cannot do tokenization and parsing as two passes as parsing
>> calc() changes the state of the tokenizer.
>> For 3., it isn't a concern for Gecko as far as I can tell, but I
>> don't know about other browsers.
>
> I'm not sure what all browsers do, but at the very least it makes it
> harder to spec. ^_^ I suspect that browsers probably generally use
> integrated tokenizer/parsers, but simpler implementations that aren't
> as perf-sensitive might use separate ones, as I think they're easier.

For what it's worth, WebKit already does mode switching[4] for Nth parsing (though admittedly it also has a bunch of craziness, and I wouldn't be surprised if it gets rewritten again* eventually), and I don't think it would be too difficult for Gecko either.

But I agree that changing DIMENSION would be better. My only worry is that people might say we shouldn't touch the core grammar because it's been there for 10+ years.

[1] http://lists.w3.org/Archives/Public/www-style/2009Apr/0005
[2] http://dotnetdotcom.org/#inde (This file was also used in research around the quirks mode document.)
[3] http://lists.w3.org/Archives/Public/www-style/2008Mar/0179
[4] http://trac.webkit.org/browser/trunk/Source/WebCore/css/CSSParser.h?rev=116752#L390

* It was rewritten from a machine-generated lexer to a hand-coded one 3 months ago, it seems.

Cheers,
Kenny
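
For illustration only, the DIMENSION rules discussed above (A1, A2, B) can be approximated with regular expressions to see where each one would end the token. This is a rough sketch, not a real CSS tokenizer. Assumptions that are not in the message: {nonascii} and {escape} are left out (ASCII only), {num} is taken as [0-9]+ or [0-9]*.[0-9]+, the "CSS2.1" entry is the existing {num}{ident} rule with ident = [-]?{nmstart}{nmchar}*, and the names RULES and leading_dimension are hypothetical.

```python
import re

# Simplified macros (assumption: ASCII only, no {nonascii} or {escape}).
NUM = r"(?:[0-9]*\.[0-9]+|[0-9]+)"
NMSTART = r"[_a-z]"
NMCHAR = r"[_a-z0-9-]"
ALPHA = r"[a-z]"
ALNUM = r"[_a-z0-9]"
RESTRICT = ALPHA + ALNUM + "*"
SIMPLE = NMSTART + NMCHAR + "*"
PREFIXED = "[_-]" + RESTRICT + "-" + SIMPLE

# One pattern per candidate DIMENSION rule.
RULES = {
    "CSS2.1": NUM + "-?" + NMSTART + NMCHAR + "*",       # {num}{ident}
    "A1": NUM + "(?:" + PREFIXED + "|" + RESTRICT + ")",  # {num}{unit}
    "A2": NUM + ALNUM + "*",                              # {num}{alnum}*
    "B": NUM + ALPHA + "*",                               # {num}{alpha}*
}


def leading_dimension(rule, text):
    """Return the text the given rule would consume at the start of `text`.

    Returns None when the rule does not match at all. For A2 and B the
    match may be a bare number, which a real lexer would report as NUMBER.
    """
    match = re.match(RULES[rule], text, re.IGNORECASE)
    return match.group(0) if match else None


if __name__ == "__main__":
    for sample in ("10em-2em", "1px3px", "10-moz-foo"):
        results = {name: leading_dimension(name, sample) for name in RULES}
        print(sample, results)
```

Under these assumptions, the existing rule consumes all of '10em-2em' as one token, while A1, A2 and B stop after '10em'. For '1px3px', only B stops after '1px'.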
Received on Friday, 11 May 2012 12:52:53 UTC