- From: Bert Bos <bert@w3.org>
- Date: Mon, 10 Aug 2009 18:03:04 +0200
- To: www-style@w3.org
- Cc: Yves Lafon <ylafon@w3.org>, Andrey Mikhalev <amikhal@abisoft.spb.ru>
- Message-Id: <200908101803.05437.bert@w3.org>
I've tried to improve the grammar of appendix G and I think I now have a
version that defines the same language as before, but is LL(1). It has
no ambiguities and no nullable non-terminals (except for the start
symbol: "stylesheet" can of course still be empty).

Compared to the last edits in response to Yves's suggestions, I've only
further changed "selector" and "combinator".

I'd like the grammar of appendix G to be as useful as possible, even
though I know not many programs can use it. (Maybe it serves a
validator, but all other programs will have to accept the
forward-compatible grammar instead.)

I'd like to ask especially Andrey and Yves to take a look...

I tested the grammar with an LL(1) parser generator and, after carefully
expanding the rules, with Yacc. And it seems to work: neither complains
about ambiguities and the resulting programs accept my various tests.

Bert
-- 
  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/people/bos                               W3C/ERCIM
  bert@w3.org                             2004 Rt des Lucioles / BP 93
  +33 (0)4 92 38 76 92            06902 Sophia Antipolis Cedex, France

stylesheet
  : [ CHARSET_SYM STRING ';' ]?
    [S|CDO|CDC]* [ import [ CDO S* | CDC S* ]* ]*
    [ [ ruleset | media | page ] [ CDO S* | CDC S* ]* ]*
  ;
import
  : IMPORT_SYM S*
    [STRING|URI] S* [ medium [ ',' S* medium]* ]? ';' S*
  ;
media
  : MEDIA_SYM S* medium [ ',' S* medium ]* '{' S* ruleset* '}' S*
  ;
medium
  : IDENT S*
  ;
page
  : PAGE_SYM S* pseudo_page?
    '{' S* declaration? [ ';' S* declaration? ]* '}' S*
  ;
pseudo_page
  : ':' IDENT S*
  ;
operator
  : '/' S* | ',' S*
  ;
combinator
  : '+' S*
  | '>' S*
  ;
unary_operator
  : '-' | '+'
  ;
property
  : IDENT S*
  ;
ruleset
  : selector [ ',' S* selector ]*
    '{' S* declaration? [ ';' S* declaration? ]* '}' S*
  ;
selector
  : simple_selector [ combinator selector | S+ [ combinator? selector ]? ]?
  ;
simple_selector
  : element_name [ HASH | class | attrib | pseudo ]*
  | [ HASH | class | attrib | pseudo ]+
  ;
class
  : '.' IDENT
  ;
element_name
  : IDENT | '*'
  ;
attrib
  : '[' S* IDENT S* [ [ '=' | INCLUDES | DASHMATCH ] S*
    [ IDENT | STRING ] S* ]? ']'
  ;
pseudo
  : ':' [ IDENT | FUNCTION S* [IDENT S*]? ')' ]
  ;
declaration
  : property ':' S* expr prio?
  ;
prio
  : IMPORTANT_SYM S*
  ;
expr
  : term [ operator? term ]*
  ;
term
  : unary_operator?
    [ NUMBER S* | PERCENTAGE S* | LENGTH S* | EMS S* | EXS S* | ANGLE S* |
      TIME S* | FREQ S* ]
  | STRING S* | IDENT S* | URI S* | hexcolor | function
  ;
function
  : FUNCTION S* expr ')' S*
  ;
/*
 * There is a constraint on the color that it must
 * have either 3 or 6 hex-digits (i.e., [0-9a-fA-F])
 * after the "#"; e.g., "#000" is OK, but "#abcd" is not.
 */
hexcolor
  : HASH S*
  ;

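To illustrate why the revised "selector" rule can be parsed with one token of lookahead, here is a minimal recursive-descent sketch in Python. It is not the parser generator or the Yacc input used for the tests mentioned above. The names `tokenize`, `Parser` and `accepts` are made up for this example, the tokenizer only handles ASCII identifiers without escapes, and "attrib" and the functional form of "pseudo" are left out to keep it short.

```python
import re

# Simplified token patterns (assumption: ASCII only, no escapes, no comments).
TOKEN_RE = re.compile(r"""
    (?P<S>[ \t\r\n\f]+)
  | (?P<HASH>\#[_a-zA-Z0-9-]+)
  | (?P<IDENT>-?[_a-zA-Z][_a-zA-Z0-9-]*)
  | (?P<CHAR>.)
""", re.VERBOSE)

def tokenize(text):
    """Return a list of token kinds; single characters stand for themselves."""
    tokens = []
    for m in TOKEN_RE.finditer(text):
        kind = m.lastgroup
        tokens.append(m.group() if kind == "CHAR" else kind)
    tokens.append("EOF")
    return tokens

SUFFIX_START = {"HASH", ".", ":"}               # HASH | class | pseudo
SIMPLE_START = {"IDENT", "*"} | SUFFIX_START    # FIRST(simple_selector)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def expect(self, kind):
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, got {self.peek()}")
        self.pos += 1

    def simple_selector(self):
        # element_name [...]*  |  [...]+
        if self.peek() in ("IDENT", "*"):
            self.pos += 1
        elif self.peek() not in SUFFIX_START:
            raise SyntaxError(f"unexpected {self.peek()}")
        while self.peek() in SUFFIX_START:
            if self.peek() == "HASH":
                self.pos += 1
            else:                       # '.' IDENT  or  ':' IDENT
                self.pos += 1
                self.expect("IDENT")

    def combinator(self):
        # '+' S*  |  '>' S*
        self.pos += 1
        while self.peek() == "S":
            self.pos += 1

    def selector(self):
        # simple_selector [ combinator selector | S+ [ combinator? selector ]? ]?
        self.simple_selector()
        if self.peek() in ("+", ">"):
            self.combinator()
            self.selector()
        elif self.peek() == "S":
            while self.peek() == "S":
                self.pos += 1
            if self.peek() in ("+", ">"):
                self.combinator()
                self.selector()
            elif self.peek() in SIMPLE_START:
                self.selector()
            # otherwise the whitespace was trailing; the caller decides what follows

def accepts(text):
    """True if the whole text is a single selector under this simplified sketch."""
    parser = Parser(tokenize(text))
    try:
        parser.selector()
        parser.expect("EOF")
    except SyntaxError:
        return False
    return True
```

Every decision in `selector` looks only at the next token: a combinator, whitespace, the start of another simple selector, or something that must belong to the caller (in the full grammar that would be ',' or '{'). For example, `accepts("div > p.note")` and `accepts("a b #x")` succeed, while `accepts("a > ")` fails because a combinator must be followed by a selector.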
%option case-insensitive

h               [0-9a-f]
nonascii        [\200-\377]
unicode         \\{h}{1,6}(\r\n|[ \t\r\n\f])?
escape          {unicode}|\\[^\r\n\f0-9a-f]
nmstart         [_a-z]|{nonascii}|{escape}
nmchar          [_a-z0-9-]|{nonascii}|{escape}
string1         \"([^\n\r\f\\"]|\\{nl}|{escape})*\"
string2         \'([^\n\r\f\\']|\\{nl}|{escape})*\'
invalid1        \"([^\n\r\f\\"]|\\{nl}|{escape})*
invalid2        \'([^\n\r\f\\']|\\{nl}|{escape})*

comment         \/\*[^*]*\*+([^/*][^*]*\*+)*\/
ident           -?{nmstart}{nmchar}*
name            {nmchar}+
num             [0-9]+|[0-9]*"."[0-9]+
string          {string1}|{string2}
invalid         {invalid1}|{invalid2}
url             ([!#$%&*-~]|{nonascii}|{escape})*
s               [ \t\r\n\f]+
w               {s}?
nl              \n|\r\n|\r|\f

A               a|\\0{0,4}(41|61)(\r\n|[ \t\r\n\f])?
C               c|\\0{0,4}(43|63)(\r\n|[ \t\r\n\f])?
D               d|\\0{0,4}(44|64)(\r\n|[ \t\r\n\f])?
E               e|\\0{0,4}(45|65)(\r\n|[ \t\r\n\f])?
G               g|\\0{0,4}(47|67)(\r\n|[ \t\r\n\f])?|\\g
H               h|\\0{0,4}(48|68)(\r\n|[ \t\r\n\f])?|\\h
I               i|\\0{0,4}(49|69)(\r\n|[ \t\r\n\f])?|\\i
K               k|\\0{0,4}(4b|6b)(\r\n|[ \t\r\n\f])?|\\k
L               l|\\0{0,4}(4c|6c)(\r\n|[ \t\r\n\f])?|\\l
M               m|\\0{0,4}(4d|6d)(\r\n|[ \t\r\n\f])?|\\m
N               n|\\0{0,4}(4e|6e)(\r\n|[ \t\r\n\f])?|\\n
O               o|\\0{0,4}(4f|6f)(\r\n|[ \t\r\n\f])?|\\o
P               p|\\0{0,4}(50|70)(\r\n|[ \t\r\n\f])?|\\p
R               r|\\0{0,4}(52|72)(\r\n|[ \t\r\n\f])?|\\r
S               s|\\0{0,4}(53|73)(\r\n|[ \t\r\n\f])?|\\s
T               t|\\0{0,4}(54|74)(\r\n|[ \t\r\n\f])?|\\t
U               u|\\0{0,4}(55|75)(\r\n|[ \t\r\n\f])?|\\u
X               x|\\0{0,4}(58|78)(\r\n|[ \t\r\n\f])?|\\x
Z               z|\\0{0,4}(5a|7a)(\r\n|[ \t\r\n\f])?|\\z

%%

{s}                     {return S;}

\/\*[^*]*\*+([^/*][^*]*\*+)*\/          /* ignore comments */

"<!--"                  {return CDO;}
"-->"                   {return CDC;}
"~="                    {return INCLUDES;}
"|="                    {return DASHMATCH;}

{string}                {return STRING;}
{invalid}               {return INVALID; /* unclosed string */}

{ident}                 {return IDENT;}

"#"{name}               {return HASH;}

@{I}{M}{P}{O}{R}{T}     {return IMPORT_SYM;}
@{P}{A}{G}{E}           {return PAGE_SYM;}
@{M}{E}{D}{I}{A}        {return MEDIA_SYM;}
"@charset "             {return CHARSET_SYM;}

"!"({w}|{comment})*{I}{M}{P}{O}{R}{T}{A}{N}{T}  {return IMPORTANT_SYM;}

{num}{E}{M}             {return EMS;}
{num}{E}{X}             {return EXS;}
{num}{P}{X}             {return LENGTH;}
{num}{C}{M}             {return LENGTH;}
{num}{M}{M}             {return LENGTH;}
{num}{I}{N}             {return LENGTH;}
{num}{P}{T}             {return LENGTH;}
{num}{P}{C}             {return LENGTH;}
{num}{D}{E}{G}          {return ANGLE;}
{num}{R}{A}{D}          {return ANGLE;}
{num}{G}{R}{A}{D}       {return ANGLE;}
{num}{M}{S}             {return TIME;}
{num}{S}                {return TIME;}
{num}{H}{Z}             {return FREQ;}
{num}{K}{H}{Z}          {return FREQ;}
{num}{ident}            {return DIMENSION;}

{num}%                  {return PERCENTAGE;}
{num}                   {return NUMBER;}

{U}{R}{L}"("{w}{string}{w}")"   {return URI;}
{U}{R}{L}"("{w}{url}{w}")"      {return URI;}

{ident}"("              {return FUNCTION;}

.                       {return *yytext;}
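
As a rough illustration of how the numeric rules in the scanner sort a number with a suffix into token types, here is a small Python sketch. It is an approximation and not a translation of the Flex source: the function name `classify_numeric` and the table `UNIT_TOKENS` are invented for this example, and the escape forms covered by macros such as {P} and {X} (for example "\70") are ignored, so only plain ASCII unit names are recognized.

```python
import re

# Unit suffix -> token type, following the {num}{E}{M} ... {num}{K}{H}{Z} rules.
UNIT_TOKENS = {
    "em": "EMS", "ex": "EXS",
    "px": "LENGTH", "cm": "LENGTH", "mm": "LENGTH",
    "in": "LENGTH", "pt": "LENGTH", "pc": "LENGTH",
    "deg": "ANGLE", "rad": "ANGLE", "grad": "ANGLE",
    "ms": "TIME", "s": "TIME",
    "hz": "FREQ", "khz": "FREQ",
}

# Simplified forms of the "num" and "ident" macros (assumption: no escapes,
# no non-ASCII characters).
NUM = r"(?:[0-9]*\.[0-9]+|[0-9]+)"
IDENT = r"-?[_a-z][_a-z0-9-]*"
NUMERIC_RE = re.compile(rf"({NUM})(?:({IDENT})|(%))?", re.IGNORECASE)

def classify_numeric(text):
    """Return the token type for a numeric token, or None if it is not one."""
    m = NUMERIC_RE.fullmatch(text)
    if m is None:
        return None
    _, unit, percent = m.groups()
    if percent:
        return "PERCENTAGE"
    if unit is None:
        return "NUMBER"
    # Known units get their own token; any other identifier suffix falls
    # through to the catch-all {num}{ident} rule.
    return UNIT_TOKENS.get(unit.lower(), "DIMENSION")
```

With this sketch, `classify_numeric("1.5em")` gives "EMS", `classify_numeric("2KHZ")` gives "FREQ" (the scanner is case-insensitive), and `classify_numeric("3foo")` gives "DIMENSION". Note that "term" in the grammar has no alternative for DIMENSION, so a parser built from the grammar would reject such a token.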
Received on Monday, 10 August 2009 16:03:44 UTC