
Re: [CSS21] Grammar Errors

From: Bert Bos <bert@w3.org>
Date: Tue, 22 Mar 2011 16:48:22 +0100
To: Mark <markjord@gmail.com>, www-style@w3.org
Message-Id: <201103221648.22794.bert@w3.org>

Hello Mark,

The CSS WG decided to make no changes to the grammar. If you cannot
accept that decision, could you respond, this week if possible?

(For reference, this issue is recorded as issue 206: 
http://wiki.csswg.org/spec/css2.1#issue-206)


On Wednesday 02 March 2011 20:09:18 Bert Bos wrote:
> [Recorded as issue 206 http://wiki.csswg.org/spec/css2.1#issue-206]
> 
> On Saturday 28 August 2010 18:09:27 Mark wrote:
> > Hello,
> > I'm implementing a CSS parser, and I've noticed some errors in the
> > grammar that don't appear to be documented in the CSS 2.1
> > errata. They concern the hexcolor definitions.
> 
> Note that it is probably a better idea to ignore appendix G and
> instead implement the generic grammar from chapter 4. There will be
> no hexcolor to worry about then. And that way your parser will also
> parse level 3 features.
> 
> Depending on what you want to do with the parser, you will probably
> need individual routines to check each known property anyway,
> because both 'color: #777' and 'font: #777' are syntactically
> correct CSS, but the latter is not valid in level 2.
> 
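To illustrate the per-property check described above, here is a minimal
sketch in Python. Everything in it is an assumption made for illustration:
the names is_hex_color, VALIDATORS and is_valid_declaration are
hypothetical, the 'color' check covers only the hexadecimal form, and the
'font' entry is a stand-in that encodes just the one fact from this
thread (a hash is not a valid 'font' value in level 2).

```python
import string

HEX_DIGITS = set(string.hexdigits)  # 0-9, a-f, A-F


def is_hex_color(value: str) -> bool:
    """True if value is '#' followed by exactly 3 or 6 hex digits."""
    digits = value[1:]
    return (
        value.startswith("#")
        and len(digits) in (3, 6)
        and all(ch in HEX_DIGITS for ch in digits)
    )


# Hypothetical per-property table; a real parser needs one routine per
# known property and far more complete checks than these.
VALIDATORS = {
    "color": is_hex_color,  # keywords and rgb() are omitted in this sketch
    "font": lambda value: not value.startswith("#"),  # stand-in only
}


def is_valid_declaration(prop: str, value: str) -> bool:
    """Second pass: the declaration already parsed with the generic grammar."""
    check = VALIDATORS.get(prop.lower())
    # Unknown properties are treated as invalid here (an assumption).
    return check is not None and check(value.strip())
```

Both 'color: #777' and 'font: #777' would get through a generic parser;
only the second pass tells them apart.
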
> > In section 4.3.6 Colors
> > The format of an RGB value in hexadecimal notation is a '#'
> > immediately followed by either three or six hexadecimal characters.
> > 
> > 
> > In Appendix G. Grammar of CSS 2.1
> > 
> > /*
> >  * There is a constraint on the color that it must
> >  * have either 3 or 6 hex-digits (i.e., [0-9a-fA-F])
> >  * after the "#"; e.g., "#000" is OK, but "#abcd" is not.
> >  */
> > hexcolor
> >  : HASH S*
> >  ;
> > 
> > "#"{name}               {return HASH;}
> > name            {nmchar}+
> > nmchar          [_a-z0-9-]|{nonascii}|{escape}
> > 
> > 
> > Now there are quite a few errors in the Appendix.
> > 
> > 1. The grammar is case insensitive, so the comment shows a
> > redundant A-F.
> 
> Yes, but it's written for humans. It's redundant, but not wrong, and
> it's probably safer to be a bit redundant in this case. Another
> option could have been to put the example in uppercase.
> 
> > 2. nmchar is not defined as [0-9a-f]; it is much less
> > restrictive, allowing a whole host of non-hex characters to be
> > present.
> > 3. The definition for {name} appears to allow 1 or more hex digits
> > (not the 3 or 6 specified elsewhere).
> > 4. Similar to comment 3, the other groups {nonascii}, {escape} will
> > also have invalid lengths.
> > 
> > So in summary, the grammar for hexcolor needs to be completely
> > separated from the grammar for HASH as there is no sensible reuse
> > possible here.
> 
> It's a limitation of the chosen notation. Without adding context-
> dependency to the tokenizer, we cannot have at the same time a token
> for colors and a token for ID selectors. In the context of
> selectors, #fff is an ID selector, but in the context of certain
> declarations, it is a color.
> 
> Also, CSS reserves the possibility that some property in the future
> accepts #foo12 as a value (to refer to an ID in the document, e.g.).
> I can even imagine some weird property that accepts both color and
> hash: 'id-to-color-map: #foo12 blue, #foo13 #fff, #foo14 red'. Such
> a thing doesn't exist in level 2, of course, but in the future the
> context for tokenizing colors might be hard to define...
> 
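The point about context can be sketched in Python as follows. This is only
an illustration under stated assumptions: the HASH pattern is simplified
from the lexer rules quoted earlier ({escape} is left out and {nonascii}
is approximated as any non-ASCII character), and classify_hash with its
context names is hypothetical, not something from the specification.

```python
import re

# Simplified HASH token: "#" followed by one or more nmchar.
# Assumption: {escape} is omitted and {nonascii} is any non-ASCII character.
HASH = re.compile(r"#(?:[_a-z0-9-]|[^\x00-\x7f])+", re.IGNORECASE)

# The extra constraint that applies only when a color is expected.
HEX_COLOR = re.compile(r"#(?:[0-9a-f]{3}|[0-9a-f]{6})", re.IGNORECASE)


def classify_hash(token: str, context: str) -> str:
    """Interpret one HASH token according to where the parser found it."""
    if HASH.fullmatch(token) is None:
        return "not-a-hash"
    if context == "selector":
        return "id-selector"
    if context == "color-value":
        return "color" if HEX_COLOR.fullmatch(token) else "invalid-color"
    # Any other context: keep the generic token and let the caller decide.
    return "hash"
```

The tokenizer stays context-free and emits the same token for "#fff"
everywhere; the 3-or-6 digit constraint is applied later, by whichever
rule expects a color.
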
> > I can also point out that the grammar could be made more consistent
> > if the trailing whitespace on 'hexcolor' and 'function' were moved
> > into the term block as follows:
> > 
> > term
> >  : unary_operator?
> >    [ NUMBER S* | PERCENTAGE S* | LENGTH S* | EMS S* | EXS S* |
> >      ANGLE S* | TIME S* | FREQ S* ]
> >  | STRING S* | IDENT S* | URI S* | hexcolor S* | function S*
> > 
> > function
> >  : FUNCTION S* expr ')'
> > 
> > hexcolor
> >  : HASH
> 
> Consistency is in the eye of the beholder. :-) The rule that the
> grammar follows is that S tokens are, whenever possible, added after
> terminals rather than after non-terminals.
> 
> But, as I said, please consider implementing the generic grammar from
> chapter 4 instead, unless you have a very good reason to ignore level
> 3 style sheets.



Bert
-- 
  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/people/bos                               W3C/ERCIM
  bert@w3.org                             2004 Rt des Lucioles / BP 93
  +33 (0)4 92 38 76 92            06902 Sophia Antipolis Cedex, France
Received on Tuesday, 22 March 2011 15:48:56 GMT
