- From: Bert Bos <bert@w3.org>
- Date: Wed, 2 Mar 2011 20:09:18 +0100
- To: Mark <markjord@gmail.com>, www-style@w3.org
[Recorded as issue 206 http://wiki.csswg.org/spec/css2.1#issue-206]
On Saturday 28 August 2010 18:09:27 Mark wrote:
> Hello,
> I'm implementing a CSS parser, and I've noticed some errors in the
> grammar that don't appear to be documented in the CSS 2.1
> errata. They are to do with the hexcolor definitions.
Note that it is probably a better idea to ignore appendix G and instead
implement the generic grammar from chapter 4. There will be no hexcolor
to worry about then. And that way your parser will also parse level 3
features.
Depending on what you want to do with the parser, you will probably need
individual routines to check each known property anyway, because both
'color: #777' and 'font: #777' are syntactically correct CSS, but the
latter is not valid in level 2.
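
A minimal sketch of such per-property checks, in Python. The names `CHECKERS`, `color_accepts`, `font_accepts` and `is_valid_declaration` are hypothetical, and both checkers are deliberately incomplete: the one for 'color' ignores keywords and rgb(), and the one for 'font' only rules out hash values.

```python
import re

# "#" followed by exactly three or exactly six hex digits (section 4.3.6).
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})")


def is_hex_color(value: str) -> bool:
    """True if value is a '#' plus 3 or 6 hex digits, and nothing else."""
    return HEX_COLOR.fullmatch(value) is not None


def color_accepts(value: str) -> bool:
    # Simplification: colour keywords and rgb() are left out of this sketch.
    return is_hex_color(value)


def font_accepts(value: str) -> bool:
    # Simplification: this only rejects hash components; a real checker
    # would have to verify the whole 'font' shorthand syntax.
    return not any(part.startswith("#") for part in value.split())


# Hypothetical table mapping each known property to its checker.
CHECKERS = {"color": color_accepts, "font": font_accepts}


def is_valid_declaration(prop: str, value: str) -> bool:
    """Unknown properties are treated as invalid in this sketch."""
    checker = CHECKERS.get(prop)
    return checker is not None and checker(value)
```

With these assumptions, `is_valid_declaration("color", "#777")` is true and `is_valid_declaration("font", "#777")` is false, although the generic parser accepts both declarations.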
>
> In section 4.3.6 Colors
> The format of an RGB value in hexadecimal notation is a '#'
> immediately followed by either three or six hexadecimal characters.
>
>
> In Appendix G. Grammar of CSS 2.1
>
> /*
> * There is a constraint on the color that it must
> * have either 3 or 6 hex-digits (i.e., [0-9a-fA-F])
> * after the "#"; e.g., "#000" is OK, but "#abcd" is not.
> */
> hexcolor
>   : HASH S*
>   ;
>
> "#"{name} {return HASH;}
> name {nmchar}+
> nmchar [_a-z0-9-]|{nonascii}|{escape}
>
>
> Now there are quite a few errors in the Appendix.
>
> 1. The grammar is case insensitive, so the comment shows a redundant
> A-F.
Yes, but it's written for humans. It's redundant, but not wrong, and
it's probably safer to be a bit redundant in this case. Another option
could have been to put the example in uppercase.
> 2. nmchar is not defined as [0-9a-f], it's much less
> restrictive allowing a whole host of non-hex characters to be
> present.
> 3. the definition for {name} appears to allow 1 or more hex digits
> (not the 3 or 6 specified elsewhere)
> 4. similar to comment 3, the other groups {nonascii}, {escape} will
> also have invalid lengths
>
> So in summary, the grammar for hexcolor needs to be completely
> separated from the grammar for HASH as there is no sensible reuse
> possible here.
It's a limitation of the chosen notation. Without adding context-
dependency to the tokenizer, we cannot have at the same time a token for
colors and a token for ID selectors. In the context of selectors, #fff
is an ID selector, but in the context of certain declarations, it is a
color.
Also, CSS reserves the possibility that some future property will accept
#foo12 as a value (e.g., to refer to an ID in the document). I
can even imagine some weird property that accepts both color and hash:
'id-to-color-map: #foo12 blue, #foo13 #fff, #foo14 red'. Such a thing
doesn't exist in level 2, of course, but in the future the context for
tokenizing colors might be hard to define...
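
One way to picture this, as a rough Python sketch: the tokenizer produces a single HASH token and the parser decides what it means from where it occurs. The function name `classify_hash` and the context labels are hypothetical, the token pattern leaves out {nonascii} and {escape}, and the rule that an ID value must be an identifier is not checked.

```python
import re
import string

# Simplified HASH token: "#" followed by one or more name characters.
# Assumption: {nonascii} and {escape} are omitted for brevity.
HASH = re.compile(r"#[_a-zA-Z0-9-]+")


def classify_hash(text: str, context: str) -> str:
    """Interpret one HASH token; context is 'selector' or 'declaration'."""
    if HASH.fullmatch(text) is None:
        raise ValueError(f"not a HASH token: {text!r}")
    if context == "selector":
        # In a selector, every HASH is read as an ID selector.
        return "id-selector"
    digits = text[1:]
    if len(digits) in (3, 6) and all(c in string.hexdigits for c in digits):
        return "color"
    # Still a well-formed token; whether it is valid depends on the property.
    return "hash"
```

Here `classify_hash("#fff", "selector")` gives "id-selector" and `classify_hash("#fff", "declaration")` gives "color", while "#foo12" in a declaration stays a plain "hash" that some future property might accept.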
>
>
> I can also point out that the grammar could be made more consistent
> if the trailing whitespace on 'hexcolor' and 'function' was moved
> into the term block as follows:-
>
> term
>   : unary_operator?
>     [ NUMBER S* | PERCENTAGE S* | LENGTH S* | EMS S* | EXS S* | ANGLE S*
>     | TIME S* | FREQ S* ]
>   | STRING S* | IDENT S* | URI S* | hexcolor S* | function S*
>
> function
>   : FUNCTION S* expr ')'
>
> hexcolor
>   : HASH
Consistency is in the eye of the beholder. :-) The rule that the grammar
follows is that S tokens are, whenever possible, added after terminals
rather than after non-terminals.
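
In a hand-written parser that convention might look as follows. This is only a sketch in Python: tokens are assumed to be (kind, text) pairs, and the helper names `skip_s` and `terminal` are hypothetical.

```python
def skip_s(tokens, i):
    """Return the index of the first non-S token at or after i."""
    while i < len(tokens) and tokens[i][0] == "S":
        i += 1
    return i


def terminal(tokens, i, kind):
    """Match one terminal of `kind` at i, then absorb the S tokens after it."""
    if i >= len(tokens) or tokens[i][0] != kind:
        raise SyntaxError(f"expected {kind} at token {i}")
    return skip_s(tokens, i + 1)


def hexcolor(tokens, i):
    # hexcolor : HASH S*
    # The whitespace is consumed together with the terminal, so the rule
    # that uses hexcolor does not have to skip any S tokens itself.
    return terminal(tokens, i, "HASH")
```

For the token list `[("HASH", "#fff"), ("S", " "), ("IDENT", "red")]`, `hexcolor(tokens, 0)` returns 2, the index of the IDENT token.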
But, as I said, please consider implementing the generic grammar from
chapter 4 instead, unless you have a very good reason to ignore level 3
style sheets.
Bert
--
Bert Bos ( W 3 C ) http://www.w3.org/
http://www.w3.org/people/bos W3C/ERCIM
bert@w3.org 2004 Rt des Lucioles / BP 93
+33 (0)4 92 38 76 92 06902 Sophia Antipolis Cedex, France
Received on Wednesday, 2 March 2011 19:09:49 UTC