- From: Bert Bos <bert@w3.org>
- Date: Wed, 2 Mar 2011 20:09:18 +0100
- To: Mark <markjord@gmail.com>, www-style@w3.org
[Recorded as issue 206 http://wiki.csswg.org/spec/css2.1#issue-206]

On Saturday 28 August 2010 18:09:27 Mark wrote:
> Hello,
> I'm implementing a CSS parser, and I've noticed some errors in the
> grammar that don't appear to be documented in the CSS 2.1
> errata. They are to do with the hexcolor definitions.

Note that it is probably a better idea to ignore appendix G and instead
implement the generic grammar from chapter 4. There will be no hexcolor
to worry about then. And that way your parser will also parse level 3
features.

Depending on what you want to do with the parser, you will probably need
individual routines to check each known property anyway, because both
'color: #777' and 'font: #777' are syntactically correct CSS, but the
latter is not valid in level 2.

> In section 4.3.6 Colors
> The format of an RGB value in hexadecimal notation is a '#'
> immediately followed by either three or six hexadecimal characters.
>
> In Appendix G. Grammar of CSS 2.1
>
> /*
>  * There is a constraint on the color that it must
>  * have either 3 or 6 hex-digits (i.e., [0-9a-fA-F])
>  * after the "#"; e.g., "#000" is OK, but "#abcd" is not.
>  */
> hexcolor
>   : HASH S*
>   ;
>
> "#"{name}       {return HASH;}
> name            {nmchar}+
> nmchar          [_a-z0-9-]|{nonascii}|{escape}
>
> Now there are quite a few errors in the Appendix.
>
> 1. The grammar is case insensitive, so the comment shows a redundant
>    A-F.

Yes, but it's written for humans. It's redundant, but not wrong, and
it's probably safer to be a bit redundant in this case. Another option
could have been to put the example in uppercase.

> 2. nmchar is not defined as [0-9a-f], it's much less
>    restrictive allowing a whole host of non-hex characters to be
>    present.
> 3. the definition for {name} appears to allow 1 or more hex digits
>    (not the 3 or 6 specified elsewhere)
> 4. similar to comment 3, the other groups {nonascii}, {escape} will
>    also have invalid lengths
>
> So in summary, the grammar for hexcolor needs to be completely
> separated from the grammar for HASH as there is no sensible reuse
> possible here.

It's a limitation of the chosen notation. Without adding
context-dependency to the tokenizer, we cannot have at the same time a
token for colors and a token for ID selectors. In the context of
selectors, #fff is an ID selector, but in the context of certain
declarations, it is a color.

Also, CSS reserves the possibility that some property in the future
accepts #foo12 as a value (to refer to an ID in the document, e.g.). I
can even imagine some weird property that accepts both color and
hash: 'id-to-color-map: #foo12 blue, #foo13 #fff, #foo14 red'. Such a
thing doesn't exist in level 2, of course, but in the future the context
for tokenizing colors might be hard to define...

> I can also point out that the grammar could be made more consistent
> if the trailing whitespace on 'hexcolor' and 'function' was moved
> into the term block as follows:-
>
> term
>   : unary_operator?
>     [ NUMBER S* | PERCENTAGE S* | LENGTH S* | EMS S* | EXS S* | ANGLE S* |
>       TIME S* | FREQ S* ]
>   | STRING S* | IDENT S* | URI S* | hexcolor S* | function S*
>
> function
>   : FUNCTION S* expr ')'
>
> hexcolor
>   : HASH

Consistency is in the eye of the beholder. :-) The rule that the grammar
follows is that S tokens are, whenever possible added after terminals
rather than after non-terminals.

But, as I said, please consider implementing the generic grammar from
chapter 4 instead, unless you have a very good reason to ignore level 3
style sheets.

Bert
-- 
  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/people/bos                               W3C/ERCIM
  bert@w3.org                             2004 Rt des Lucioles / BP 93
  +33 (0)4 92 38 76 92            06902 Sophia Antipolis Cedex, France
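
The split discussed in the message, a context-free HASH token plus a separate per-property check for colors, might look roughly like the Python sketch below. The function names are hypothetical, and the HASH pattern is a simplification that covers only the ASCII part of nmchar; the {nonascii} and {escape} alternatives are left out.

```python
import re

# Simplified HASH token: "#" followed by one or more name characters.
# Assumption: ASCII only; {nonascii} and {escape} from nmchar are omitted.
HASH_RE = re.compile(r"#[_a-z0-9-]+", re.IGNORECASE)
HEX_RE = re.compile(r"[0-9a-f]+", re.IGNORECASE)


def is_hash_token(text):
    """Tokenizer-level check: is this a HASH token, in any context?"""
    return HASH_RE.fullmatch(text) is not None


def is_hex_color(text):
    """Property-level check: a HASH whose name is 3 or 6 hex digits."""
    if not is_hash_token(text):
        return False
    name = text[1:]  # drop the leading "#"
    return len(name) in (3, 6) and HEX_RE.fullmatch(name) is not None
```

With this split, "#abcd" and "#foo12" are both accepted as HASH tokens, so they remain usable as ID selectors, while only the routine that validates color values rejects them.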
Received on Wednesday, 2 March 2011 19:09:49 UTC