Re: Forget what I said about whitespace

> I think you misread the specification. As far as I can see, both
> browsers follow the CSS1 specification quite well..

"WHITESPACE and COMMENT tokens do not occur in the grammar (to keep it 
readable), but any number of these tokens may appear anywhere. The 
content of these tokens (the matched text) doesn't matter, but their 
presence or absence may change the interpretation of some part of the 
style sheet.  For example, in CSS2 the WHITESPACE is significant in 
selectors."

To me, that implies that WHITESPACE has NO significance inside 
declaration sets, so "red  green" would be treated like "redgreen".  
This would be consistent with what I assume is an effort in the 
standard to ensure that no property tokens can accidentally form 
other tokens when combined -- and the longest ones can always be 
checked first.
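
A minimal sketch of that reading, in which whitespace inside a value is simply skipped and keywords are tried longest first. The keyword list and the function name are hypothetical, chosen only for illustration; they are not taken from the draft.

```python
# Hypothetical keyword set for illustration; not taken from the CSS draft.
KEYWORDS = ["red", "green", "repeat", "url"]

def split_keywords(text, keywords=KEYWORDS):
    """Split text into known keywords, skipping whitespace, longest keyword first."""
    ordered = sorted(keywords, key=len, reverse=True)
    result = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # whitespace carries no meaning under this reading
            continue
        for kw in ordered:
            if text.startswith(kw, pos):
                result.append(kw)
                pos += len(kw)
                break
        else:
            raise ValueError(f"no keyword matches at position {pos}")
    return result
```

Under these assumptions, "red  green" and "redgreen" both split into the same two keywords.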

> So if you, as a CSS1 parser, encounter this, and you know who wrote
> it, please notify the author that there probably is an error in his
> style sheet, somewhere near "repeaturl"..

What you are implying with these statements is that the YACC-style 
grammar presented at the back of the draft is not there for 
convenience, but is actually the specification for the standard.  In 
that case it isn't necessarily consistent with the wording of the 
standard, and there were errors in the macros.
 
> What will you do if there is a keyword that is a prefix of another:
> say we add "greenish", will you parse that as "green" + "ish"? Or a
> more practical example: it is likely that we will have a
> pseudo-class ":first" in CSS2, will that cause your parser to forget
> about the pseudo-elements ":first-letter" and ":first-line"?

Standard parsing practice is to check the longest string first.  
Also, you are using selectors as an example, but they appear to 
have a very clear syntax.
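
For the prefix case, a rough sketch of checking the longest string first could look like the following. The candidate list mirrors the names quoted above; the function name is hypothetical.

```python
# Candidate names taken from the quoted example; the helper itself is hypothetical.
PSEUDO = [":first", ":first-letter", ":first-line"]

def longest_match(text, pos, candidates):
    """Return the longest candidate that starts at text[pos], or None."""
    matches = [c for c in candidates if text.startswith(c, pos)]
    return max(matches, key=len) if matches else None
```

With this, `longest_match("P:first-line", 1, PSEUDO)` picks ":first-line" and not the shorter ":first".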
 
> We are aware that the significance of whitespace in the selectors
> makes parsing slightly harder, but there is nothing special about
> spaces on the right hand side. Like in most other languages, a token
> is always as long as possible. Thus "repeaturl" is only one
> identifier, and not two (or three, or four, or...)

What you are indicating, again, is that WHITESPACE does have 
significance, in that it breaks tokens apart.  If this is the 
assumption made in YACC, then it should also be clearly stated in 
the CSS2 draft -- which currently says only that WHITESPACE has 
significance in selectors.
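
The other reading, where a token is always as long as possible and WHITESPACE therefore separates tokens, can be sketched as below. The token patterns are simplified assumptions for illustration and are not the draft's actual macros.

```python
import re

# Simplified token patterns, assumed for illustration only.
TOKEN_RE = re.compile(
    r"(?P<WHITESPACE>\s+)"
    r"|(?P<IDENT>[A-Za-z][A-Za-z0-9-]*)"
    r"|(?P<DELIM>.)"
)

def tokenize(text):
    """Return (kind, text) pairs; each match is as long as its pattern allows."""
    tokens = []
    for m in TOKEN_RE.finditer(text):
        kind = m.lastgroup
        if kind != "WHITESPACE":  # whitespace is dropped, but it has already split tokens
            tokens.append((kind, m.group()))
    return tokens
```

Here "repeaturl" comes out as a single IDENT, while "repeat url" yields two, which is exactly the sense in which whitespace is significant.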
 
> That does indeed mean that you may have to put in some spaces when
> you write out a style sheet. Butwhatismorenaturalthanthat?

The spaces may be convenient, but for the most part they are not 
necessary at all -- with the exception of the flawed font rule with 
respect to face names (which really undermines the whole point that 
WHITESPACE tokens have no significance).

It is also interesting that CDO and CDC aren't simply considered to 
be part of the whitespace.

What I'm really saying is that the specification for the syntax rests 
on too many assumptions about existing tokenizers/lexers such as 
yacc/flex.

It is no real effort on our part to change our parser accordingly 
(by the way, we use a single parser that is not split into a 
separate tokenizer/lexer and parser).

One other note (likely just a matter of clarity, but these small 
inconsistencies eventually add up to confusion): the grammar in 4.1.1 
doesn't use the previously stated TOKENS, and instead inserts single 
characters directly into the grammar.

One last point:
P { background: red; }
Since this incorrect declaration is so prevalent, I think we should 
extend the syntax to allow a semicolon after the last property:value 
pair; otherwise the correct interpretation of this would be not to do 
anything to the background...
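
A small sketch of the proposed leniency, assuming a naive split on ";" that ignores strings and url() values containing semicolons. The function name is hypothetical.

```python
def parse_declarations(block):
    """Parse 'prop: value; prop: value;' into a list of (property, value) pairs."""
    declarations = []
    for part in block.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate an empty declaration, e.g. after a trailing semicolon
        prop, sep, value = part.partition(":")
        if not sep:
            raise ValueError(f"missing ':' in declaration {part!r}")
        declarations.append((prop.strip(), value.strip()))
    return declarations
```

With this, " background: red; " and " background: red " give the same result, so the trailing semicolon no longer invalidates the declaration.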
__
| Mortar: Advanced Web Development <http://mortar.bigpic.com/>
| Neil St.Laurent                  <mailto:stlaurent@bigpic.com>
| Big Picture Multimedia

Received on Wednesday, 17 December 1997 11:35:13 UTC