Re: CSS3 Syntax and Grammar considerations

On Wed, 19 Sep 2001, Bjoern Hoehrmann wrote:
> Notation:
>
> The formal CSS grammar should be written in numbered production rules
> using the EBNF notation as defined in section 6 of XML 1.0. This
> notation is way easier to read and having numbered production rules
> makes it easier to reference them. The Syntax Module should be written
> top-down defining and explaining the single tokens. The grammar
> definitions would then look like e.g.

I was thinking about doing something like this, and, with it, combining
the two grammars (Chapter 4 and Appendix D).  However, I haven't had
time to work on it yet and I'd rather see a first public working draft
with what I have so far, without it.

One of the concerns I have about using EBNF is that it makes the
distinction between tokenization and parsing a bit less clear.  That
distinction is necessary to ensure that the grammar is LL(1) (e.g., so
that one can distinguish [att|="foo"] and [ns|att|="foo"] on the basis
of |= being a token distinct from | and =).  Perhaps that's not a real
problem, though -- "'|='" could indicate a |= token, and we would have a
set of token-level EBNF productions in a separate part of the grammar.
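
To illustrate the point about |= being its own token, here is a minimal
Python sketch.  It is not taken from any draft; the token names
(DASHMATCH, PIPE, etc.) and the function names are made up for the
example, and the token set covers only what an attribute selector
needs.  The tokenizer tries |= before | and =, so the parser can decide
what to do with one token of lookahead.

```python
import re

# Hypothetical token set for illustration only. "|=" and "~=" are listed
# before "|" and "=" so the two-character operators win.
TOKEN_RE = re.compile(r'''
    (?P<DASHMATCH>\|=)
  | (?P<INCLUDES>~=)
  | (?P<PIPE>\|)
  | (?P<EQUALS>=)
  | (?P<LBRACKET>\[)
  | (?P<RBRACKET>\])
  | (?P<STRING>"[^"]*")
  | (?P<IDENT>[A-Za-z_][A-Za-z0-9_-]*)
''', re.VERBOSE)


def tokenize(text):
    """Split an attribute selector into (kind, text) pairs."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("unexpected character at offset %d" % pos)
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def parse_attribute_selector(tokens):
    """Parse with a single token of lookahead.

    Returns (namespace, name, operator, value).
    """
    toks = list(tokens) + [("EOF", "")]
    i = 0

    def expect(kind):
        nonlocal i
        if toks[i][0] != kind:
            raise ValueError("expected %s, got %s" % (kind, toks[i][0]))
        value = toks[i][1]
        i += 1
        return value

    expect("LBRACKET")
    namespace = None
    name = expect("IDENT")
    # A lone PIPE token can only be the namespace separator; a DASHMATCH
    # token can only be the operator. No further lookahead is needed.
    if toks[i][0] == "PIPE":
        i += 1
        namespace, name = name, expect("IDENT")
    operator = value = None
    if toks[i][0] in ("EQUALS", "DASHMATCH", "INCLUDES"):
        operator = toks[i][1]
        i += 1
        if toks[i][0] not in ("STRING", "IDENT"):
            raise ValueError("expected a value after %s" % operator)
        value = toks[i][1].strip('"')
        i += 1
    expect("RBRACKET")
    return (namespace, name, operator, value)


print(parse_attribute_selector(tokenize('[att|="foo"]')))
print(parse_attribute_selector(tokenize('[ns|att|="foo"]')))
```

If | and = were separate tokens instead, the parser would need to look
past the | to see whether an = or an identifier follows before it could
tell a namespace prefix from the |= operator.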

> preceding all rule sets, ...) but I suggest to provide a stricter
> grammar then CSS Level 2 defines. I think CSS Level 2 is way to use, it
> creates unnecessary complexity by allowing way too many constructs.

Why do you think the complexity CSS2 allows for future extensions is too
much?  It's not the hard part of implementing a parser, and we've
already run into cases where we've wanted extensions that don't fit with
the forward-compatible parsing rules.

> The BOM:
>
>    CSS Level 2 lacks of any reference whether the Byte Order Mark is
> allowed or required for UCS-4, UTF-16 and UTF-8. I recommend to use the
> same rules as in XML 1.0 here.

I had already added comments allowing a BOM in the draft I'm working on.
I'll need to look into the XML rules about requiring it, and what the
character model draft says.  A BOM doesn't make sense for UTF-8 though.
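
For reference, a BOM check at the start of a style sheet could look
roughly like the Python sketch below.  This is only an illustration
under my own assumptions, not the XML 1.0 rules and not text from any
CSS draft; the function name sniff_bom is hypothetical.  The byte
sequences come from Python's standard codecs module, and UCS-4 is
treated here as UTF-32.

```python
import codecs

# Order matters: the UTF-32 little-endian BOM (FF FE 00 00) begins with
# the UTF-16 little-endian BOM (FF FE), so the longer one is tested first.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def sniff_bom(data):
    """Return (encoding, bom_length), or (None, 0) if no BOM is present."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return None, 0


sheet = codecs.BOM_UTF16_BE + "p { color: red }".encode("utf-16-be")
encoding, skip = sniff_bom(sheet)
print(encoding, sheet[skip:].decode(encoding))
```

One known limitation of this approach: a UTF-16 little-endian sheet
whose first character after the BOM is U+0000 would be mistaken for
UTF-32, which is one reason the real rules need more care than this.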

> Syntax delegation:
>
>    There are two standalone documents in the CSS3 module system that
> define syntax and grammar, the W3C selectors module and the upcoming
> module discussed here. The CSS3 Syntax module should delegate the formal
> grammar definition of W3C selectors to this module, thus it depends on
> the Selectors module. The rational to do so is the intent for the name
> "W3C Selectors" in favour to "CSS Level 3 Selectors", it may
> independently evolve and be used. I think I've been told that the Syntax
> module should define all at-rules. I'm not happy with this decision and
> if this won't happen, all modules defining at-rules have to define the
> syntax of those rules (as the media queries module AFAIR) using the same
> rules as the Syntax module. The syntax module would then probably depend
> on those modules (a contra to my position). The rational is to define
> syntax in place instead of splitting syntax and definition.

It's hard for me to tell which side you're advocating here.  However, I
really don't find any of the possible solutions very elegant.  I suspect
the end result will be that Selectors will be the exception rather than
the rule, since the other modules aren't intended to stand alone as much
as Selectors is, although I'm not really sure how much of an exception.

-David

-- 
L. David Baron        <URL: http://www.people.fas.harvard.edu/~dbaron/ >
Mozilla Contributor                      <URL: http://www.mozilla.org/ >
Invited Expert, W3C CSS WG          <URL: http://www.w3.org/Style/CSS/ >

Received on Wednesday, 19 September 2001 00:04:19 UTC