CSS3 Syntax and Grammar considerations

Hi,

   There is currently no public draft for the Syntax and Grammar
definition of CSS Level 3, but I do have some comments.
http://www.w3.org/Style/CSS/current-work lists David as supposed editor,
thus the CC:.

Notation:

   CSS Level 2 uses a lot of different notations: LL(1), YACC, Flex,
Lex and a proprietary notation for valid property values are all
mentioned or used. To ease human consumption of the specifications, I
strongly encourage the WG to reduce this to two different notations. The
property value notation is fine and authors who read former CSS
specifications are probably used to it. The formal specification of this
notation should go IMO into the Introduction module.

The formal CSS grammar should be written in numbered production rules
using the EBNF notation as defined in section 6 of XML 1.0. This
notation is way easier to read and having numbered production rules
makes it easier to reference them. The Syntax Module should be written
top-down defining and explaining the single tokens. The grammar
definitions would then look like e.g.

    [x] stylesheet ::= prolog (statement | misc)*
    [x] prolog     ::= charset? (at-rule | import-rule | misc)*

    [x] charset    ::= '@charset' ('"' EncName '"' | "'" EncName "'")
    [x] EncName    ::= [A-Za-z] ([A-Za-z0-9._] | '-')*

    [x] misc       ::= S | CDO | CDC | comment
    ...

or whatever, interspersed by explanatory text. Using this style, we end
up with a definitive CSS Level 3 grammar. To ensure forward-compatible
parsing, the module should also provide a generic grammar without the CSS
Level 2/3 restrictions (@charset must be the first token, @import only
preceding all rule sets, ...), but I suggest providing a stricter
grammar than CSS Level 2 defines. I think CSS Level 2 is way too loose; it
creates unnecessary complexity by allowing way too many constructs.
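
To show how directly such numbered productions translate into code, here
is a minimal Python sketch of the charset and EncName sample productions
above. The names ENC_NAME, CHARSET and parse_charset are made up for this
illustration, and the optional whitespace between '@charset' and the
quoted name is my assumption (the sample productions leave it out).

```python
import re

# EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
ENC_NAME = r"[A-Za-z][A-Za-z0-9._-]*"

# charset ::= '@charset' ('"' EncName '"' | "'" EncName "'")
# Assumption: optional whitespace is allowed after the keyword.
CHARSET = re.compile(
    r'@charset[ \t\r\n]*(?:"(' + ENC_NAME + r')"' + r"|'(" + ENC_NAME + r")')"
)

def parse_charset(text):
    """Return the EncName if text starts with a charset rule, else None."""
    m = CHARSET.match(text)
    if m is None:
        return None
    # Exactly one of the two groups matched, depending on the quote style.
    return m.group(1) or m.group(2)
```

Opening and closing quotes must be of the same kind, exactly as the two
alternatives in the production require.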

The formal grammar is a very serious issue and easy consumption is
important. The CSS Level 2 grammar really isn't easy to read; there are

  * errors in the specification itself
  * mis-behaving implementations
  * editors who propose CSS Level 3 features
    that are not compatible with that grammar :-)
  * authors using invalid syntax
  * ...

I'm convinced that the IMO very straightforward EBNF notation may solve
some of these problems (e.g. it uses hexadecimal numbers, quite common to
a broad range of people, while CSS uses octal numbers) and I consider
this EBNF notation as "the way to go" since a lot of recent Technical
Reports also use this notation.
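
As a small illustration of the octal versus hexadecimal point: the CSS
Level 2 scanner writes the non-ASCII range as [\200-\377], which in the
XML 1.0 notation would be a hexadecimal character range. The helper below
is a sketch only; its name is made up for this example.

```python
def octal_range_to_hex(lo, hi):
    """Rewrite an octal character range in XML 1.0 style hex notation."""
    # int(s, 8) parses the digits as an octal number.
    return "[#x{:X}-#x{:X}]".format(int(lo, 8), int(hi, 8))

print(octal_range_to_hex("200", "377"))  # prints [#x80-#xFF]
```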

Character Model:

   There is a working draft named "Character Model for the World Wide
Web 1.0" [CHARMOD] that outlines requirements for W3C Technical Reports.
The requirements given by this specification must be addressed. Some of
the requirements have to be addressed in other CSS3 modules, e.g. the
character encoding in URI references has to be addressed in the Values
and Units module; I've already commented on that.

RFC 2318 (registration of text/css) doesn't deal with character encoding
issues. The CSS3 module should not refer to any specific protocol but to
higher level protocols in general. Authors should be strongly encouraged
to specify the character encoding using higher level protocol
information. According to CA-2000-02, omitting such information is a
potential security risk. For text/css, transport via HTTP implies a
default of ISO-8859-1 (as for all documents of media type text), HTML4
forbids assuming any default encoding, and XML (through RFC 3023) suggests
assuming nothing but US-ASCII in this case. What about CSS? Is there any
default character encoding, or should there be rules for when the @charset
rule is mandatory?
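
To make the question concrete, here is a rough Python sketch of one
possible lookup order: higher level protocol information first, then the
@charset rule, then a default. The function names are made up, and the
default is deliberately left as a parameter because which default (if
any) CSS should use is exactly the open question.

```python
import re

def charset_from_content_type(header):
    """Extract the charset parameter from a Content-Type value, if any."""
    m = re.search(
        r';\s*charset\s*=\s*"?([A-Za-z][A-Za-z0-9._-]*)"?', header, re.I
    )
    return m.group(1) if m else None

def choose_encoding(content_type, charset_rule, default=None):
    """Pick an encoding name; None means nothing could be determined.

    content_type: Content-Type header value, or None if unavailable.
    charset_rule: encoding named by the style sheet's @charset rule, or None.
    default:      fallback encoding (assumption: left open on purpose).
    """
    from_protocol = charset_from_content_type(content_type or "")
    return from_protocol or charset_rule or default
```

With a stricter policy, a None result could be treated as an error, which
would amount to making the @charset rule mandatory whenever the protocol
says nothing.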

The BOM:

   CSS Level 2 says nothing about whether the Byte Order Mark is
allowed or required for UCS-4, UTF-16 and UTF-8. I recommend using the
same rules as in XML 1.0 here.
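
For illustration, a minimal Python sketch of BOM sniffing for these
encoding families, using the BOM constants from the standard codecs
module. The function name and the returned labels are my own choice, and
UCS-4 is treated as UTF-32 here, which is an assumption.

```python
import codecs

# The 4-byte BOMs are checked first: the UTF-32LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
]

def sniff_bom(data):
    """Return (encoding, rest) where rest is data without the BOM.

    encoding is None and rest is data unchanged if no BOM is present.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, data[len(bom):]
    return None, data
```

A UTF-16LE style sheet whose first character after the BOM is U+0000
would be misdetected as UTF-32LE by this ordering; a style sheet should
never start that way, but a real specification would have to say so.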

Syntax delegation:

   There are two standalone documents in the CSS3 module system that
define syntax and grammar: the W3C Selectors module and the upcoming
module discussed here. The CSS3 Syntax module should delegate the formal
grammar definition of W3C Selectors to the Selectors module, and thus
depend on it. The rationale is the intent behind choosing the name
"W3C Selectors" in favour of "CSS Level 3 Selectors": it may
evolve and be used independently. I think I've been told that the Syntax
module should define all at-rules. I'm not happy with this decision, and
if it doesn't happen, all modules defining at-rules have to define the
syntax of those rules (as the media queries module does, AFAIR) using the
same rules as the Syntax module. The Syntax module would then probably
depend on those modules (which counts against my position). The rationale
is to define syntax in place instead of splitting syntax and definition.

Authoring Tool requirements:

   CSS Level 2 only defines conformance for user agents and documents.
Maybe it would be a good idea to list requirements for authoring tools,
like early W3C normalization of the style sheet as defined in [CHARMOD].
Maybe informative references to the Authoring Tool Accessibility
Guidelines are a good idea...

SAC:

   I like the Simple API for CSS (SAC) and I think it would be nice to
have SAC3 as a separate CSS Level 3 module. However, since development
of SAC relies on interest, a good start would be to give the SAC Note an
informative reference.

DOM:

   I've been told that the development of further HTML DOM
specifications is up to the HTML WG rather than the DOM WG. Is this also
the case for DOM Level 3 Style + CSS? I've certain CSS-DOM woes but I'm
still writing them up... However, if SAC is mentioned, the DOM should
probably also be mentioned.

I'm sure I forgot some issues, but I'll bug you with them later :-)

[CHARMOD] - http://www.w3.org/TR/charmod/
-- 
Björn Höhrmann { mailto:bjoern@hoehrmann.de } http://www.bjoernsworld.de
am Badedeich 7 } Telefon: +49(0)4667/981028 { http://bjoern.hoehrmann.de
25899 Dagebüll { PGP Pub. KeyID: 0xA4357E78 } http://www.learn.to/quote/

Received on Tuesday, 18 September 2001 23:08:30 UTC