Re: [css3-syntax] First draft of parser section completed from Simon Sapin on 2012-06-09 (www-style@w3.org from June 2012)

From: Simon Sapin <simon.sapin@kozea.fr>
Date: Sat, 09 Jun 2012 11:25:19 +0200
To: www-style@w3.org
Message-ID: <4FD3167F.4050206@kozea.fr>
On 09/06/2012 06:01, Kang-Hao (Kenny) Lu wrote:
> 2. Instead of describing open-* (or close-*) token, it might be more
> readable just to use the character literally like '(', '{' and so on.
> Also, it might be more readable to fold, say, open-* in three lines in
> to a single line: '(', '{', '['.

Agreed. I never remember which is a bracket or a brace in English. 
Apparently the "correct" naming is not obvious at all:
https://en.wikipedia.org/wiki/%28#List_of_types

This is made worse by the current state of the draft. Maybe a 
search-and-replace went wrong?

> The output of the tokenization step is [...] open-bracket,
> close-bracket, open-paren, close-paren, open-bracket, close-bracket.

> 3.4.3. Data state
> U+005D RIGHT SQUARE BRACKET (])
>     Emit a close-bracket token. Remain in this state.
> U+007D RIGHT CURLY BRACKET (})
>     Emit a close-bracket token. Remain in this state.


This might be why the open-brace token appears twice in at-rule mode 
(switch to at-rule-block mode & Consume a block.)


EOF in blocks
=============

In "Consume a block":
> EOF token
>     Return nothing.

This does not match the "Unexpected end of style sheet" rule of CSS 2.1.
Instead, EOF should be treated the same as finding the ending token
(close and return the block normally, unless an error was previously
found).
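
To make the proposed EOF behavior concrete, here is a minimal sketch in
Python. It is not the draft's algorithm: the token model (delimiters as
one-character strings, everything else as opaque strings) and the names
CLOSING and consume_block are assumptions made for illustration, and the
"unless an error was previously found" case is left out.

```python
# Hypothetical token model: delimiters are one-character strings,
# every other token is an opaque string such as "color" or "red".
CLOSING = {"{": "}", "[": "]", "(": ")"}


def consume_block(tokens, opening):
    """Consume tokens up to the delimiter matching `opening`.

    `tokens` is positioned just after the opening delimiter. Running out
    of tokens (EOF) closes the block as if the ending delimiter had been
    found, instead of returning nothing.
    """
    tokens = iter(tokens)  # a no-op when `tokens` is already an iterator
    ending = CLOSING[opening]
    content = []
    for token in tokens:
        if token == ending:
            break
        if token in CLOSING:
            # Nested block: store it as an (opening, content) pair.
            content.append((token, consume_block(tokens, token)))
        else:
            # Stray closing delimiters are kept as plain tokens here;
            # real error handling is omitted from this sketch.
            content.append(token)
    return content
```

With this sketch, consume_block(iter(["color", ":", "red"]), "{") returns
the three tokens even though no '}' was ever seen.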


!important
==========

Where should !important be parsed? I think it should be in or near
"Declaration-value mode". But then what about descriptors? (see below)


Issue 4
=======

{} blocks can contain a variety of things:

* @font-face: descriptor declarations (name ':' value)
* style rules: property declarations (descriptors with an optional 
!important)
* @page: at-rules mixed with property declarations
* @media: any statement
* @region: style rules
* @keyframes, future modules: something else

I think that Syntax3 should not make any assumption on the content of 
at-rules and do something like this instead:

The output of Syntax3’s tree construction can contain "unparsed
at-rules", each with:

* an at-keyword
* a "head" (everything after the at-keyword and before ';' or '{')
* an optional "body".

The body is missing if the at-rule ends with ';'. It is the content of
the {} block otherwise. The head and body (if any) are both a mixed list
of tokens, functions and blocks. (There are no [](){} or function tokens:
these have been turned into nested function/block objects, or have
triggered a nesting error.)

There is no guarantee on the shape of the at-rule’s content except that
it is "well-formed" according to the 'at-rule' production of the core
grammar (i.e. no nesting error, no cdo or cdc token, ...).
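
As a rough illustration of such an "unparsed at-rule", here is a sketch
in Python. Everything in it is an assumption for the sake of example:
the names UnparsedAtRule, consume_until and consume_at_rule, and the
simplified token model where delimiters are one-character strings and
other tokens are opaque strings. Function tokens, whitespace and
nesting-error detection are left out.

```python
from dataclasses import dataclass
from typing import Optional

CLOSING = {"{": "}", "[": "]", "(": ")"}


@dataclass
class UnparsedAtRule:
    at_keyword: str        # e.g. "@media"
    head: list             # everything before ';' or '{'
    body: Optional[list]   # content of the {} block, None if ended by ';'


def consume_until(tokens, ending):
    """Collect tokens until `ending` (or EOF), nesting (), [] and {}."""
    content = []
    for token in tokens:
        if token == ending:
            break
        if token in CLOSING:
            content.append((token, consume_until(tokens, CLOSING[token])))
        else:
            content.append(token)
    return content


def consume_at_rule(at_keyword, tokens):
    """Split the tokens following an at-keyword into a head and a body."""
    tokens = iter(tokens)
    head = []
    for token in tokens:
        if token == ";":
            return UnparsedAtRule(at_keyword, head, None)
        if token == "{":
            body = consume_until(tokens, "}")
            return UnparsedAtRule(at_keyword, head, body)
        if token in CLOSING:
            # A () or [] block in the head becomes a nested object.
            head.append((token, consume_until(tokens, CLOSING[token])))
        else:
            head.append(token)
    # EOF before ';' or '{': treated here as an at-rule without a body.
    return UnparsedAtRule(at_keyword, head, None)
```

Nothing in this step looks at what the head or body mean; it only
separates them and resolves nesting.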

Only then, each CSS module (or wherever a particular at-rule is defined) 
would have its own parser for the head and body of an at-rule.

Syntax3 can have pre-defined procedures (like "parse a sequence of 
descriptors") that can be used by the at-rule parsers. These can then be 
very simple (@font-face) or more complex (@keyframes) as needed.
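
A sketch of how that split could look, again in Python and again with
made-up names (parse_descriptors, parse_font_face, AT_RULE_PARSERS,
parse_at_rule) and a simplified model where tokens are plain strings.
The shared procedure knows nothing about @font-face; the @font-face
parser is little more than a call to it.

```python
def parse_descriptors(body):
    """Shared procedure: split a token list on ';' into (name, value) pairs.

    Declarations that do not start with a name followed by ':' are dropped.
    """
    descriptors = []
    declaration = []
    for token in body + [";"]:  # trailing ';' flushes the last declaration
        if token != ";":
            declaration.append(token)
            continue
        if len(declaration) >= 2 and declaration[1] == ":":
            descriptors.append((declaration[0], declaration[2:]))
        declaration = []
    return descriptors


def parse_font_face(head, body):
    # Simple case: no head is expected, and the body is only descriptors.
    if head or body is None:
        return None
    return {"type": "font-face", "descriptors": parse_descriptors(body)}


# Hypothetical registry: each module would add its own at-rule parsers.
AT_RULE_PARSERS = {"@font-face": parse_font_face}


def parse_at_rule(at_keyword, head, body):
    """Dispatch an unparsed at-rule to its parser; unknown rules give None."""
    parser = AT_RULE_PARSERS.get(at_keyword)
    return parser(head, body) if parser else None
```

A more complex rule such as @keyframes would register its own function
in the same table and walk the body however its module requires.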


-- 
Simon Sapin
Received on Saturday, 9 June 2012 09:25:49 UTC