Re: [CSS21] WD 4.1.6, 4.2: parsing of blocks

On 1/7/11 2:40 PM, Peter Moulder wrote:
> The first paragraph of §4.1.6 claims that between the delimiting braces
> of a block "there may be any tokens, except that [brackets must be
> in nested matched pairs]".
>
> This conflicts with the grammar, which says that a block can't
> contain BAD_STRING, BAD_URI or BAD_COMMENT tokens (i.e. not just "any
> tokens")

It's not clear to me how one would even get any of those three tokens, 
given the end-of-stylesheet rules (which are applied before the 
tokenizer runs as far as I can tell; that's the only way they make any 
sense at all).  Am I missing something?

> Relatedly, the direction to "[observe] the rules for matching pairs of
> [bracketing and quotation characters]" is unclear on what should
> occur when encountering the wrong closing bracketing character:
> it isn't clear whether it should parse as if that closing bracketing
> character were removed, or as if extra closing bracketing characters
> were inserted.  For example, if the following illegal sequence counts
> as "while parsing a statement", it isn't clear whether the "end of the
> statement" occurs at the first or second ‘}’:
>
>    { ... ( ... } ... ) ... }

The second; I thought that was pretty clear.  Only the last open bracket 
can be closed, so if you see a close bracket that doesn't match, you 
just read it and don't close things.
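
To make that reading concrete, here is a minimal Python sketch of it. It is only an illustration of the rule as described above, not spec text or any engine's code. The name block_end is made up for this example, and the sketch assumes the input contains no strings, comments or escapes, which a real tokenizer would have to handle first.

```python
# Hypothetical helper: illustrates "only the last open bracket can be closed".
PAIRS = {"{": "}", "(": ")", "[": "]"}
CLOSERS = set(PAIRS.values())

def block_end(text, start=0):
    """Return the index of the bracket that closes the block opened at
    text[start], or None if the block is never closed."""
    stack = []  # expected closers, innermost last
    for i in range(start, len(text)):
        ch = text[i]
        if ch in PAIRS:
            stack.append(PAIRS[ch])
        elif ch in CLOSERS:
            if stack and ch == stack[-1]:
                stack.pop()
                if not stack:
                    return i
            # Otherwise the closer does not match the last open bracket:
            # it is read and nothing is closed.
    return None

example = "{ a ( b } c ) d }"
end = block_end(example)
```

For the sequence quoted above, the first '}' sits inside the still-open '(' and is skipped, so the block ends at the second '}'.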

> The following lines each contain a malformed statement, but it isn't
> clear where the malformed statement ends, and hence whether the
> p{color:blue} is to apply or not:
>
>    } p{color:blue}
>    }} p{color:blue}
>    }{} p{color:blue}

Just reading the spec without trying to read stuff into it, seems to me 
that the last one of those should apply; the previous two will be syntax 
errors due to failures to parse a selector.

For what it's worth, that's interoperably implemented in at least 
Presto, Gecko, and WebKit, so apparently there wasn't much of an 
understanding problem here at least on the part of implementors....
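
As a rough sketch of why only the third line ends up applying, assuming a ruleset is "everything up to the next '{'" as the selector followed by a matched block: the function name applied_rules is invented for this example, the selector check (no '}' allowed) is a crude stand-in for real selector parsing, and strings, comments and at-rules are ignored.

```python
# Hypothetical illustration of the recovery behaviour described above.
PAIRS = {"{": "}", "(": ")", "[": "]"}
CLOSERS = set(PAIRS.values())

def applied_rules(sheet):
    """Return the selectors of the rulesets that are kept."""
    kept, i, n = [], 0, len(sheet)
    while i < n:
        j = sheet.find("{", i)      # selector runs up to the next "{"
        if j == -1:
            break
        selector = sheet[i:j].strip()
        stack, k = [], j            # now skip the matched block
        while k < n:
            ch = sheet[k]
            if ch in PAIRS:
                stack.append(PAIRS[ch])
            elif ch in CLOSERS and stack and ch == stack[-1]:
                stack.pop()
                if not stack:
                    break
            k += 1
        # Stand-in for "the selector failed to parse": a stray "}" in it.
        if selector and "}" not in selector:
            kept.append(selector)
        i = k + 1
    return kept

results = [applied_rules(s) for s in (
    "} p{color:blue}",
    "}} p{color:blue}",
    "}{} p{color:blue}",
)]
```

In the first two lines the stray '}' becomes part of the selector of the p ruleset, so that ruleset is dropped; in the third, the '}' is the selector of the empty '{}' block, which is dropped on its own, leaving 'p{color:blue}' intact.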

> Similar comments apply to the corresponding phrases "while parsing a
> declaration" and "end of the declaration", i.e. it isn't clear what
> those phrases mean.

Agreed.

> As another example, consider:
>
>    p { margin:0; color: red -->  ; }
>
> When we encounter the -->  token, it isn't clear whether we are in fact
> "parsing a statement" or "parsing a declaration"

"parsing a declaration".  Why is this unclear?

> The behaviour I see in gecko, konqueror and webkit is that it's
> like ‘p { margin:0; }’, even though one could reasonably argue that
> ‘color: red’ is not a malformed declaration and should not be
> discarded.

How is it not a malformed declaration?  Is the issue just that the string 
"color: red -->" doesn't match the "declaration" production in the grammar?

I think the grammar for "declaration" is basically wrong, if that's 
what's concerning you here.

> One would have thought that encountering a malformed declaration would
> also mean that one has a malformed statement: certainly the token
> sequence wouldn't be a well-formed statement according to the grammar.

Fwiw, I think the upshot of all this is that the error-recovery rules 
are not really particularly compatible with having a grammar-based 
parser.  Which means I think we should give up on trying to pretend like 
we have a normative grammar which can be used for anything like actually 
tokenizing and parsing stylesheets.

-Boris

Received on Friday, 7 January 2011 21:59:15 UTC