RE: [CSS2.1] Parsing Selectors with Brackets from Justin Rogers on 2007-12-26 (www-style@w3.org from December 2007)

From: Justin Rogers <justrog@microsoft.com>
Date: Wed, 26 Dec 2007 09:48:10 -0800
To: Anne van Kesteren <annevk@opera.com>, fantasai <fantasai.lists@inkedblade.net>, Bjoern Hoehrmann <derhoermi@gmx.net>, "www-style@w3.org" <www-style@w3.org>
Message-ID: <00BD06E707F60B4F9D6A3E75C712209D49A5F01F22@NA-EXMSG-C104.redmond.corp.microsoft>

[Anne van Kesteren wrote]
>> On Wed, 26 Dec 2007 06:38:11 +0100, fantasai
>> <fantasai.lists@inkedblade.net> wrote:
>>>   p { color: orange; }
>>>   p ( { color: red; } p { background: blue; } )
>>>
>>> Is the paragraph orange or blue?
>>
>> Didn't Björn answer that? I tested that in Opera, Firefox, and IE and
>> the second line got dropped in all of them leaving 'p { color: orange;
>> }'.
>
>My mistake, IE7 shows a blue background. But it was already established in
>a parallel thread that it has some minor issues with this.

I believe far too much of the specification is just text, and not enough of it describes the grammar. I would agree that I should push any VALID token onto the stack and, during error recovery, pair any token, since in the context of fallback there are no valid or invalid constructs, simply a token stack and a specific group of tokens I'm looking for to end my error scope.

There is nothing in complex_selector, simple_selector, etc. that allows for a bare parenthesis of the form "(...". When I see this, it starts my error fallback and is the error token. It is not obvious that I should push it onto my stack and count it, though various browsers have chosen this approach. Given this, we should likely make it clearer in the specification (as noted in the other topic) precisely how the three special paired tokens are parsed and handled in all contexts and scopes.

From my direct knowledge of IE, there are many cases where we simply fall back to scanning for a particular token, such as the LPAREN, when parsing selectors. This allows us to be correct most of the time, but obviously fails in complex scenarios. We can correct this type of difference if the spec becomes very clear or some consensus is reached on exactly how these tokens are to be handled.
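
To make the failure mode concrete, here is a minimal sketch of a scan-for-a-token fallback. It is an illustration only, not IE's actual code: the choice of "{" and "}" as the tokens to scan for, and the names naive_rules and is_plain_selector, are assumptions made for this example.

```python
def naive_rules(css):
    """Split css into (selector, block) pairs by scanning for the next
    '{' and the next '}', ignoring any '(' or '[' nesting entirely."""
    rules = []
    pos = 0
    while True:
        open_brace = css.find("{", pos)
        if open_brace == -1:
            break
        close_brace = css.find("}", open_brace)
        if close_brace == -1:
            break
        selector = css[pos:open_brace].strip()
        block = css[open_brace + 1:close_brace].strip()
        rules.append((selector, block))
        pos = close_brace + 1
    return rules


def is_plain_selector(selector):
    # Hypothetical validity check: accept only a bare type selector like "p".
    return selector.isalpha()


sheet = "p { color: orange; } p ( { color: red; } p { background: blue; } )"
kept = [rule for rule in naive_rules(sheet) if is_plain_selector(rule[0])]
```

With fantasai's test case, the rule with the selector "p (" is dropped, but scanning resumes right after its "}" and so "p { background: blue; }" is picked up as a valid rule, even though it sits inside the unclosed parenthesis.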

That said, here is some support for counting the tokens even in selectors:
1. Attribute selectors become parseable even by implementations that don't support them. Within the selector you match against the context stack, which lets you properly parse past anything inside the attribute selector, even nested scopes.
2. An unknown function would count its opening parenthesis on the stack, allowing it to contain arbitrary junk, including LPARENs, and forcing a match. (This is pretty close to what you have below, only it is specified by the grammar as a valid token in selectors within the spec.)
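
The counting approach in the two points above can be sketched as follows. This is a simplified illustration under stated assumptions, not any browser's implementation: skip_malformed_rule and skip_string are hypothetical names, the input is scanned character by character rather than as real tokens, unmatched closing tokens are simply ignored, and only rule sets (not at-rules) are handled.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())


def skip_string(css, i):
    """Return the index just past the quoted string starting at css[i]."""
    quote = css[i]
    i += 1
    while i < len(css):
        if css[i] == "\\":
            i += 2  # skip the backslash and the escaped character
            continue
        if css[i] == quote:
            return i + 1
        i += 1
    return len(css)


def skip_malformed_rule(css, start):
    """Return the index just past the malformed rule set starting at
    `start`, honoring matched (), [] and {} pairs along the way."""
    stack = []  # closing characters we still expect, innermost last
    i = start
    while i < len(css):
        ch = css[i]
        if ch == "\\":
            i += 2
            continue
        if ch in ('"', "'"):
            i = skip_string(css, i)
            continue
        if ch in PAIRS:
            stack.append(PAIRS[ch])
        elif ch in CLOSERS and stack and stack[-1] == ch:
            stack.pop()
            if not stack and ch == "}":
                return i + 1  # a top-level {} block ends the rule set
        i += 1
    return len(css)  # ran off the end: the rest of the sheet is consumed


sheet = "p { color: orange; }\np ( { color: red; } p { background: blue; } )\n"
start = sheet.index("p (")
survived = sheet[:start] + sheet[skip_malformed_rule(sheet, start):]
```

Here the "(" stays on the stack while both inner blocks open and close, so neither of them ends the error scope, and only the first rule survives. The same loop gets past an unsupported attribute selector such as a[foo="]{"] because the "[" is matched with its "]" and the quoted string is skipped as a unit.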

Error recovery in the parser can be extremely easy if we consider simple and concise rules in the error recovery section. There is no reason to be vague other than to leave options for implementation interpretation or to allow current implementations to be correct even if they don't follow the spec strictly.

Justin Rogers [MSFT]

Received on Wednesday, 26 December 2007 17:48:30 UTC