Re: [css3-syntax] Parser "entry points"

From: Tab Atkins Jr. <jackalmage@gmail.com>
Date: Mon, 28 Jan 2013 11:05:06 -0800
Message-ID: <CAAWBYDA9OUCsSkeAZ2J7A=pXzuJO+N0Yza-5++faeV-zqVd-vQ@mail.gmail.com>
To: Simon Sapin <simon.sapin@kozea.fr>
Cc: www-style list <www-style@w3.org>
On Mon, Jan 28, 2013 at 8:56 AM, Simon Sapin <simon.sapin@kozea.fr> wrote:
> On 28/01/2013 17:18, Tab Atkins Jr. wrote:
>> On Mon, Jan 28, 2013 at 4:20 AM, Simon Sapin <simon.sapin@kozea.fr> wrote:
>>>
>>> data:text/html,<p style="color:red;};color:green">test
>>>
>>> Green in Firefox and IE, red in Chrome and Opera.
>>>
>>> [...]
>>
>>
>> Right.  WebKit parses it by just wrapping it in, iirc, "@-webkit-rule
>> {}" and parsing it as a stylesheet, then extracting the resulting
>> declarations.  That's why we stop on the } - it looks like it's
>> closing the at-rule.
>>
>> I know we just got two impls agreeing on this, which let us advance
>> the Style Attr spec, but still. :/  It's not a hard change to the
>> parser, it's just the only thing I know of so far that varies based on
>> entry point. (But see below, I guess.)
>
> I don’t think stopping or not stopping at } in a style attr is a compat
> problem either way. Not stopping makes more sense IMO (there is no matching
> { token) but I would not object to stopping.

Given our efforts to get Style Attr past CR, I suspect the WG wouldn't
want to go back on it.  I'll just add a branch to the parser.
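
For illustration, here is a rough sketch of the two behaviours on Simon's
test case. It is a toy, not either engine's parser: the function name and
the stop_at_close_brace flag are made up for this example, and it works on
raw characters, ignoring strings, comments, escapes and !important.

```python
import re

# Hypothetical toy parser: NOT the css3-syntax algorithm or any engine's code.
_PROPERTY_NAME = re.compile(r"^-?[A-Za-z_][\w-]*$")

def parse_style_attr(text, stop_at_close_brace=False):
    """Return a dict of property -> value; later declarations win."""
    if stop_at_close_brace:
        # Mimic parsing "@rule { <text> }" as a stylesheet: the first
        # unmatched } closes the wrapper and everything after it is lost.
        depth = 0
        for i, ch in enumerate(text):
            if ch == "{":
                depth += 1
            elif ch == "}":
                if depth == 0:
                    text = text[:i]
                    break
                depth -= 1
    declarations = {}
    for chunk in text.split(";"):
        name, colon, value = chunk.partition(":")
        name, value = name.strip(), value.strip()
        # A chunk such as "}" has no colon, so it is dropped as an invalid
        # declaration and parsing continues with the next chunk.
        if colon and value and _PROPERTY_NAME.match(name):
            declarations[name] = value
    return declarations

print(parse_style_attr("color:red;};color:green"))
print(parse_style_attr("color:red;};color:green", stop_at_close_brace=True))
```

With the flag off, the stray } is just one bad declaration and the later
color:green wins; with it on, parsing ends at the } and color:red is all
that survives.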


>>>>> Similarly, for a single declaration, a ; token does not end the
>>>>> declaration.
>>>>
>>>> What do you mean by "does not end the declaration"?  It looks like
>>>> top-level ; tokens aren't allowed in @supports conditions, and I don't
>>>> see how they'd be allowed anywhere else that wants to take a single
>>>> decl in the future.  I'd prefer to just say that it's a syntax error
>>>> if the decl is appended or unset before the token stream is fully
>>>> consumed.
>>>
>>> Ok, that would work too. But it’s still different from "append the
>>> declaration to the current rule and switch …" etc, so the state machine
>>> still has to be adapted.
>>
>> Yeah, you're right, it would need a parser change to work well.  Darn,
>> that's two instances, which makes it worthwhile to add the change.
>
> Or, as I said in another branch of the thread, just redefine "a single
> declaration that cannot be made important" separately. It’s pretty simple:

Nah, I'd prefer branching the parser there.  Simpler.


>>> Hopefully, the syntax in selectors4 will be defined in terms of such
>>> primitives rather than have its own tokenizer.
>>
>> It already is.  No spec has ever tried to redo tokenization.
>
> http://www.w3.org/TR/css3-selectors/#lex defines a tokenizer that is not
> quite the same as CSS 2.1. It has no delim or unicode-range tokens but has
> PREFIXMATCH, COMMA, and a few others.

Oh geez, that's crazy.  And Selectors 4 keeps it, though we at least
have an issue that it needs to be updated.  Yeah, I'll kill that.


>>> Error handling in selectors is easy: the whole selector list is invalid.
>>> I’m not sure about media queries…
>>>
>>> data:text/html,<style>@media ], all{body{background:green
>>>
>>> (Green in Firefox, Opera and IE, not in Chrome.)
>>
>>
>> Chrome's wrong here - a syntax error in a MQ list just falsifies the
>> MQ, but leaves the rest of the list alone.
>>
>> I suppose I can do another parsing function, in addition to the "list
>> of primitives" one you outline above, which is more similar to
>> function parsing: break the list by top-level commas, and the value of
>> each entry is either a list of primitives or a syntax error.
>
> I don’t really like it, but another option is to make bad-string and bad-url
> preserved tokens so that ( { [ and function are the only non-preserved
> tokens, and "consume a primitive" never fails.
>
> It’s up to selectors, MQs, etc. to define their error handling for tokens
> such as bad-string, ] or cdo.
>
> Maybe it’s not such a bad idea after all.

This actually might be a good idea, as it means you don't have to
worry about url() tokenization rules when writing variables.
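
As a rough sketch of the comma-splitting function discussed above for
media query lists: break the input at top-level commas, and let each entry
be either its contents or a syntax error. The function name is
hypothetical, None stands in for "syntax error", and the code works on raw
characters rather than real tokens (no strings, comments or escapes), so
treat it as an illustration only.

```python
def split_top_level_commas(text):
    """Split on commas outside (), [] and {}; bad entries become None."""
    openers = {")": "(", "]": "[", "}": "{"}
    entries, current, stack, bad = [], [], [], False
    for ch in text:
        if ch == "," and not stack:
            # A top-level comma ends the current entry. An error inside
            # it invalidates only that entry, not the rest of the list.
            entries.append(None if bad else "".join(current).strip())
            current, bad = [], False
            continue
        if ch in "([{":
            stack.append(ch)
        elif ch in openers:
            if stack and stack[-1] == openers[ch]:
                stack.pop()
            else:
                bad = True  # closer with no matching opener
        current.append(ch)
    entries.append(None if bad else "".join(current).strip())
    return entries

print(split_top_level_commas("], all"))
print(split_top_level_commas("screen and (min-width: 1px), print"))
```

For the "], all" test case the first entry is an error and the second is
still "all", which is the behaviour where the rest of the list is left
alone.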

~TJ
Received on Monday, 28 January 2013 19:05:53 GMT
