
Re: [css21][css3-syntax] $foo in the core grammar (was: [css-variables] Using $foo as the syntax for variables)

From: Tab Atkins Jr. <jackalmage@gmail.com>
Date: Wed, 23 May 2012 17:16:10 -0700
Message-ID: <CAAWBYDDb9Gb0WR43N=roefgRH7fBek4afswVsvbZNKggotFoAA@mail.gmail.com>
To: "Kang-Hao (Kenny) Lu" <kennyluck@csail.mit.edu>
Cc: Bjoern Hoehrmann <derhoermi@gmx.net>, WWW Style <www-style@w3.org>
On Mon, May 21, 2012 at 10:03 PM, Kang-Hao (Kenny) Lu
<kennyluck@csail.mit.edu> wrote:
> (12/05/22 9:14), Tab Atkins Jr. wrote:
>> [snip some theory about whether or not we should change the core grammar]
>> We should reject changes that would break non-trivial amounts of
>> existing content.  That's the only reasonable restriction that we can
>> operate under; anything else would mean that we're promoting
>> theoretical purity over improving the language for everyone else.
> While I more or less agree with the theory that changing for the better
> is a good thing, in this particular case, I disagree with the idea that
> putting $foo in the core grammar is actually "improving the language".
> In general, the effect of putting a prefixed identifier in the core
> grammar is that every time a character is tokenized, the tokenizer has
> to check to see if it is one of the prefixes and whether what follows is
> an identifier. This would mean that for fallback tokens like DELIM (i.e.
> ':', '{', '}', ';'), a redundant check to see if it is a '$' is needed.
> IMHO, redundant checks are bad because, well, it's the user's computer
> that runs this redundant check.

Yes, it basically means adding an additional case to the data state.

If you don't handle that case in the tokenizer, you have to handle it
in the parser, so I don't see how there's an efficiency penalty.
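
To make the "one additional case" point concrete, here is a minimal
sketch in Python.  It is not the CSS tokenizer: the function name, the
token names NAME and VAR, and the ASCII-only stand-in for nmchar+ are
all assumptions made for illustration.

```python
import re

# ASCII-only stand-in for nmchar+ (assumption: escapes and non-ASCII are ignored).
NAME = re.compile(r"[_a-z0-9-]+", re.IGNORECASE)

def tokenize(css):
    """Return (type, value) pairs for a tiny, hypothetical subset of CSS."""
    tokens = []
    pos = 0
    while pos < len(css):
        ch = css[pos]
        if ch.isspace():
            pos += 1
            continue
        # '#' already needs a branch for HASH; '$' shares the same branch.
        if ch in "#$":
            m = NAME.match(css, pos + 1)
            if m:
                tokens.append(("HASH" if ch == "#" else "VAR", m.group()))
                pos = m.end()
                continue
        m = NAME.match(css, pos)
        if m:
            tokens.append(("NAME", m.group()))
            pos = m.end()
            continue
        # Anything else, including a bare '$', falls through to DELIM.
        tokens.append(("DELIM", ch))
        pos += 1
    return tokens
```

If the '$' branch is dropped, the same check has to happen in the
parser whenever it sees a '$' DELIM, so the work moves rather than
disappears.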

I have no idea what you mean about fallback tokens.

> (12/05/22 5:30), Tab Atkins Jr. wrote:
>> Some further details - to handle $foo in the syntax, we'll either need
>> to add a VAR token to the grammar (defined identically to HASH but
>> with the $ character instead of #)
> Why identical to HASH but not ATKEYWORD? HASH needs {nmchar}+ because
> <color> needs it. Otherwise, nowhere in CSS allows an identifier to
> start with a number, including the ID selector:
>  # A CSS ID selector contains a "#" immediately followed by the ID
>  # value, which must be an identifier.
> (though I think this prose is quite crappy again in that it sounds like
> authoring conformance not UA conformance.)

Largely because it's just plain simpler. ^_^  Dealing with the
identifier rules is a pain in the ass.  If you can just immediately
switch into looking for nmchar, though, it's great!

>> or accept that variables show up in the tokenizer as a $ DELIM
>> followed by an IDENT.  The latter is suboptimal, though - it allows
>> comments between the $ and the foo, which sucks,
> Can you elaborate on why that sucks? Would anyone ever be confused by
> this? It seems like a theoretical concern to me.

Just because there's no reason a comment should go there.  We should
have been much more strict in where we allowed comments originally.
If possible, I think we should engineer future things to avoid oddly
placed comments.
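
A rough Python sketch of the comment problem, assuming a tokenizer that
has no VAR token and discards comments (the function name and the
ASCII-only IDENT pattern are illustrative simplifications, not the real
grammar):

```python
import re

TOKEN = re.compile(r"""
    (?P<COMMENT>/\*.*?\*/)             # matched here, discarded below
  | (?P<IDENT>-?[_a-z][_a-z0-9-]*)     # ASCII-only simplification of IDENT
  | (?P<S>\s+)                         # whitespace
  | (?P<DELIM>.)                       # any other single character
""", re.VERBOSE | re.IGNORECASE | re.DOTALL)

def tokens_without_var(css):
    """Tokenize with '$' as a plain DELIM; comments never reach the parser."""
    out = []
    for m in TOKEN.finditer(css):
        kind = m.lastgroup
        if kind != "COMMENT":
            out.append((kind, m.group()))
    return out
```

Under these assumptions "$foo" and "$/* hi */foo" produce the same two
tokens, so a parser looking for "DELIM followed by IDENT" cannot tell
them apart, while "$ foo" still differs because of the S token.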

>> and it means we have to deal with the "first character of
>> an IDENT" detail, despite there being no ambiguity (HASH gets to avoid
>> all that and just use "nmchar+").
> Can you elaborate? What is the "first character of IDENT" detail? What's
> wrong with simply saying that $foo is "DELIM followed by an IDENT" (and
> add a "without intermediate whitespace" to avoid confusion).
> I think HASH is a notorious example. Even if, for example, "#1st" is a
> HASH, you still can't use it as an ID selector (tested with WebKit and
> Firefox, not sure about others).

The first character of an ident can only be a dash or a
letter/non-ascii/escape/underscore.  If it's a dash, the second
character can only be a letter/non-ascii/escape/underscore; otherwise
it can be an nmchar (number/letter/non-ascii/escape/dash/underscore).
The rest of the characters can be nmchars.
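
For comparison, the two rules can be written as regular expressions in
Python.  The macros below are transcribed from the CSS 2.1 tokenizer
grammar; the helper names is_ident and is_name are made up for this
sketch.

```python
import re

# Macros from the CSS 2.1 tokenizer grammar (matching is case-insensitive).
NONASCII = r"[^\x00-\x9f]"
UNICODE = r"\\[0-9a-f]{1,6}(?:\r\n|[ \n\r\t\f])?"
ESCAPE = rf"(?:{UNICODE}|\\[^\n\r\f0-9a-f])"
NMSTART = rf"(?:[_a-z]|{NONASCII}|{ESCAPE})"
NMCHAR = rf"(?:[_a-z0-9-]|{NONASCII}|{ESCAPE})"

# ident: optional dash, then nmstart, then any number of nmchar.
IDENT = re.compile(rf"-?{NMSTART}{NMCHAR}*", re.IGNORECASE)
# name (what follows '#' in HASH): one or more nmchar, no start rule.
NAME = re.compile(rf"{NMCHAR}+", re.IGNORECASE)

def is_ident(text):
    """True if the whole string matches the ident production."""
    return IDENT.fullmatch(text) is not None

def is_name(text):
    """True if the whole string matches the name production (nmchar+)."""
    return NAME.fullmatch(text) is not None
```

With these definitions "1st" and "-1" are names but not idents, which
is why "#1st" tokenizes as a HASH yet fails the ident requirement for
ID selectors.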

> (So, please consider this an errata item:
> In Appendix G,
> change
>  # simple_selector
>  #  : element_name [ HASH | class | attrib | pseudo ]*
>  #  | [ HASH | class | attrib | pseudo ]+
>  #  ;
> to
>  |  /*
>  | * There is a constraint on the ID selector that the part after
>  | * "#" should match an IDENT; e.g., "#abc" is OK, but "#1st" is not.
>  | */
>  | simple_selector
>  |  : element_name [ HASH | class | attrib | pseudo ]*
>  |  | [ HASH | class | attrib | pseudo ]+
>  |  ;
> like the comment above hexcolor. This should go into selector3 or 4 too.)

That, or change the validity of id selectors, whichever is web-compatible.

Received on Thursday, 24 May 2012 00:17:00 UTC
