Re: [css3-values] inaccurate statements about syntax/grammar from Simon Sapin on 2012-04-04 (www-style@w3.org from April 2012)

From: Simon Sapin <simon.sapin@kozea.fr>
Date: Wed, 04 Apr 2012 15:47:26 +0200
To: "Kang-Hao (Kenny) Lu" <kennyluck@csail.mit.edu>
CC: WWW Style <www-style@w3.org>
Message-ID: <4F7C50EE.3000404@kozea.fr>

On 04/04/2012 12:15, Kang-Hao (Kenny) Lu wrote:
> 2.3. Component value multipliers
>
>    # Component values are specified in terms of tokens, as described in
>    # Chapter 4 of [CSS21]. As the grammar allows spaces between tokens
>    # in the components of the value production, spaces may appear
>    # between tokens in property values.
>
> The latter sentence is false at least in the following cases I know of:
>
> 1. Spaces are not allowed in-between the sign and the number. Namely
> "'-'(DELIM) S NUMBER/PERCENTAGE/DIMENSION" are invalid token sequences
> besides special situation like in a calc().

I think this is a flaw in the grammar. The sign should really be part of
the same token as the number it applies to. More precisely, I suggest
changing the num macro (used in the NUMBER, DIMENSION and PERCENTAGE
tokens) from

    [0-9]+|[0-9]*\.[0-9]+

to

    [+-]?([0-9]+|[0-9]*\.[0-9]+)

I know that changing the core grammar is generally avoided, but this 
particular change solves several related issues.
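
To make the difference concrete, here is a minimal sketch that uses
Python's re module as a stand-in for the lex macros. The names OLD_NUM,
NEW_NUM and leading_match are made up for illustration, and this is not
the tinycss code. One assumption to note: the alternatives are reordered,
because Python takes the first alternative that matches while lex-style
grammars take the longest match.

```python
import re

# Both versions of the `num` macro, transcribed as Python regexes.
# The decimal alternative comes first: Python's `re` picks the first
# alternative that matches, whereas a lex-style scanner picks the longest.
OLD_NUM = re.compile(r'[0-9]*\.[0-9]+|[0-9]+')
NEW_NUM = re.compile(r'[+-]?(?:[0-9]*\.[0-9]+|[0-9]+)')


def leading_match(pattern, text):
    """Return the text `pattern` matches at the start of `text`, or None."""
    match = pattern.match(text)
    return match.group(0) if match else None


if __name__ == '__main__':
    for text in ['5', '-5', '+.5', '- 5', '-/**/5']:
        print(repr(text),
              leading_match(OLD_NUM, text),
              leading_match(NEW_NUM, text))
```

With the old macro, the sign is never part of the match, so a tokenizer
has to emit it as a separate DELIM. With the new macro, "-5" and "+.5"
match as a whole, while "- 5" and "-/**/5" do not match as a number.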

I made this change in tinycss (soon to be announced) and, as far as I 
know, the only visible difference is that comments are not allowed 
between the sign and the number. Apparently Firefox does the same, or 
something similar:

http://lists.w3.org/Archives/Public/www-style/2011Oct/0030.html


> 2. "'#'(DELIM) IDENT" in element(#id) and its possible generalization to
> arbitrary selectors.

#foo parses as a single HASH token in both CSS 2.1 and Selectors 3, so I
think there is no issue here. # foo would parse as DELIM S IDENT, which
is not a valid ID selector or argument for element().
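
A rough sketch of that tokenization, assuming a heavily simplified
tokenizer (ASCII only, no escapes, no comments, only four token types).
TOKEN_PATTERNS and token_types are hypothetical names, not part of any
real library:

```python
import re

# Simplified token patterns, tried in order at each position.
# Assumption: ASCII only, no escapes; real CSS tokens allow more.
TOKEN_PATTERNS = [
    ('HASH', re.compile(r'#[_a-zA-Z0-9-]+')),
    ('IDENT', re.compile(r'-?[_a-zA-Z][_a-zA-Z0-9-]*')),
    ('S', re.compile(r'[ \t\r\n\f]+')),
    ('DELIM', re.compile(r'.', re.DOTALL)),  # any other single character
]


def token_types(text):
    """Return the list of token type names for `text`."""
    types = []
    pos = 0
    while pos < len(text):
        for name, pattern in TOKEN_PATTERNS:
            match = pattern.match(text, pos)
            if match:
                types.append(name)
                pos = match.end()
                break
    return types


if __name__ == '__main__':
    print(token_types('#foo'))   # ['HASH']
    print(token_types('# foo'))  # ['DELIM', 'S', 'IDENT']
```

Since DELIM matches any single character, the loop always advances.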


> 3. I would assume <id> used in nav-* to be in a similar situation to 2.
> although it isn't quite clear.

(This is in css3-ui, right?) css3-ui defines:
"The <id> value consists of a ‘#’ character followed by an identifier"

I think this should be changed to refer to ID selectors, like element()
does. This would effectively mean that <id> is a HASH token. This is
not quite the same: the name in a HASH can start with a digit, while an
identifier cannot.
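
The difference between the two definitions can be sketched like this,
again assuming simplified ASCII-only patterns without escapes. The
function names is_hash and is_hash_plus_ident are invented for this
example:

```python
import re

# Simplified versions of the `name` and `ident` macros
# (assumption: ASCII only, no escapes).
NAME = re.compile(r'[_a-zA-Z0-9-]+')
IDENT = re.compile(r'-?[_a-zA-Z][_a-zA-Z0-9-]*')


def is_hash(text):
    """'#' followed by a name: what a HASH token accepts."""
    return text.startswith('#') and NAME.fullmatch(text[1:]) is not None


def is_hash_plus_ident(text):
    """'#' followed by an identifier: the css3-ui wording for <id>."""
    return text.startswith('#') and IDENT.fullmatch(text[1:]) is not None


if __name__ == '__main__':
    for text in ['#foo', '#123', '#1st']:
        print(text, is_hash(text), is_hash_plus_ident(text))
```

Here "#foo" satisfies both definitions, while "#123" and "#1st" are
accepted as a HASH but not as '#' followed by an identifier.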


> (Feel free to add more if you know anymore. Even if these exceptions are
> too messy to put in the spec, it's probably not a bad idea if a complete
> list is archived on www-style.)

With a change to how numbers are parsed, I think that none of these
three really has an issue with white space. But I am also very
interested to know: are there any other exceptions? Do we reserve the
possibility of adding such exceptions in the future? Or can a parser
unconditionally remove white space tokens in property values? (That is,
before any property-specific parsing.)
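
As a sketch of why the number change matters for unconditional removal,
consider the hand-written token lists below. They are assumptions about
what a tokenizer would produce under each grammar, written as
(type, value) pairs, and strip_whitespace is a hypothetical helper:

```python
def strip_whitespace(tokens):
    """Drop S tokens from a list of (type, value) pairs."""
    return [token for token in tokens if token[0] != 'S']


# Current grammar: the sign is a separate DELIM token in both cases.
old_tight = [('DELIM', '-'), ('NUMBER', '5')]                # "-5"
old_spaced = [('DELIM', '-'), ('S', ' '), ('NUMBER', '5')]   # "- 5"

# Proposed grammar: the sign is part of the NUMBER token in "-5".
new_tight = [('NUMBER', '-5')]                               # "-5"
new_spaced = [('DELIM', '-'), ('S', ' '), ('NUMBER', '5')]   # "- 5"

if __name__ == '__main__':
    # Current grammar: "-5" and "- 5" become indistinguishable.
    print(strip_whitespace(old_tight) == strip_whitespace(old_spaced))
    # Proposed grammar: they stay distinct after removing white space.
    print(strip_whitespace(new_tight) == strip_whitespace(new_spaced))
```

Under the current grammar, removing S tokens early loses the
information needed to reject "- 5". Under the proposed one, the two
inputs remain different token sequences.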


> I'll note that this is a normative conformance requirement statement
> without which you can't put values in between the component values, and
> the "may" here seems odd. Proposed wording for a replacement for the
> latter sentence:
>
>    | Unless otherwise specified, UAs must ignore S tokens between other
>    | tokens while matching a property value definition.


Agreed, without the "unless otherwise specified" part if:

1. The core grammar is changed as above
2. There is no such exception in current properties
3. We decide as a design principle for new properties not to make such
exceptions.


Regards,
-- 
Simon Sapin

Received on Wednesday, 4 April 2012 13:47:59 UTC