[css3-values] inaccurate statements about syntax/grammar

From: Kang-Hao (Kenny) Lu <kennyluck@csail.mit.edu>
Date: Wed, 04 Apr 2012 18:15:18 +0800
Message-ID: <4F7C1F36.2000003@csail.mit.edu>
To: WWW Style <www-style@w3.org>

2.1. Component value types

  # All CSS properties also accept the keyword values ‘inherit’ and
  # ‘initial’ as their property value, but for readability these are
  # not listed explicitly in the property value syntax definitions.
  # These keywords cannot be combined with other component values in
  # same declaration; such a declaration is invalid. For example,
  #‘background: url(corner.png) no-repeat, inherit;’ is invalid.

The latter part, starting with "These keywords...", is arguably false in
the following cases, depending on how "combined" and "component value"
are interpreted:

1a. 'font-family: xxx inherit;' is valid in all browsers
1b. 'font-family: inherit xx;' is valid in IE9 and Opera12alpha but not
others
1c. 'font-family: xxx, inherit;' is valid in all browsers except
Opera12alpha
1d. 'font-family: inherit, xxx;' is valid in IE9
2. 'content: attr(inherit);' is valid (tested only on WebKit and Firefox)
(Feel free to add more if you know of any. Even if these exceptions are
too messy to put in the spec, it's probably not a bad idea to have a
complete list archived on www-style.)

I should mention that although I think normal people (which includes me)
wouldn't really think 2. is invalid, I did think that 1a ~ 1d were
invalid after reading this sentence, before actually doing the testing.

So, I think the latter sentence is a bit fragile and I suggest we make
it non-normative. The idea is that whether a property supports "partial
'inherit'" should depend solely on the "Value:" line and that property's
normative prose, and not on this fragile sentence. Proposed wording:

  | All CSS properties also accept the keyword values ‘inherit’ and
  | ‘initial’ as their sole property value. For example, the
  | "Value:" line of 'counter-increment' becomes '[ <identifier>
  | <integer>? ]+ | none | inherit | initial'. For readability these
  | are not listed explicitly in the property value syntax definitions.
  |
  | Note: This implies that for most properties these keywords cannot
  | be combined with other component values in the same declaration;
  | such a declaration is invalid. For example, ‘background:
  | url(corner.png) no-repeat, inherit;’ is invalid.
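
To make the distinction in the proposed wording concrete, here is a
minimal Python sketch that separates "sole value" from "combined with
other component values". The function names are hypothetical, and the
whitespace/comma split is a deliberate simplification, not a CSS
tokenizer: it mishandles quoted strings and functions that contain
spaces. It also says nothing about what a given property (such as
'font-family') decides to accept.

```python
# The two keywords discussed above; every property accepts either one
# as its sole value.
CSS_WIDE_KEYWORDS = {"inherit", "initial"}


def is_sole_keyword(value: str) -> bool:
    """True if the whole property value is exactly 'inherit' or 'initial'."""
    return value.strip().lower() in CSS_WIDE_KEYWORDS


def mixes_keyword(value: str) -> bool:
    """True if a keyword appears as one top-level component among several.

    Assumption: components are separated by whitespace or commas only.
    """
    components = value.replace(",", " ").split()
    has_keyword = any(c.lower() in CSS_WIDE_KEYWORDS for c in components)
    return has_keyword and len(components) > 1
```

Under this sketch, 'url(corner.png) no-repeat, inherit' and 'xxx inherit'
both count as mixing, while 'attr(inherit)' does not, because the keyword
there is not a top-level component.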

And we leave 'font-family' to decide whether <family-name> accepts
"inherit" and how syntax ambiguity is resolved. I should note that the
spec for 'counter-increment's syntax already doesn't quite rely on this
fragile sentence because CSS 2.1 says:

  # The keywords 'none', 'inherit' and 'initial' must not be used as
  # counter names.

(modulo the sad fact that this sounds like a statement for authoring
conformance.)

and CSS3 V&U says:

  # This generic data type is denoted by <identifier>, and represents
  # any valid CSS identifier that does not otherwise appear as a
  # pre-defined keyword in that property's value definition.


2.3. Component value multipliers

  # Component values are specified in terms of tokens, as described in
  # Chapter 4 of [CSS21]. As the grammar allows spaces between tokens
  # in the components of the value production, spaces may appear
  # between tokens in property values.

The latter sentence is false at least in the following cases I know of:

1. Spaces are not allowed between the sign and the number. Namely,
"'-'(DELIM) S NUMBER/PERCENTAGE/DIMENSION" is an invalid token sequence,
except in special situations such as inside calc().
2. "'#'(DELIM) IDENT" in element(#id) and its possible generalization to
arbitrary selectors.
3. I would assume <id> used in nav-* to be in a similar situation to 2.
although it isn't quite clear.
(Feel free to add more if you know of any. Even if these exceptions are
too messy to put in the spec, it's probably not a bad idea to have a
complete list archived on www-style.)

I'll note that this is a normative conformance requirement, without
which you couldn't put spaces between the component values, so the
"may" here seems odd. Proposed wording to replace the latter sentence:

  | Unless otherwise specified, UAs must ignore S tokens between other
  | tokens while matching a property value definition.

Note that

4. calc()

also falls under the "unless otherwise specified" clause, because spaces
are required around a plus or minus sign.
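
As a rough illustration of the proposed rule, the Python sketch below
matches a token sequence against an expected sequence of token types,
dropping S tokens first. The (type, text) tuple representation, the
function name, and the ignore_s flag are all assumptions made for this
example; the flag is just one hypothetical way to model "unless
otherwise specified".

```python
from typing import List, Sequence, Tuple

# A token is modelled as (type, text), e.g. ("DIMENSION", "1px").
Token = Tuple[str, str]


def match_value(tokens: Sequence[Token],
                expected: Sequence[str],
                ignore_s: bool = True) -> bool:
    """Compare token types against an expected list of token types.

    With ignore_s=True (the default rule), S tokens are skipped.
    With ignore_s=False, an S token is significant and must be expected.
    """
    kept: List[Token] = [t for t in tokens if not (ignore_s and t[0] == "S")]
    return [kind for kind, _ in kept] == list(expected)
```

With the default, DIMENSION S DIMENSION matches a definition of two
dimensions. The sequence "'-'(DELIM) S NUMBER" shows why exceptions are
needed: it matches DELIM NUMBER under the default, and is rejected only
when the definition opts out with ignore_s=False.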


4.1. Integers: the ‘<integer>’ type

  # Integer values are denoted by <integer>. An integer is one or more
  # decimal digits ‘0’ through ‘9’ and corresponds to a subset of
  # the NUMBER token in the grammar. Integers may be immediately
  # preceded by ‘-’ or ‘+’ to indicate the sign.

4.2. Numbers: the ‘<number>’ type

  # Number values are denoted by <number>. A number is either an
  # integer, or zero or more decimal digits followed by a dot (.)
  # followed by one or more decimal digits. It corresponds to the
  # NUMBER token in the grammar. Like integers, numbers may also be
  # immediately preceded by ‘-’ or ‘+’ to indicate the sign.

4.3. Percentages: the ‘<percentage>’ type

  # A percentage value is denoted by <percentage>, consists of a
  # <number> immediately followed by a percent sign ‘%’. It
  # corresponds to the PERCENTAGE token in the grammar.

To make it clear that <percentage> allows an optional sign (or that
<number> includes the sign), I suggest the following change:

from

  # An integer is one or more decimal digits ‘0’ through ‘9’ and
  # corresponds to a subset of the NUMBER token in the grammar.
  # Integers may be immediately preceded by ‘-’ or ‘+’ to indicate
  # the sign.

to

  | An integer is an optional sign (‘-’ or ‘+’) immediately followed
  | by one or more decimal digits ‘0’ through ‘9’ and roughly
  | corresponds to a subset of the NUMBER token in the grammar.

from

  # A number is either an integer, or zero or more decimal digits
  # followed by a dot (.) followed by one or more decimal digits. It
  # corresponds to the NUMBER token in the grammar. Like integers,
  # numbers may also be immediately preceded by ‘-’ or ‘+’ to
  # indicate the sign.

to

  | A number is either an integer, or an optional sign (‘-’ or ‘+’)
  | immediately followed by zero or more decimal digits followed by a
  | dot (.) followed by one or more decimal digits. It corresponds to
  | the NUMBER token in the grammar.

(Note that when you say "integers may be immediately preceded by ‘-’ or
‘+’", you seem to be drawing a distinction between /integer/ and
<integer>, which doesn't seem desirable.)
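
The proposed definitions can be written down as regular expressions. The
Python sketch below is only a string-level reading of the wording above,
with the sign folded into <integer> and <number>; the constant and
function names are hypothetical, and it works on raw text rather than on
tokens, so it does not model comments between tokens.

```python
import re

# Optional sign immediately followed by one or more decimal digits.
INTEGER = re.compile(r"[+-]?[0-9]+")

# Either an integer, or an optional sign, zero or more digits, a dot,
# and one or more digits.
NUMBER = re.compile(r"[+-]?(?:[0-9]+|[0-9]*\.[0-9]+)")

# A <number> immediately followed by a percent sign.
PERCENTAGE = re.compile(NUMBER.pattern + "%")


def is_integer(text: str) -> bool:
    return INTEGER.fullmatch(text) is not None


def is_number(text: str) -> bool:
    return NUMBER.fullmatch(text) is not None


def is_percentage(text: str) -> bool:
    return PERCENTAGE.fullmatch(text) is not None
```

Because <percentage> is built from <number>, '-50%' is accepted without
any separate statement about the sign, which is the point of the
suggested change.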


I should note that the "immediately followed" here is different from
the others in that this one separates two tokens. Namely, '+/**/1' is a
valid <integer> but '1/**/%' is not a valid <percentage>. I am not
asking for a change here, because talking about COMMENT might make
the prose unreadable, but I just want to mention this as a potential
source of confusion.


8. Functional Notations

  # Some values use a functional notation to type values and to lump
  # values together. The syntax starts with the name of the function
  # immediately followed by a left parenthesis followed by optional
  # whitespace followed by the argument(s) to the notation followed by
  # optional whitespace followed by a right parenthesis. If a function
  # takes a list of arguments, the arguments are separated by a comma
  # (‘,’) with optional whitespace before and after the comma.

This paragraph doesn't mention optional whitespace within a single
argument (say, the space between wqname and <type> in attr()). Instead
of giving an incomplete description, I suggest not discussing whitespace
in normative prose here. (It's already covered by the "UAs must ignore S
tokens between other tokens" or the original "spaces may appear between
tokens in property values" clause.) Proposed wording:

  | Some values use a functional notation to type values and to lump
  | values together. The syntax starts with the name of the function
  | immediately followed by a left parenthesis followed by the
  | argument(s) to the notation followed by a right parenthesis. If a
  | function takes a list of arguments, the arguments are separated by
  | a comma (‘,’).
  |
  | Note that by default spaces may appear between tokens.
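
For illustration, here is a minimal Python sketch of the shape described
in the proposed wording: a name immediately followed by a left
parenthesis, comma-separated arguments, and a right parenthesis, with
whitespace around and inside arguments left to the general rule. The
function name and the regular expression are assumptions for this
example only. It splits on every comma, so nested functions and quoted
strings containing commas are not handled.

```python
import re
from typing import List, Optional, Tuple

# Name immediately followed by "(", then anything, then ")".
FUNCTION = re.compile(r"([A-Za-z-][A-Za-z0-9-]*)\((.*)\)", re.DOTALL)


def parse_function(text: str) -> Optional[Tuple[str, List[str]]]:
    """Return (name, arguments), or None if text is not functional notation.

    Whitespace around each argument is dropped; whitespace inside an
    argument is kept as written.
    """
    match = FUNCTION.fullmatch(text.strip())
    if match is None:
        return None
    name, body = match.groups()
    return name, [arg.strip() for arg in body.split(",")]
```

For example, 'attr(data-n integer, 0)' yields the arguments
'data-n integer' and '0', while 'rgb (1, 2, 3)' is rejected because the
name is not immediately followed by the parenthesis.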


Sorry about nitpicking and corner case examples.


Cheers,
Kenny
Received on Wednesday, 4 April 2012 10:15:53 GMT
