Selectors module comments (CR-css3-selectors-20011113)

What follows is a commentary on the CSS3 Selectors module Candidate 
Recommendation:
<http://www.w3.org/TR/2001/CR-css3-selectors-20011113>.



3. Case sensitivity

For clarity, add that pseudo-class names and pseudo-element names 
are case-insensitive.

Are namespace prefixes case-sensitive?



4. Selector syntax

"A sequence of simple selectors is a chain of simple selectors that are 
not separated by a combinator."

I repeat my old complaint [1], which I feel was insufficiently addressed 
[2,3].  This sequence was called "simple selector" in CSS2 and in CSS1.  
The shifting terminology is confusing and unnecessary.

"A sequence of simple selectors is a chain of simple selectors that are 
not separated by a combinator. It always begins with a type selector or a 
universal selector. No other type selector or universal selector is 
allowed in the sequence."

This passage indicates that the selector ".class" is invalid.  Is the intent 
to invalidate such selectors?  If not, change "always" to "optionally".  If 
so, an explanation is due to account for the break from CSS1 and from 
CSS2, and the formal grammar (the 'simple_selector_sequence' 
production) needs revision.
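
To make the discrepancy concrete, here is a minimal Python sketch, not taken from the module, that models the quoted prose and the 'simple_selector_sequence' production (as quoted under 10.1) as regular expressions.  The names PROSE_RULE and GRAMMAR_RULE are hypothetical, and the token shapes are simplifying assumptions: a lowercase word stands for a type selector, and only class, ID, and pseudo-class selectors are modeled.

```python
import re

# Simplified, assumed token shapes: a lowercase word is a type selector,
# "*" is the universal selector, and ".x", "#x", ":x" stand for the
# class, ID, and pseudo-class selectors.
HEAD = r"(?:[a-z]+|\*)"
TAIL = r"(?:[.#:][a-z]+)"

# Prose: the sequence "always begins with a type selector or a
# universal selector".
PROSE_RULE = re.compile(HEAD + TAIL + "*")

# Grammar as quoted in 10.1: the leading selector is optional.
GRAMMAR_RULE = re.compile(HEAD + "?" + TAIL + "+")

for text in (".class", "p.class"):
    print(text,
          "prose:", bool(PROSE_RULE.fullmatch(text)),
          "grammar:", bool(GRAMMAR_RULE.fullmatch(text)))
```

Under these assumptions the prose rule rejects ".class" while the grammar rule accepts it; both accept "p.class".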

"A simple selector is either a type selector, universal selector, attribute 
selector, ID selector, content selector, or pseudo-class."

This is a redefinition of the CSS2 and CSS1 term "simple selector", as 
previously mentioned [1].  If the Selectors module instead used one of 
the terms "selector particle", "selector atom", or "simple selector 
fragment", we could retain the standing definition of "simple selector".

"Combinators are: whitespace, '>', '+' and '~'."

The formal grammar (the 'combinator' production) indicates that the 
descendant combinator may be a series of one or more comments with 
no whitespace.  If the grammar is in error, correct it.

"The elements of the document tree represented by a selector are called 
subjects of the selector."

I still fail to understand why "represented" is more appropriate than 
"matched".  My request for an illustrative case [3] went unanswered, so I 
repeat the request.

"the subjects of a selector are always a subset of the elements 
represented by the rightmost sequence of simple selectors."

Again, I object to the term "rightmost", which should be "last" to 
eliminate indication of a particular visual presentation.  Since this 
wording was retained after Daniel Glazman committed to raising the 
issue with the Working Group [2], I assume that the Working Group 
explicitly moved to retain the wording.  Can somebody confirm or deny 
my assumption?



6.3.4 Default attribute values in DTDs

"Attribute selectors represent explicitly set attribute values in the 
document tree."

This seems to me a change from CSS2; is it?  If so, a note of it should 
go in section 1.1.



6.4 Class selectors

"The attribute value must immediately follow the "period" (.)."

This is a change from CSS2, which permitted comments to appear 
between the period and the class name.  Is the change intentional?

"p.pastoral.marine
It is fully identical to:
p.marine.pastoral"

Change "identical" to "equivalent".
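
The distinction can be illustrated with a short Python sketch.  The helper name 'parse' is hypothetical, and the sketch assumes a selector made only of a type name followed by class selectors.

```python
def parse(selector):
    """Split a selector such as 'p.a.b' into (type name, set of classes)."""
    type_name, *classes = selector.split(".")
    return type_name, frozenset(classes)

first = "p.pastoral.marine"
second = "p.marine.pastoral"

print(first == second)                # compares the selector texts
print(parse(first) == parse(second))  # compares what the selectors require
```

The two texts differ, so they are not identical, yet they impose the same requirements on an element, so they are equivalent.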



The user action pseudo-classes :hover, :active, and :focus

"Selectors provides three pseudo-classes for the selection of an element 
the user is acting on."

Change "the user is acting on" to "on which the user is acting".

"The :hover pseudo-class applies while the user designates an element 
(with some pointing device), but does not activate it."

The stipulation that ':hover' does not apply during activation is a change 
from CSS2.  Can somebody explain the reason for the change?



6.6.2 The target pseudo-class :target

Change all five occurrences of "URI" to "URI reference".

In any case, discussion of URIs is beyond the scope of the Selectors 
module and should not be included.  One effect of discussing URIs is the 
apparent exclusion of IDREF links from recognition.  So drop the URI 
discussion and merely mention that particular elements may be the 
targets of an inbound link.



6.6.6 Content pseudo-class

"The argument of this pseudo-class can be a string (surrounded by 
double quotes) or a keyword."

Change "keyword", which implies one of a limited, enumerated set, to 
"identifier".



7. Pseudo-elements

"Note: this :: notation is introduced by the current document in order to 
establish a discrimination between pseudo-classes and pseudo-
elements."

Why is the discrimination necessary?  If a user agent recognizes the 
keyword, the user agent knows whether the keyword denotes a 
pseudo-class or a pseudo-element.  If a user agent fails to recognize the 
keyword, the selector is invalid and the user agent must ignore it.

"Pseudo-elements may only appear once in the sequence of simple 
selectors that represents the subjects of the selector."

Change to "Only one pseudo-element may appear in a selector and 
then only at the end of the selector."



7.1 The ::first-line pseudo-element

"The ::first-line pseudo-element can only be attached to a block-level 
element."

Add "However, a selector containing the ::first-line pseudo-element 
selector is valid even if the subject element is not block-level."



8.1 Descendant combinator

"A selector of the form "A B" represents an element B that is an arbitrary 
descendant of some ancestor element A."

Change "an arbitrary descendant" to "a descendant at arbitrary depth".

"The following selector:
div * p
represents a p element that is a grandchild or later descendant of a div 
element."

Change "later" to "deeper".
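
As an illustration of "grandchild or deeper", here is a rough Python sketch that walks a small tree by hand.  The sample document and the function name 'match_div_star_p' are hypothetical, and the sketch makes no attempt to be a general selector engine.

```python
import xml.etree.ElementTree as ET

DOC = """
<div>
  <p id="child"/>
  <section>
    <p id="grandchild"/>
    <span><p id="deeper"/></span>
  </section>
</div>
"""

def match_div_star_p(root):
    """Return ids of p elements lying at least two levels below a div."""
    found = []
    for div in root.iter("div"):
        for child in div:                 # plays the role of "*"
            for p in child.iter("p"):     # child itself or any descendant
                if p is not child:        # "p" must lie below "*"
                    found.append(p.get("id"))
    return found

print(match_div_star_p(ET.fromstring(DOC)))
```

The p element that is a direct child of the div is not reported; the two p elements at greater depth are.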



9. Calculating a selector's specificity

"Concatenating the three numbers a-b-c (in a number system with a 
large base) gives the specificity."

Why is the Working Group retaining a wording that has proven its power 
to confuse people [4]?  There is no need to concatenate numbers or 
deal with queer numerical bases if we just note specificity as a triplet, (a, 
b, c).
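
A small Python sketch shows why the triplet notation is the safer one.  The function name 'naive_concat' is hypothetical; it models one assumed misreading of the quoted sentence, namely concatenating the counts as decimal digits.

```python
def naive_concat(triplet):
    """Concatenate the three counts as decimal digits (an assumed misreading)."""
    return int("".join(str(n) for n in triplet))

one_id = (1, 0, 0)           # e.g. a selector with one ID selector
eleven_classes = (0, 11, 0)  # e.g. a selector with eleven class selectors

# Triplets compare component by component, left to right.
print(one_id > eleven_classes)

# Decimal concatenation yields 100 versus 110 and reverses the order.
print(naive_concat(one_id) > naive_concat(eleven_classes))
```

Comparing the triplets directly requires no choice of base, so the single ID selector outranks the eleven class selectors however large the counts grow.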



10.1 Grammar

"selectors_group
  : selector [ ',' S* selector ]*"

This production and its associated productions forbid whitespace 
between a selector and a following comma.  I have previously 
mentioned this point [1].
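
The effect can be sketched in Python with regular expressions.  The sketch assumes, as stated above, that no associated production absorbs whitespace after a selector; the names QUOTED and LENIENT are hypothetical, and a bare word stands in for a full selector.

```python
import re

S = r"[ \t\r\n\f]"
SELECTOR = r"[a-z0-9]+"   # assumed stand-in for a full selector

# As quoted: selector [ ',' S* selector ]*
QUOTED = re.compile(rf"{SELECTOR}(?:,{S}*{SELECTOR})*")

# Variant in which each selector may be followed by whitespace.
LENIENT = re.compile(rf"{SELECTOR}{S}*(?:,{S}*{SELECTOR}{S}*)*")

for text in ("h1, h2", "h1 , h2"):
    print(repr(text),
          "quoted:", bool(QUOTED.fullmatch(text)),
          "lenient:", bool(LENIENT.fullmatch(text)))
```

Under these assumptions the quoted form accepts "h1, h2" but rejects "h1 , h2", while the lenient variant accepts both.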

"/* sequence ; only pseudo-element may occur */"

Change to "/* sequence; only one pseudo-element may occur */".

"simple_selector_sequence
  /* the universal selector is optional */
  : [ type_selector | universal ]?
        [ HASH | class | attrib | pseudo_class | negation ]+"

There is no 'negation' production.  I have previously mentioned this 
point [1].



10.2 Lexical scanner

"The two occurrences of "\377" represent the highest character number 
that current versions of Flex can deal with (decimal 255)."

English: change "deal with" to "handle".

"They should be read as "\4177777" (decimal 1114111), which is the 
highest possible code point in Unicode/ISO-10646."

Change "Unicode/ISO-10646" to "Unicode".  The highest code point in 
ISO 10646 is decimal 2147483647, which is octal 17777777777, which 
is hexadecimal 7FFFFFFF.
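
The figures above can be checked with a few lines of Python; the variable names are merely illustrative.

```python
# Octal literals as they appear in the text, converted to integers.
flex_max = int("377", 8)
unicode_max = int("4177777", 8)
iso_10646_max = int("17777777777", 8)

print(flex_max)
print(unicode_max, hex(unicode_max))
print(iso_10646_max, hex(iso_10646_max))
```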

"(return PREFIXMATCH;)"
"(return SUFFIXMATCH;)"
"(return SUBSTRINGMATCH;)"

The parentheses should be curly braces.  I have previously mentioned 
this point [1].

"{integer]"

The square bracket should be a curly brace.  I have previously 
mentioned this point [1].

How would the lexer ever return a 'SIGNED_INTEGER' token starting 
with "-" or return an 'INTEGER' token?  The notation seems to show that 
'NUMBER' tokens would eliminate such possibilities.


I have rewritten the grammar and lexer to be more explicit, more 
consistent, and more correct.  Both follow.

selectors_group
  : selector [ ',' b* selector ]*
  ;

b
  : [ S | C ]
  ;

selector
  : [ simple_selector_sequence combinator ]*
       simple_selector_sequence [ pseudo_element ]? b*
  ;

combinator
  : b* [ '+' | '>' | '~' | S ] b*
  ;

simple_selector_sequence
  : [ [ type_selector | universal ] C* ]?
    [ [ HASH | class | attrib | negation | pseudo_class ]
      C* ]+ |
    [ type_selector | universal ] C*
  ;

type_selector
  : [ namespace_prefix ]? element_name
  ;

namespace_prefix
  : [ [ IDENT | '*' ] C* ]? '|'  C*
  ;

element_name
  : IDENT
  ;

universal
  : [ namespace_prefix ]? '*'
  ;

class
  : '.' IDENT
  ;

attrib
  : '[' b* [ namespace_prefix ]? IDENT b*
        [ [ PREFIXMATCH |
            SUFFIXMATCH |
            SUBSTRINGMATCH |
            '=' |
            INCLUDES |
            DASHMATCH ] b* [ IDENT | STRING ] b*
        ]? ']'
  ;

pseudo_class
  : ':' C* [ IDENT | functional_pseudo ]
  ;

functional_pseudo
  : FUNCTION b* [ IDENT | STRING |
                  number | positions ] b* ')'
  ;

number
  : unary_operator? [ NUMBER | WHOLE ]
  ;

unary_operator
  : '+' | '-'
  ;

positions
  :  unary_operator? 
     [ MULTIPLE [ b* unary_operator b* WHOLE ] |
       'n' | WHOLE ]
  ;

negation
  : 'not(' b* negation_arg b* ')'
  ;

negation_arg
  : type_selector | universal | HASH | class | attrib | pseudo_class
  ;

pseudo_element
  : [ ':' ]? ':' IDENT
  ;



%option case-insensitive

h                       [0-9a-f]
nonascii                [\200-\377]
unicode                 \\{h}{1,6}(\r\n|[ \t\r\n\f])?
escape                  {unicode}|\\[ -~\200-\377]
nmstart                 [a-z_]|{nonascii}|{escape}
nmchar                  [a-z0-9-_]|{nonascii}|{escape}
string1                 \"([\t !#$%&(-~]|\\{nl}|\'|{nonascii}|{escape})*\"
string2                 \'([\t !#$%&(-~]|\\{nl}|\"|{nonascii}|{escape})*\'

ident                   {nmstart}{nmchar}*
name                    {nmchar}+
whole                   [0-9]+
num                     [0-9]*"."[0-9]+
string                  {string1}|{string2}
nl                      \n|\r\n|\r|\f
%%

[ \t\r\n\f]+    {return S;}

\/\*[^*]*\*+([^/][^*]*\*+)*\/   {return C;}

"~="                    {return INCLUDES;}
"|="                    {return DASHMATCH;}
"^="                    {return PREFIXMATCH;}
"$="                    {return SUFFIXMATCH;}
"*="                    {return SUBSTRINGMATCH;}
{string}                {return STRING;}
{ident}                 {return IDENT;}
{ident}"("              {return FUNCTION;}
{whole}"n"              {return MULTIPLE;}
{whole}                 {return WHOLE;}
{num}                   {return NUMBER;}
"#"{name}               {return HASH;}

.                       {return *yytext;}



11. Namespaces and Down-Level Clients

"it is impossible to construct a CSS style sheet that will function properly 
against all elements in those documents, unless, the style sheet is 
written using a namespace URI syntax"

Eliminate the commas around "unless".




[1] Etan Wexler. "CSS3 selectors critique (WD-css3-selectors-
20010126)".
<mid:20010822210341.1BF1014F4F4@server11.safepages.com>,
<http://lists.w3.org/Archives/Public/www-style/2001Aug/0075.html>.

[2] Daniel Glazman. "Re: CSS3 selectors critique (WD-css3-selectors-
20010126)".
<mid:3B8B0545.5020404@netscape.com>,
<http://lists.w3.org/Archives/Public/www-style/2001Aug/0082.html>.

[3] Etan Wexler. "Re: CSS3 selectors critique (WD-css3-selectors-
20010126)".
<mid:20010830024825.F048E3C3B3@server10.safepages.com>,
<http://lists.w3.org/Archives/Public/www-style/2001Aug/0084.html>.

[4] Matthew Brealey.  "RichInStyle.com bug guide - Mozilla 5 & 
Netscape 6".
<http://www.richinstyle.com/bugs/mozilla.html#id4>.

-- 
Etan Wexler <mailto:ewexler@stickdog.com>

Received on Wednesday, 8 May 2002 04:00:37 UTC