Re: 'legal' / 'important' and cascading order

Wolfgang Rieger writes:
 > The following example shows a problem connected with the use of
 > 'legal' or 'important' and the cascading order:
 > 
 > Example:
 > 
 > Rule 1:
 > 
 > spec1 { prop1 : value1 !important;
 > 	prop2 : value2; }
 > Rule 2:
 > 
 > spec2 { prop1 : value1;
 > 	prop2 : value2 !important; }

This is (like you said) equivalent to:

    spec1 { prop1 : value1 !important }
    spec1 { prop2 : value2 }
    spec2 { prop1 : value1 }
    spec2 { prop2 : value2 !important }

The way I would implement it is to store all rules sorted (or rather:
hashed) on property:

    property  head  rest    specificity  weight      source  values
    -------------------------------------------------------------
    prop1     ..  .- spec1  ...          !important  author  value1
              ..  |  spec2  ...          normal      author  value1
                  |
    prop2     ..  `- spec1  ...          normal      author  value2
              ..     spec2  ...          !important  author  value2

The vertical lines suggest that to save space the selectors can have
shared pointers. The `head' is the final part of the selector (the
element on which the property is set). The `rest' is the rest of the
selector (the parts before the element).

For each element encountered in the HTML source and for each property
that is examined for that element, the list of 6-tuples (head, rest of
selector, specificity, weight, source, value) will be traversed:

for each 6-tuple:

   0. if the head is unequal to the current HTML element, skip it
   1. if the weight is lower than the best match so far, skip it
   2. if the source is lower (reader < author) than the best so far, skip it
   3. if the specificity is lower than the best so far, skip it
   4. otherwise, match the rest of the selector against the context
      if it matches, this becomes the best so far.

There isn't much you can do to speed this up. Steps 0, 1, 2, and 3 are
simple integer comparisons; 4 is the expensive part. Sorting by
weight, source and specificity may help avoid unnecessary executions
of step 4, but could easily cost more than it saves. (Unless you
expect that selectors will be long. We expect most of them to be
short: one or two simple selectors.)
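The loop above can be sketched in Python (a toy model: the names and
the context matcher are hypothetical, and a real implementation would
use hashed C structures rather than lists and tuples):

```python
# Toy model of the per-property list of 6-tuples and the matching loop.
# Each entry: (head, rest, specificity, weight, source, value), where
# weight is 0 for normal / 1 for !important, source is 0 = reader / 1 = author.

def ancestor_match(rest, context):
    """Hypothetical step 4: every part of `rest` must occur among ancestors."""
    return all(part in context for part in rest)

def find_value(entries, element, context, matches=ancestor_match):
    """Traverse one property's entries; return the value of the best match."""
    best = None                 # (weight, source, specificity) of best so far
    best_value = None
    for head, rest, specificity, weight, source, value in entries:
        if head != element:                     # step 0: wrong element
            continue
        key = (weight, source, specificity)
        if best is not None and key < best:     # steps 1-3: integer compares
            continue
        if matches(rest, context):              # step 4: the expensive part
            best, best_value = key, value
    return best_value

# Two rules for one property: "H1 EM {...}" and a plain "EM {...}".
entries = [
    ("EM", ("H1",), 2, 0, 1, "red"),
    ("EM", (), 1, 0, 1, "blue"),
]
```

Ties in (weight, source, specificity) fall through to step 4, so a
later matching rule overrides an earlier one, as the cascade requires.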

However, see below for another way to avoid doing 4 too often.

 > Assume that spec1 and spec2 both apply to the element under
 > consideration. Now, when looking for the value of prop1, rule1 takes
 > precedence over rule2, while when looking for the value of prop2,
 > rule2 takes precedence over rule1.

Correct.

 > To handle this, one could use a data structure where simple selectors
 > are paired with declaration lists (i.e. groups of property-value
 > pairs). In this case, (in effect) one would have to sort lists of such
 > structures with matching simple selectors each time one is looking for
 > a property.

I assume you mean `selector' rather than `simple selector'? According
to the grammar, the expression

    H1 EM, DIV P B, STRONG, H5 {...}

contains 4 selectors, with 2 + 3 + 1 + 1 = 7 simple selectors.
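A toy count along those lines (splitting only on commas and
whitespace; a real CSS1 tokenizer would also handle classes, IDs and
pseudo-elements):

```python
def count_selectors(group):
    """Count (selectors, simple selectors) in a comma-separated group."""
    selectors = [sel.split() for sel in group.split(",")]
    return len(selectors), sum(len(sel) for sel in selectors)
```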

 > Or one could use simple selectors paired with property-value pairs.
 > This list would have to be sorted only once (according to the rules
 > given in section 3.2 of the draft). The first pair with matching
 > selector and property value would provide the property value.
 > 
 > However, there are two drawbacks with the second approach:
 > 
 > 1. To find the value for a given property, one would have to match a
 > list with all property-value pairs and matching selectors.
 > 2. If there is a rule with 10 simple selectors and 20 property-value
 > pairs, this rule alone produces a list with 200 items.

I guess you refer to a rule like:

    spec1, spec2, spec3, ..., spec10 {
        prop1: value1;
        prop2: value2;
        ...
        prop20: value20
    }

This would indeed expand into these 200 rules:

    spec1 { prop1: value1 }    /* 1 */
    spec1 { prop2: value2 }    /* 2 */
    ...
    spec1 { prop20: value20 }  /* 20 */
    spec2 { prop1: value1 }    /* 21 */
    spec2 { prop2: value2 }    /* 22 */
    ...
    spec2 { prop20: value20 }  /* 40 */
    ...
    spec10 { prop20: value20 } /* 200 */

But if the data structure is able to share the memory for the
selectors, the abbreviated rule takes much less space, while still
being equivalent: each of the 10 selectors is stored only once, versus
10 times in the expanded case.
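One way to picture that sharing (a hypothetical Python sketch): each
selector is parsed into one object, and the 200 expanded rules merely
point at the 10 shared objects instead of holding private copies:

```python
def expand(selectors, declarations):
    """Expand a grouped rule into (selector, property, value) triples.

    Each selector is parsed exactly once; all expanded rules reference
    the resulting shared objects.
    """
    shared = [tuple(sel.split()) for sel in selectors]   # parsed once each
    return [(sel, prop, value)
            for sel in shared
            for prop, value in declarations]

# The 10-selector, 20-declaration rule from the example above.
rules = expand(["spec%d" % i for i in range(1, 11)],
               [("prop%d" % i, "value%d" % i) for i in range(1, 21)])
```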

Moreover, using the ability of the parser and the data structures to
share the selectors, one could also share the results of matching the
selectors (step 4 above): if a selector matched or failed to match,
the result is stored in the selector itself. The next time step 4 is
executed, one can see that the selector has already been matched and
use the result immediately.

Of course, when starting to process the next element from the HTML
source, all stored results will have to be invalidated. The easiest
way is to store a sequence number along with the result. The sequence
number would be incremented when a new HTML element is processed.

The data structure would be expanded to:

    property head rest specificity weight source values result seqno
    ----------------------------------------------------------------
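A sketch of that memoization (names hypothetical; in a real
implementation the two extra fields would live in the shared selector
struct, and the matcher would be the real step 4):

```python
class Selector:
    """A shared 'rest of selector' carrying a cached match result."""
    def __init__(self, parts):
        self.parts = parts      # e.g. ("DIV", "P") for the rest of "DIV P B"
        self.cached = None      # result of the last match
        self.seqno = -1         # sequence number the cache is valid for

def match(selector, context, seqno):
    """Step 4 with memoization; `seqno` is bumped per HTML element."""
    if selector.seqno == seqno:
        return selector.cached              # already matched for this element
    result = all(p in context for p in selector.parts)  # toy matcher
    selector.cached, selector.seqno = result, seqno
    return result
```

Because the cache key is a single integer, invalidation is one
increment when the next element is processed; nothing needs to be
cleared.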

 > Finally, one could have a list of property-value-pairs with
 > corresponding simple selectors. This list could be sorted once only,
 > too. But matching the selectors would be relatively slow. This may
 > be no problem when formatting small HTML documents, but when
 > formatting large SGML documents this could be a performance issue.
 > 
 > I do not know what others think about this (and I would like to
 > learn). Maybe I'm just dumb and don't see the obvious.

There may well be more efficient solutions than the one I
described. I think hashing (of property names and HTML elements) is
important, but I don't see much advantage in sorting anything.

Most of the properties will be inherited, rather than set
explicitly. Of the properties that are set explicitly, most will be
set with a one-part selector, which doesn't require step 4. At least
that is what we expect.

We are discussing our own implementation, but I would vote that we
wait a while before implementing the shared pointers. In theory it
should speed things up, but is it worth the extra complexity?


 > However, IMHO there is no big use in being able to specify weight on
 > the declaration level. As far as I can see, it would be sufficient to
 > specify weight on the rule level. In this case, one would have instead
 > of the example given above the following rules:
 > 
 > spec1 { prop1 : value1; !important }
 > spec1 { prop2 : value2; }
 > spec2 { prop1 : value1; }
 > spec2 { prop2 : value2; !important }

This is a different matter and independent of any processing
algorithm. If you use !important a lot, it makes sense to devise a
short-hand to apply it to a whole collection of rules at a time. If
!important is uncommon, it either doesn't matter, or it is easier to
specify it for each property individually.

We think !important will not be used a lot. It was introduced as the
answer for some specific situations: a legal requirement that text be
in a certain size, a color-blind reader putting !important colors in
his local default style.

Usually, most people will happily accept what the author has put in
the style sheet (or they will turn off the style sheet altogether...)
The user's local style sheet is unlikely to contain !important, so
there is little reason for the author to put it in.


Bert
-- 
  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/pub/WWW/People/Bos/                      INRIA/W3C
  bert@w3.org                             2004 Rt des Lucioles / BP 93
  +33 93 65 77 71                 06902 Sophia Antipolis Cedex, France

Received on Monday, 22 April 1996 17:57:50 UTC