[CSS21] Proposal for a replacement for section 17.2.1 (table anonymous objects)

I've recently been working on Gecko's support for section 17.2.1 of 
CSS2.1 (table anonymous objects), and I've run into a number of cases 
where behavior is underdefined, or not defined the way browsers actually 
implement it.  I've posted about some of these issues in the past.

I've decided to write up a proposal based on what I've ended up 
implementing in Gecko.

The technical issues I'm trying to address are the following:

1) Various underdefined behavior wrt whitespace handling,
    definitions of adjacent, and so forth.
2) The following testcase is interoperably rendered by IE8,
    Gecko, Webkit, and Presto in a way that does not match the
    current spec:

     <!DOCTYPE html>
     <body>
       <div style="display: table">
         <div style="display: table-cell">1</div>
         <div>2</div>
         <div style="display: table-cell">3</div>
       </div>
     </body>

    (per spec there should be three rows in that table,
    while implementations only have one).  My proposal aligns
    with implementations here.
3) Handling of out-of-flow children (whatever that means) of
    table-related boxes is not well defined.  Some experimentation
    shows that this testcase:

     <!DOCTYPE html>
     <body>
       <table>
         <tr>
           <td>One</td><td>Two</td><td>Three</td><td>Four</td>
         </tr>
         <tr>
           <td>1</td>
           <td style="position:absolute; top: 200px">2</td>
           <td style="position:absolute; top: 200px">3</td>
           <td>4</td>
         </tr>
       </table>
     </body>

    is rendered interoperably by IE8, Gecko, Presto, and Webkit
    in the following manner: the boxes containing "2" and "3"
    are positioned on top of each other 200px below the table, and
    the cell containing "4" is placed in the third column.  The
    behavior does not change if <span>s with CSS table display types
    are used in the testcase, and the above markup is more readable,
    so I used it for the example.  In IE7, the cell containing "4"
    is placed in the second column, not the third one.

    If, in the same testcase, I replace "position: absolute"
    with "float: left", then rendering is again interoperable in Webkit,
    Gecko, Presto, and IE8.  The "2" and "3" are both placed into the
    second column; the "4" is placed into the third column.  Again,
    changing to using <span>s with appropriate display values does not
    affect the behavior in terms of which things end up in which columns.
    Again, my proposal aligns with implementations on this issue.

With all that in mind, here is the proposal:

===================================================================
Definitions:

1) A non-replaced box is called "table-related-inside" if its value
    of the "display" property is one of 'table', 'inline-table',
    'table-row-group', 'table-header-group', 'table-footer-group',
    or 'table-row'.
2) A box is called "table-related-outside" if its value of the "display"
    property is one of 'table-row-group', 'table-header-group',
    'table-footer-group', 'table-row', 'table-column-group',
    'table-column', 'table-cell', or 'table-caption'.
3) Two boxes which have the same parent box and are both in-flow
    are "adjacent" if there are no boxes between them in the parent's
    in-flow list.  This is a little fuzzy, but the entire box tree
    definition is pretty fuzzy.  I can pin this down more in terms
    of Gecko's implementation, but that seems ... imperfect.  I
    welcome ideas on how to make this concept (which I think there's
    general agreement on amongst implementors) clearer.
4) A nonempty set of in-flow boxes is called "consecutive" if all
    elements of the set have the same parent, and if no box not in
    the set is adjacent to two different boxes in the set.
    The parent of a consecutive set of boxes is well-defined: it is
    the parent of any box in the set.
5) For a consecutive set of boxes S, we say "wrap set S in box B"
    to mean that box B is placed in the in-flow child list of the
    parent of S immediately before the first box belonging to S and
    then all boxes in S are removed from their parent's in-flow child
    list and inserted into B's in-flow child list.  This operation
    preserves the order of the boxes in S.
6) A box is a "valid table child" if it is a 'table-row-group',
    'table-header-group', 'table-footer-group', 'table-caption',
    'table-column-group', 'table-column', or 'table-row' box.
7) A box is "discardable" if it is generated by a Text node that
    contains only whitespace.  The value of the 'white-space'
    property does not affect whether a box is discardable.
8) To "discard" a box means to remove it from the box tree.  It, and any
    child boxes it might have, are not rendered.
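
To make definitions 4, 5 and 7 a bit more concrete, here is a minimal
Python sketch.  It is illustrative only and is not Gecko's code: the
`Box` class, the helper names, and the exact set of characters treated
as whitespace are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Assumption: "whitespace" means the CSS 2.1 white space characters.
CSS_WHITESPACE = " \t\n\r\f"


@dataclass
class Box:
    """Hypothetical box-tree node; `children` is the in-flow child list."""
    display: str
    text: Optional[str] = None  # set only for boxes generated by Text nodes
    children: List["Box"] = field(default_factory=list)


def is_discardable(box: Box) -> bool:
    # Definition 7: generated by a Text node that contains only whitespace.
    # The 'white-space' property is deliberately not consulted.
    return box.text is not None and all(ch in CSS_WHITESPACE for ch in box.text)


def maximal_run(parent: Box, index: int, pred: Callable[[Box], bool]) -> range:
    """Indices of the maximal consecutive set (definition 4) that contains
    parent.children[index] and whose members all satisfy pred."""
    kids = parent.children
    assert pred(kids[index])
    start = index
    while start > 0 and pred(kids[start - 1]):
        start -= 1
    stop = index + 1
    while stop < len(kids) and pred(kids[stop]):
        stop += 1
    return range(start, stop)


def wrap(parent: Box, run: range, wrapper: Box) -> Box:
    """Definition 5: put wrapper where the run starts and move the run's
    boxes into it, preserving their order."""
    wrapper.children.extend(parent.children[run.start:run.stop])
    parent.children[run.start:run.stop] = [wrapper]
    return wrapper


# Usage: a row whose middle two children are not cells.
row = Box("table-row", children=[
    Box("table-cell"),
    Box("inline", text=" "),
    Box("block"),
    Box("table-cell"),
])
run = maximal_run(row, 2, lambda b: b.display != "table-cell")
anon_cell = wrap(row, run, Box("table-cell"))
```

Treating the in-flow child list as an ordered sequence sidesteps the
fuzziness in definition 3: two boxes are adjacent exactly when their
indices differ by one.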

Anonymous table objects are generated according to the following rules:

1) For every out-of-flow box T which would be a child of a
    table-related-inside box P if T were in-flow, insert an inline box
    with width and height 0 in the child list of P at the position
    where T would appear if it were in-flow.  If this inline box is
    ever discarded, also discard the out-of-flow box T.  The inline box
    is called the "placeholder" of T.
2) Discard all child boxes of 'table-column' boxes.
3) If a box T is not a 'table-column' box and has a parent P which is a
    'table-column-group' box, discard T.
4) If a box P that is not table-related-inside has an in-flow child box
    T that is table-related-outside, generate a box P'.  P' must be an
    'inline-table' box if P is 'inline' and a 'table' box otherwise.
    Take the maximal consecutive set of boxes which contains T and
    consists only of boxes that are table-related-outside and wrap this
    set in P'.
5) If a box P is a 'table' or 'inline-table' box and has a child T
    which is not a valid table child, generate a 'table-row' box P'.
    Take the maximal consecutive set of boxes which contains T and
    consists only of boxes that are not valid table child boxes, and
    wrap this set in P'.  If P' now has only one child and this child
    is discardable, discard P'.
6) If a box P is a 'table-row-group', 'table-header-group', or
    'table-footer-group' box and contains a child T which is not a
    'table-row' box, generate a 'table-row' box P'.  Take the maximal
    consecutive set of boxes which contains T and consists only of
    boxes that are not 'table-row' boxes, and wrap this set in P'.
    If P' now has only one child and this child is discardable,
    discard P'.
7) If a box P is a 'table-row' box and contains a child T which is
    not a 'table-cell' box, generate a 'table-cell' box P'.  Take
    the maximal consecutive set of boxes which contains T and
    consists only of boxes that are not 'table-cell' boxes, and
    wrap this set in P'. If P' now has only one child and this
    child is discardable, discard P'.  Otherwise, if P' now has an
    in-flow child box which is the placeholder for a box S and S
    has a computed value of 'float' other than 'none', then P'
    becomes the containing block of S.
8) Continue applying rules 4-7 until a pass through all four
    rules yields no changes to the box tree.

===================================================================
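
As a rough illustration of how rules 1 and 7 interact, the following
Python sketch models the second row of the testcase in issue 3 with
"float: left" on the middle two cells.  The `Box` class and function
names are hypothetical, floated cells are modelled as out-of-flow
'block' boxes, and whitespace text boxes are reduced to a flag; none of
this is taken from an actual implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Box:
    """Hypothetical box; `children` is the in-flow child list."""
    display: str
    label: str = ""
    out_of_flow: bool = False            # floated or absolutely positioned
    whitespace: bool = False             # whitespace-only text box (discardable)
    placeholder_for: Optional["Box"] = None
    children: List["Box"] = field(default_factory=list)


def insert_placeholders(row: Box) -> None:
    # Rule 1: each out-of-flow child is represented in the child list by
    # an inline placeholder at the position it would occupy if in-flow.
    row.children = [
        Box("inline", placeholder_for=c) if c.out_of_flow else c
        for c in row.children
    ]


def wrap_non_cells(row: Box) -> None:
    # Rule 7: wrap each maximal run of non-'table-cell' children in an
    # anonymous cell; drop the cell if its only child is discardable.
    result: List[Box] = []
    run: List[Box] = []

    def flush() -> None:
        if run and not (len(run) == 1 and run[0].whitespace):
            result.append(Box("table-cell", label="anon", children=list(run)))
        run.clear()

    for child in row.children:
        if child.display == "table-cell":
            flush()
            result.append(child)
        else:
            run.append(child)
    flush()
    row.children = result


def ws() -> Box:
    return Box("inline", whitespace=True)


row = Box("table-row", children=[
    ws(), Box("table-cell", "1"), ws(),
    Box("block", "2", out_of_flow=True), ws(),
    Box("block", "3", out_of_flow=True), ws(),
    Box("table-cell", "4"), ws(),
])
insert_placeholders(row)
wrap_non_cells(row)
columns = [cell.label for cell in row.children]
floats_in_second_cell = [
    c.placeholder_for.label for c in row.children[1].children if c.placeholder_for
]
```

In this model the row ends up with three cells: "1", one anonymous
cell holding both placeholders, and "4".  That puts "2" and "3" in the
second column and "4" in the third, as described for the float case.
The same tree results when the middle boxes are absolutely positioned
instead, which is consistent with "4" landing in the third column there
too.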

Note that rules 4-7 can actually be applied in any order desired, or 
even concurrently, since each rule can only affect the child lists of a 
certain set of boxes and the four sets are disjoint.

An alternate stopping condition for rule 8 is to keep applying rules 4-7 
until there are no "impedance mismatches" or whatever you want to call 
situations with missing boxes; one would then need to define that term.
I think the stopping condition used in the proposal above is simpler 
to state and sufficiently well-defined; in practice it's pretty clear 
when one is done with this algorithm.
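
The stopping condition in rule 8 amounts to a fixed-point loop.  Below
is a minimal Python sketch of that loop over rules 5-7 only (rule 4 and
the placeholder handling are left out to keep it short), run on the
testcase from issue 2.  The `Box` class and all function names are
made up for the example and whitespace text boxes are reduced to a
flag; treat it as a sketch under those assumptions, not as a reference
implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List

VALID_TABLE_CHILD = {
    "table-row-group", "table-header-group", "table-footer-group",
    "table-caption", "table-column-group", "table-column", "table-row",
}
ROW_GROUPS = {"table-row-group", "table-header-group", "table-footer-group"}


@dataclass
class Box:
    """Hypothetical box; `children` is the in-flow child list."""
    display: str
    whitespace: bool = False  # whitespace-only text box (discardable)
    children: List["Box"] = field(default_factory=list)


def wrap_misfits(parent: Box, fits: Callable[[Box], bool], wrapper: str) -> bool:
    """Common shape of rules 5-7: wrap every maximal run of children that
    do not fit; returns True if the child list changed."""
    out: List[Box] = []
    run: List[Box] = []
    changed = False
    for child in parent.children + [None]:  # None marks the end of the list
        if child is not None and not fits(child):
            run.append(child)
            continue
        if run:
            changed = True
            # A wrapper whose only child is discardable is itself discarded.
            if not (len(run) == 1 and run[0].whitespace):
                out.append(Box(wrapper, children=run))
            run = []
        if child is not None:
            out.append(child)
    parent.children = out
    return changed


def apply_rules(box: Box) -> bool:
    """One pass over the subtree rooted at box."""
    if box.display in ("table", "inline-table"):        # rule 5
        changed = wrap_misfits(
            box, lambda b: b.display in VALID_TABLE_CHILD, "table-row")
    elif box.display in ROW_GROUPS:                     # rule 6
        changed = wrap_misfits(
            box, lambda b: b.display == "table-row", "table-row")
    elif box.display == "table-row":                    # rule 7
        changed = wrap_misfits(
            box, lambda b: b.display == "table-cell", "table-cell")
    else:
        changed = False
    for child in box.children:
        changed = apply_rules(child) or changed
    return changed


def fix_up(root: Box) -> None:
    # Rule 8: keep going until a full pass changes nothing.
    while apply_rules(root):
        pass


def ws() -> Box:
    return Box("inline", whitespace=True)


# <div style="display: table"> with a cell, a plain div, and a cell.
table = Box("table", children=[
    ws(), Box("table-cell"), ws(), Box("block"), ws(), Box("table-cell"), ws(),
])
fix_up(table)
rows = table.children
cells = rows[0].children
```

For this input the sketch ends with a single anonymous row containing
three cells, the middle one being the anonymous cell wrapped around the
plain div and the whitespace on either side of it.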

I have posted some tests for the above algorithm at 
<http://test.csswg.org/svn/submitted/css2.1/tables/>.  The tests do pass 
in current Gecko development builds; all other browsers I have tried 
(including Gecko before I started working on this, of course) fail at 
least some of these, especially the ones that test dynamic changes or 
whitespace handling.  The same tests are available as HTML files one can 
just load at 
<http://web.mit.edu/bzbarsky/www/testcases/table-anonymous-objects/>. 
Sadly, that domain is on IE8's "pretend to be IE7" list, so those wishing 
to try the tests in IE8 will have to copy them elsewhere.

I welcome any feedback on either the above proposal or the tests; I make 
no guarantees about either of them being bug-free, or about the tests 
correctly testing the rules above, though I did try to make sure of all 
of that.  I'm fairly certain the tests don't test the rules 
exhaustively, of course.

-Boris

Received on Saturday, 30 May 2009 07:00:29 UTC