[whatwg] Table integrity and conformance

On Oct 23, 2006, at 21:31, Ian Hickson wrote:

> I think it would be good to require table integrity. Specifically I
> think overlapping cells would be a MUST NOT.

Made it an error.

> I don't think there's a problem with missing table cells at the end
> of rows (i.e. a ragged table is fine).

Made it a warning if the width wasn't established by cols/colgroups.

> I think cells extending (via colspan/rowspan) into columns or rows that
> contain no cells other than extended cells should be at least a SHOULD
> NOT, maybe a MUST NOT.

Made it an error to have a row or column that doesn't have at least  
one cell beginning in it.

> I don't know whether we'll keep the requirement that TFOOTs be above
> TBODYs. COLGROUP span="" will be illegal if the COLGROUP has COLs.

Those are easy to handle in RELAX NG.

To make the table integrity checker applicable to HTML 4.01, the  
checker treats span="" on a COLGROUP that has COLs as only a warning,  
but the schema layer makes it an error for (X)HTML5.

Do you mean moving all TFOOTs after TBODYs, so that the HTML 4.01  
placement would be forbidden?

Currently, the RELAX NG-enforced content model for TABLE is:
		(	caption.elem?
		,	( colgroup.elem+ | col.elem+)?
		,	(	( thead.elem?, tfoot.elem?, tbody.elem+ )
			|	( tr.elem+ )
			)
		)
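
To make the ordering constraints concrete, here is a rough Python sketch
that mirrors the content model above as a regular expression over child
element names. It is only an illustration, not how the RELAX NG validator
works; the names TABLE_MODEL and matches_table_model are made up for this
example.

```python
import re

# Regex mirror of the content model quoted above. Each child element is
# encoded as its name followed by a single space, so "col " cannot
# accidentally match the start of "colgroup ".
TABLE_MODEL = re.compile(
    r"(caption )?"
    r"((colgroup )+|(col )+)?"
    r"(((thead )?(tfoot )?(tbody )+)|(tr )+)"
)


def matches_table_model(children):
    """Return True if the list of child element names fits the model."""
    encoded = "".join(name + " " for name in children)
    return TABLE_MODEL.fullmatch(encoded) is not None
```

With this sketch, ["thead", "tfoot", "tbody"] is accepted, while
["thead", "tbody", "tfoot"] and ["colgroup", "col", "tr"] are rejected.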

The table integrity checker itself doesn't care whether colgroups and  
cols appear side-by-side, whether row groups have any rows, what the  
order of thead/tfoot/tbody is, or whether explicit and implicit row  
groups are mixed.

(Moreover, the table integrity checker only sees a projection of the  
document tree that contains nothing but table-significant elements.  
Crazy subtrees of table-significant elements in wrong places are  
silently pruned, so the checker needs a sane schema to keep random  
stuff out of places where it would bother browsers.)
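
One way to picture such a projection is the Python sketch below. The
parent-to-children map ALLOWED and the (name, children) tuple
representation are assumptions made for this example; the checker's actual
pruning rules are not spelled out in this mail.

```python
# Hypothetical map of which table-significant children are kept under
# which parent. The real checker's rules may differ.
ALLOWED = {
    "table": {"caption", "colgroup", "col", "thead", "tfoot", "tbody", "tr"},
    "colgroup": {"col"},
    "thead": {"tr"},
    "tfoot": {"tr"},
    "tbody": {"tr"},
    "tr": {"td", "th"},
}


def project(node):
    """Return a pruned copy of a (name, children) tree.

    Only children listed in ALLOWED for the parent are kept; anything
    else is dropped together with its whole subtree. Cell contents are
    not descended into in this sketch.
    """
    name, children = node
    allowed = ALLOWED.get(name, set())
    kept = [project(child) for child in children if child[0] in allowed]
    return (name, kept)
```

In this sketch a tr nested inside a stray div under table disappears
silently, which is why something else has to flag the div in the first
place.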

> COLs and COLGROUPs will probably have a SHOULD NOT requirement about
> spanning into columns with no actual cells.

Made it an error. Also made it an error to have rows that exceed the  
width established by column markup.

> And the SHOULD NOTs will probably be MUST NOTs.

I only have errors and warnings. If it makes the doc non-conforming,  
it is an error. If it doesn't make the doc non-conforming but the  
author may still be shooting him/herself in the foot, it is a warning.

> headers="" will have a MUST requirement to point to TH elements in the
> same table,

Made it an error if the headers attribute doesn't point to TH  
elements in the same table.
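
As a rough illustration of that check (Python, not the checker's actual
code), assume the ids of the th elements in a table have already been
collected. The function name bad_header_refs and its input shape are
invented for this example.

```python
def bad_header_refs(th_ids, cells):
    """Find headers="" references that do not resolve within the table.

    th_ids: set of id values of th elements in this table.
    cells:  list of (cell_label, headers_attribute_value) pairs.
    Returns a list of (cell_label, unresolved_id) pairs.
    """
    problems = []
    for label, headers in cells:
        # The headers attribute is a whitespace-separated list of ids.
        for ref in headers.split():
            if ref not in th_ids:
                problems.append((label, ref))
    return problems
```

An id that belongs to a td, to a th in another table, or to nothing at all
ends up in the returned list, since only th ids of the same table are in
th_ids.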

> and will probably only be allowed on TDs.

Adjusted in RELAX NG.

> scope="" will probably only be allowed for THs.

Adjusted in RELAX NG.

> Maybe it should be REQUIRED for THs that
> aren't in obvious locations (first row, first column, or whatever).

Didn't do anything about this yet. I think more discussion is  
needed, and more precision if this idea still stands after discussion.

> It might be interesting to have some sort of testing with the "axis"
> attribute too, or maybe we should drop it. (Indeed maybe we should drop
> some of the others, too.)

Didn't do anything about this.

Specifically, the errors are:
  * Table cell is overlapped by later table cell.
  * Table cell overlaps an earlier table cell. (Single overlap gets  
reported in both directions to show source location for both cells.)
  * Table cell spans past the end of its row group.
  * Row has no cells starting on it.
  * Table row column count is greater than the column count  
established by cols/colgroups.
  * Table row column count is less than the column count established  
by cols/colgroups.
  * The headers attribute doesn't point to th elements in the same  
table.
  * Column has no cells starting on it. (Contiguous cell ranges  
established by a single element are coalesced to a single error to  
protect against denial of service attacks.)
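
Several of these errors fall out of a simple slot grid. The Python sketch
below handles a single row group and is only an illustration under stated
assumptions: cells are given as (colspan, rowspan) pairs, a cell skips
slots already taken by cells spanning down from earlier rows, and an
overlap is reported once from the later cell's side. It does not reproduce
the checker's two-directional reporting or the coalescing of column ranges,
and check_row_group is a made-up name.

```python
def check_row_group(rows):
    """rows: list of rows; each row is a list of (colspan, rowspan) cells.

    Returns a list of error strings for one row group.
    """
    errors = []
    occupied = {}        # (x, y) slot -> (row, index) of the covering cell
    col_starts = set()   # columns in which at least one cell begins
    width = 0
    for y, row in enumerate(rows):
        if not row:
            errors.append(f"row {y}: no cells starting on it")
        x = 0
        for i, (colspan, rowspan) in enumerate(row):
            # Skip slots taken by cells spanning down from earlier rows.
            while (x, y) in occupied:
                x += 1
            if y + rowspan > len(rows):
                errors.append(f"row {y} cell {i}: spans past the end of its row group")
                rowspan = len(rows) - y  # clip the span (an assumption)
            col_starts.add(x)
            hit = set()
            for dy in range(rowspan):
                for dx in range(colspan):
                    slot = (x + dx, y + dy)
                    if slot in occupied:
                        hit.add(occupied[slot])
                    occupied[slot] = (y, i)
            for earlier in sorted(hit):
                errors.append(f"row {y} cell {i}: overlaps earlier cell {earlier}")
            x += colspan
            width = max(width, x)
    for cx in range(width):
        if cx not in col_starts:
            errors.append(f"column {cx}: no cells starting on it")
    return errors
```

For example, a first row with a plain cell followed by a rowspan=2 cell,
and a second row with a single colspan=2 cell, produces one overlap error
for the second-row cell.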

The warnings are:
  * colspan exceeds 1000, which is a magic number in Gecko (and,  
according to comments in the Gecko source, in IE and Opera too).
  * rowspan exceeds 8190, which is a magic number in Gecko.
  * Table row column count is greater than the column count  
established by the first row in the absence of cols/colgroups.
  * Table row column count is less than the column count established  
by the first row in the absence of cols/colgroups.
  * A col element causes a span attribute to be ignored on the parent  
colgroup.
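
The split between errors and warnings for row widths can be summarized in
a few lines of Python. This is a sketch with invented names
(span_warnings, row_width_message), not the checker's code. It assumes the
expected width comes from cols/colgroups when present and from the first
row otherwise, as described in the two lists above.

```python
COLSPAN_LIMIT = 1000  # magic number cited in the list above
ROWSPAN_LIMIT = 8190  # magic number cited in the list above


def span_warnings(colspan, rowspan):
    """Return warning strings for spans beyond the cited magic numbers."""
    warnings = []
    if colspan > COLSPAN_LIMIT:
        warnings.append(f"colspan exceeds {COLSPAN_LIMIT}")
    if rowspan > ROWSPAN_LIMIT:
        warnings.append(f"rowspan exceeds {ROWSPAN_LIMIT}")
    return warnings


def row_width_message(row_width, declared_width, first_row_width):
    """Classify a row whose column count differs from the expected one.

    declared_width: column count from cols/colgroups, or None if the
    table has no column markup. Returns (severity, direction) or None.
    """
    if declared_width is not None:
        expected, severity = declared_width, "error"
    else:
        expected, severity = first_row_width, "warning"
    if row_width > expected:
        return (severity, "greater")
    if row_width < expected:
        return (severity, "less")
    return None
```

So a three-column row in a table whose column markup declares four columns
is an error, while the same row in a table with no column markup and a
four-column first row is only a warning.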

Not deployed yet.

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/

Received on Thursday, 9 November 2006 03:18:46 UTC