Systematic approach to layout table heuristics

After filing bug 24679 [1], which suggests adding more features to the 
list of “possible indicators” [2] for layout tables vs non-layout 
tables, I was asked to provide data for the proposals. But how should 
one go about providing data?

Web data is what ultimately matters. But when identifying possible 
indicators, knowing which features AT actually use to discern between 
layout and non-layout tables would give us a good shortlist of 
candidate indicators. Secondly, if we can spot some trends in the 
(hopefully correct) data we already have, then that, too, could help 
us identify candidates.

Today, it appears possible to glean the following trends from the spec:

1) Conformance, completeness and semantics indicate non-layout
   usage. The spec lists borders via CSS or @border=1, <th>,
   <thead>, <caption>, @headers, @scope.
2) Non-conforming ways to disable borders (border=0, cellspacing=0,
   cellpadding=0) plus role=presentation indicate layout usage.
3) @summary is “not a good indicator” (this is probably based
   *both* on Web data *and* AT behavior analysis).
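
A minimal sketch of how trends 1 and 2 could be turned into a check, 
using Python's standard html.parser. The names TableIndicators and 
classify are hypothetical, the equal weighting of indicators is purely 
an assumption for illustration, and borders set via CSS are not 
detected at all. It is meant for a fragment containing a single table.

```python
from html.parser import HTMLParser

# Elements and attributes the spec lists as non-layout indicators (trend 1).
DATA_ELEMENTS = {"th", "thead", "caption"}
DATA_ATTRS = {"headers", "scope"}


class TableIndicators(HTMLParser):
    """Collects layout / non-layout indicators for top-level tables."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # nesting depth of <table> elements
        self.data = []      # indicators of non-layout usage
        self.layout = []    # indicators of layout usage

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "table":
            self.depth += 1
            if self.depth == 1:
                self._inspect_table(attrs)
            return
        if self.depth != 1:
            return  # skip content outside a table or inside nested tables
        if tag in DATA_ELEMENTS:
            self.data.append(f"<{tag}>")
        for name in sorted(DATA_ATTRS.intersection(attrs)):
            self.data.append(f"@{name}")

    def handle_endtag(self, tag):
        if tag == "table" and self.depth > 0:
            self.depth -= 1

    def _inspect_table(self, attrs):
        if attrs.get("border") == "1":
            self.data.append("@border=1")
        if attrs.get("role") == "presentation":
            self.layout.append("@role=presentation")
        # Non-conforming ways to disable borders (trend 2).
        for name in ("border", "cellspacing", "cellpadding"):
            if attrs.get(name) == "0":
                self.layout.append(f"@{name}=0")


def classify(html):
    """Naive majority vote; every indicator counts equally (an assumption)."""
    parser = TableIndicators()
    parser.feed(html)
    parser.close()
    if len(parser.data) > len(parser.layout):
        return "non-layout"
    if len(parser.layout) > len(parser.data):
        return "layout"
    return "undecided"
```

A table with no indicators at all comes out as "undecided", which is 
exactly the case where an AT has to fall back on other heuristics.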

   Trends & AT data applied to: table@border

To me, HTML5’s table of “possible indicators” of (non-)layout usage 
seems correct, but not complete. However, in bug 24678 [3], Steve asks 
for data about *one* non-layout indicator, namely border=1. But which 
data? Web data? Assistive technology data? The fact that VoiceOver + 
Safari matches the spec w.r.t. borders as a non-layout indication ought 
to put it on the shortlist of non-layout features. In another, somewhat 
related, bug, Steve indicated that web data did not support that 
table@border=1 indicates data tables [4]. My own toe-dipping into the 
same pool told me the opposite.

   Trends & AT data applied to: <colgroup> with <col/> children.

It seems in line with the conformance & completeness trend that 
VoiceOver+Safari treats <colgroup> with <col/> children[*] as an 
indicator of non-layout usage. What do other ATs do? What does Web data 
say? Should the spec change accordingly? (PS: I know one HTML generator 
which, for layout tables in XHTML1/HTML4, uses cellpadding=0 
cellspacing=0, but which for HTML5 removes cellpadding/cellspacing and 
*adds* <colgroup> with children, with bad results in VoiceOver as a 
consequence.)
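
To gather Web data on this feature, one could start from something 
like the sketch below, which flags a <colgroup> that has at least one 
<col> child. The names ColgroupDetector and has_colgroup_with_cols are 
hypothetical. A bare <col> without a parent <colgroup> is deliberately 
not flagged, since that case was not checked in VoiceOver.

```python
from html.parser import HTMLParser


class ColgroupDetector(HTMLParser):
    """Flags a <colgroup> that has at least one <col> child."""

    def __init__(self):
        super().__init__()
        self.in_colgroup = False
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag == "colgroup":
            self.in_colgroup = True
        elif tag == "col":
            if self.in_colgroup:
                self.found = True
        else:
            # Any other element implies the colgroup has ended,
            # even if its end tag was omitted.
            self.in_colgroup = False

    def handle_endtag(self, tag):
        if tag == "colgroup":
            self.in_colgroup = False


def has_colgroup_with_cols(html):
    detector = ColgroupDetector()
    detector.feed(html)
    detector.close()
    return detector.found
```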

   Trends & AT data applied to: lack of cellpadding=0 cellspacing=0[*]

Is it risky to delete cellpadding=0 cellspacing=0 from a table? 
Could it cause an AT to treat a layout table as a non-layout table? 
This question seems relevant since authors are probably simply deleting 
these attributes without adding @role=presentation in their place.
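
One way to size the problem in Web data is to count tables that carry 
neither the zeroed cellpadding/cellspacing pair nor role=presentation. 
The sketch below does only that; the names are hypothetical, and it 
cannot tell whether such an unmarked table is in fact a layout table, 
which would still need manual review. The role value is compared as an 
exact string, which is a simplification.

```python
from html.parser import HTMLParser


class UnmarkedTableCounter(HTMLParser):
    """Counts tables that carry no explicit layout marker."""

    def __init__(self):
        super().__init__()
        self.total = 0      # all <table> elements seen
        self.unmarked = 0   # tables with neither marker

    def handle_starttag(self, tag, attrs):
        if tag != "table":
            return
        attrs = dict(attrs)
        self.total += 1
        zeroed = (attrs.get("cellpadding") == "0"
                  and attrs.get("cellspacing") == "0")
        presentational = attrs.get("role") == "presentation"
        if not zeroed and not presentational:
            self.unmarked += 1


def count_unmarked_tables(html):
    """Returns (unmarked, total) for the given markup."""
    counter = UnmarkedTableCounter()
    counter.feed(html)
    counter.close()
    return counter.unmarked, counter.total
```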

   Trends & AT data applied to: presence of sortable attribute

This is a good candidate based on its strong link to data tables - it 
is a semantic feature. But it is also possible to take the view that we 
need implementation and usage before putting it into the spec.

Finally: How important are heuristics for identifying data tables? Does 
the situation resemble how UAs detect quirks vs no-quirks? (Except 
in XHTML and @srcdoc documents (other exceptions?), no-quirks is 
triggered when UAs detect a DOCTYPE that meets certain conformance and 
completeness criteria.)

[1] https://www.w3.org/Bugs/Public/show_bug.cgi?id=24679

[2] http://www.w3.org/html/wg/drafts/html/master/tabular-data.html#the-table-element

[3] https://www.w3.org/Bugs/Public/show_bug.cgi?id=24678

[4] https://www.w3.org/Bugs/Public/show_bug.cgi?id=24647#c7

[*] I did not check omitting <col/> child or omitting parent <colgroup>
-- 
leif halvard silli

Received on Friday, 21 February 2014 10:20:05 UTC