Re: Heuristic Tests for Data Tables (Discussion)

James Graham wrote:
> 
> James Graham wrote:
>>
>> Ben 'Cerbera' Millard wrote:
>>> I wonder how many tables can be made natively accessible? How many 
>>> will need to be retrofitted by authors with <th> or scope="" or 
>>> headers="" and how likely is that? I guess more studying (like Philip 
>>> and I and others have done) and prototyping of implementations (like 
>>> James Graham might do)
>>
>> There's some very early work on this available at [1] (only the HTML4 
>> algorithms are currently implemented). Due to a bug in a html5lib 
>> serializer badness occurs when you give it a page containing more than 
>> one table. I also haven't checked that it's giving the correct results 
>> in almost any cases. However if you want to report bugs feel free.
> 
> This now also has some work on the algorithm from the HTML 5 spec, with 
> similar caveats as before i.e. it is hideously under tested.

I have now moved the table inspector to [1] and added an "experimental" option 
which is currently based on the HTML 4 algorithm with some improvements:

  * Optionally treats <td><strong> and <td><b> as headers.
  * Ignores headers with @scope set when looking for implicit headings (to see
    why this is important, look at the "Day 2" cell in [4]).
  * In rows or columns that consist only of headings, the other headings from
    that row or column are not applied to cells in that row or column.
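
To make the three tweaks concrete, here is a rough Python sketch. It is not
the inspector's actual code: the Cell class, is_heading() and
implicit_headers() are hypothetical names, the table is assumed to be a plain
grid with no rowspan/colspan, and the search simply looks at every cell to the
left of and above the target rather than following the HTML 4 algorithm's
stopping rules. Headers with @scope set are assumed to be associated by a
separate explicit step that is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cell:
    """Hypothetical, simplified model of one table cell (no spans)."""
    text: str
    tag: str = "td"              # "th" or "td"
    bold_only: bool = False      # True if the content is wrapped in <strong> or <b>
    scope: Optional[str] = None  # value of the scope attribute, if present


def is_heading(cell: Cell, bold_as_heading: bool = True) -> bool:
    # Tweak 1: optionally treat <td><strong> and <td><b> as headers.
    if cell.tag == "th":
        return True
    return bold_as_heading and cell.bold_only


def implicit_headers(grid: List[List[Cell]], row: int, col: int,
                     bold_as_heading: bool = True) -> List[str]:
    """Return the text of the implicit headers for grid[row][col]."""

    def heading(cell: Cell) -> bool:
        return is_heading(cell, bold_as_heading)

    def candidates(before: List[Cell], line: List[Cell]) -> List[str]:
        # Tweak 3: a row or column made up only of headings contributes
        # no headers to the cells inside it.
        if all(heading(c) for c in line):
            return []
        # Tweak 2: skip headers that carry an explicit scope attribute.
        return [c.text for c in before if heading(c) and c.scope is None]

    row_cells = grid[row]
    col_cells = [r[col] for r in grid]
    above = candidates(col_cells[:row], col_cells)
    left = candidates(row_cells[:col], row_cells)
    return above + left


# A small made-up table: a heading-only first row, a bold <td> acting as a
# row header, and a <th> with an explicit scope.
grid = [
    [Cell("", "th"), Cell("Mon", "th"), Cell("Tue", "th")],
    [Cell("AM", "td", bold_only=True), Cell("9"), Cell("10")],
    [Cell("PM", "th", scope="rowgroup"), Cell("1"), Cell("2")],
]
```

With this grid, implicit_headers(grid, 1, 1) picks up both "Mon" and the bold
"AM" cell, implicit_headers(grid, 2, 2) picks up only "Tue" because the scoped
"PM" header is skipped, and the "Tue" heading itself gets no headers from its
heading-only row.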

The code is available under the Apache License 2.0 at [2] (more specifically
[3]). Patches, suggestions for improvement, and test cases are welcome.

[1] http://james.html5.org/tables/table_inspector.html
[2] http://code.google.com/p/html5/
[3] http://html5.googlecode.com/svn/trunk/tables/
[4] http://annevankesteren.nl/2007/09/tmb-overview

-- 
"Eternity's a terrible thought. I mean, where's it all going to end?"
  -- Tom Stoppard, Rosencrantz and Guildenstern are Dead

Received on Tuesday, 4 September 2007 12:14:31 UTC