Re: Data Table Collections (Research)

From: James Graham <jg307@cam.ac.uk>
Date: Sat, 29 Sep 2007 19:41:32 +0100
Message-ID: <46FE9C5C.8010804@cam.ac.uk>
To: Ben 'Cerbera' Millard <cerbera@projectcerbera.com>
Cc: HTMLWG <public-html@w3.org>

Ben 'Cerbera' Millard wrote:
> 
> I've been a busy bee! More collecting, analysing and simulated 
> retrofitting:
> 
> <http://sitesurgeon.co.uk/tables/readme.html>

This is really interesting :)

> Dumping links to tables with a short list of observations is hopefully 
> helpful in the [need to categorise] trends and patterns in the way 
> authors are building tables. I intend to summarise my observations after 
> analysing more tables. Perhaps in time for the [November meeting] in 
> Boston which I'll attend.

Excellent. I've had a look at a random selection of the tables you 
produced in the table inspector; where the headings are marked up at 
all, the smart colspan algorithm seems to do a very good job of 
getting the column headers right, which is encouraging.

> I've taken a snapshot of the Wisconsin University budget tables and 
> done some simulated retrofits:
> 
> <http://sitesurgeon.co.uk/tables/finance/>
> 
> It seems they don't strictly need headers+id to express the 
> relationships. And their use of headers+id was broken: one bogus value 
> and several empty string values. There are notes about this in the main 
> readme, under the "Finance" section.

It is the prevalence of this sort of problem that makes me wonder if 
giving people the power of headers+id isn't just enabling authors to 
shoot themselves in the foot, accessibility-wise. Even if we do want to 
keep it, I think the spec should be clear that it is only to be used 
when no other solution suffices.
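
To make the failure mode concrete, here is a minimal sketch (not part of 
Ben's tooling) of a check for the two breakages mentioned: empty headers 
values and tokens that match no cell id. It uses Python's standard 
html.parser module. The names HeadersAudit and audit_headers are made up 
for illustration, and it assumes headers may only point at ids on <th> 
or <td> cells in the same document.

```python
from html.parser import HTMLParser


class HeadersAudit(HTMLParser):
    """Collect cell ids and raw headers="" values (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.cell_ids = set()   # ids found on <th>/<td> cells
        self.raw_headers = []   # every headers attribute value, as written

    def handle_starttag(self, tag, attrs):
        if tag not in ("td", "th"):
            return
        attrs = dict(attrs)
        if attrs.get("id"):
            self.cell_ids.add(attrs["id"])
        if "headers" in attrs:
            # A bare `headers` attribute is reported as None.
            self.raw_headers.append(attrs["headers"] or "")


def audit_headers(markup):
    """Return (number of empty headers values, sorted bogus tokens)."""
    parser = HeadersAudit()
    parser.feed(markup)
    parser.close()
    empty = sum(1 for raw in parser.raw_headers if not raw.split())
    bogus = sorted(
        {tok for raw in parser.raw_headers for tok in raw.split()
         if tok not in parser.cell_ids}
    )
    return empty, bogus
```

Running audit_headers over a table's markup gives a count of empty 
headers values and the list of tokens that point at nothing, which is 
roughly the kind of breakage described in the Finance notes.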

> The opening sentence is particularly interesting to me:
> 
> [[[
> One of the most common accessibility problems we find when conducting 
> the Better Connected survey is the lack of marked up headers in data 
> tables.
> ]]]

The example table is interesting there because the row headers are not 
in the first column of the table. I think that's a case where we will 
always need the author to do the right thing in the markup - i.e. to use 
<th> for headers - because we will be unable to get the right answer 
heuristically.

> This correlates with an [earlier finding] where I noticed most tables 
> don't use <th>.

One possibility is to make having at least one <th> a conformance 
requirement for tables. However, I think there might be use cases (other 
than layout tables) where this isn't appropriate. [1] was pointed out to 
me as an example of this; I don't think it's really tabular data, but 
I'm not sure what alternative markup you would use to present the same 
information.

> So if we can find ways for UAs to figure out which are the headers in 
> tables which aren't using <th>, that might be an instant win for 
> accessibility. It's a lot easier if authors use <th> and retrofit legacy 
> content, of course. But we can continue [evangelising] better authoring 
> practices whilst also giving UAs a headstart.

My current thinking on this is that we should do it for common, simple 
cases, but that trying to make extensive use of e.g. attributes or child 
elements of the <td> element is probably too error-prone to be worth 
specifying. However, this might be an area where it is OK to say UAs MAY 
associate additional headers with a cell using heuristic methods.

The lowest-hanging fruit I can see is tables where no headers are 
marked and the first n rows and m columns are headers. This case seems 
common enough to be worth specifying some handling for. The trick is to 
get the right values of n and m. Possibly we can look at the top left of 
the table: if the top-left cell has colspan > 1 or rowspan > 1, we 
set n=rowspan and m=colspan. Otherwise, we look for empty cells and set 
n=#empty rows and m=#empty columns. If there are no empty rows or 
columns, we assume n=m=1. However, this will not always get the right 
answer, e.g. with [2], where the P column would become a row header.
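
As a rough sketch of that heuristic, assuming a table is given as a list 
of rows of cells: the Cell type and guess_header_extent name are 
invented for illustration. "Empty rows/columns" is read here as the run 
of empty cells going down the first column (n) and across the first row 
(m), which is only one possible interpretation, and spans on cells other 
than the top-left one are ignored.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    text: str = ""
    colspan: int = 1
    rowspan: int = 1


def guess_header_extent(rows):
    """Guess (n, m): the number of header rows and header columns."""
    if not rows or not rows[0]:
        return (0, 0)

    corner = rows[0][0]
    # Case 1: a spanning top-left cell marks out the header block.
    if corner.colspan > 1 or corner.rowspan > 1:
        return (corner.rowspan, corner.colspan)

    # Case 2: count leading empty cells down the first column (n) ...
    n = 0
    for row in rows:
        if row and not row[0].text.strip():
            n += 1
        else:
            break
    # ... and across the first row (m).
    m = 0
    for cell in rows[0]:
        if not cell.text.strip():
            m += 1
        else:
            break

    # Case 3: no empty corner at all, so fall back to n = m = 1.
    if n == 0 or m == 0:
        return (1, 1)
    return (n, m)
```

For a table whose first row is an empty cell followed by "2006" and 
"2007", this returns (1, 1); a top-left cell with colspan=2 and 
rowspan=2 gives (2, 2). A table with no empty corner also falls back 
to (1, 1), even when its first column is not really a header, so the 
guess can still be wrong.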

[1] http://www.pointerklubben.se/stamtavla.asp?Id=S35236/97
[2] http://news.bbc.co.uk/sport1/hi/football/eng_prem/table/default.stm
-- 
"Mixed up signals
Bullet train
People snuffed out in the brutal rain"
--Conor Oberst
Received on Saturday, 29 September 2007 18:41:47 GMT