- From: Ben 'Cerbera' Millard <cerbera@projectcerbera.com>
- Date: Fri, 24 Aug 2007 19:20:42 +0100
- To: "Philip Taylor" <philip@zaynar.demon.co.uk>
- Cc: "HTMLWG" <public-html@w3.org>
Philip Taylor wrote:
>With the data I collected a while ago [1], [...] it seems as important to
>determine layout vs data for tables that do have <th> as much as for those
>that don't.

Indeed. And it seems the situation with <td>-only data tables is worse than I feared. Of the 33 tables I have [collected] from the web, only 13 used <th> at all. It was rare for row headers to use <th>; only 6 did that.

[collected] <http://sitesurgeon.co.uk/tables/>

All the sports tables I looked at on ESPN's website are <td>-only. A sample of them:

1. <http://sports.espn.go.com/mlb/stats/aggregate?statType=fielding&group=9>
Header cells with the same colspan="" value overwrite each other. The boldness of the heading text is applied from an external stylesheet via class="colhead" on the parent <tr>. So a simple heuristic like <td><b> = <th> wouldn't work here. At least there is a clear migration path: swap <td>s in <tr>s with class="colhead" to <th>s...but why didn't they just use <th> to start with?

2. <http://sports.espn.go.com/rpm/results?seriesId=8>
Borderline layout table.

3. <http://sports.espn.go.com/rpm/schedule?seriesId=1>
Borderline layout table.

4. <http://sports.espn.go.com/golf/players/profile?playerId=462>
Some cells seem to have too much information in them...not really a "cell" of data when there are several values about different things.

5. <http://sports.espn.go.com/nhl/boxscore?gameId=270519002>
A mix of layout effects around data cells at the start; bona fide layout tables; regular number-heavy data tables with spanned headers where simple format sniffing looks like it would work.

6. <http://sports.espn.go.com/golf/statistics?sort=officialAmount>
Quite regular data table with header overwriting similar to #1. First cell spans all columns and would be correct if implied as a <caption>. Basic format sniffing looks quite promising except for the "Player" column, where it would need to check for the presence of markup (name <a href>).
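The two header heuristics mentioned for ESPN can be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposed algorithm: the class list HEADER_ROW_CLASSES, the names HeaderRowSniffer and header_rows, and the sample markup are all made up for the example ("colhead" is the only class name taken from the page above), and nested tables are ignored.

```python
from html.parser import HTMLParser

# Assumption: class names that mark a header row. Only "colhead" comes from
# the ESPN example; a real tool would need a site-specific or learned list.
HEADER_ROW_CLASSES = {"colhead"}
BOLD_TAGS = {"b", "strong"}


class HeaderRowSniffer(HTMLParser):
    """Records, for each <tr>, what its cells look like (nested tables ignored)."""

    def __init__(self):
        super().__init__()
        self.rows = []        # one dict per <tr>, in document order
        self._row = None      # row currently open
        self._cell = None     # cell currently open
        self._bold_depth = 0  # how many <b>/<strong> are open in this cell

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "tr":
            classes = set((attrs.get("class") or "").split())
            self._row = {"by_class": bool(classes & HEADER_ROW_CLASSES), "cells": []}
            self.rows.append(self._row)
        elif tag in ("td", "th") and self._row is not None:
            self._cell = {"tag": tag, "bold_text": False, "plain_text": False}
            self._row["cells"].append(self._cell)
            self._bold_depth = 0
        elif tag in BOLD_TAGS and self._cell is not None:
            self._bold_depth += 1

    def handle_endtag(self, tag):
        if tag in BOLD_TAGS and self._bold_depth > 0:
            self._bold_depth -= 1
        elif tag in ("td", "th"):
            self._cell = None
        elif tag == "tr":
            self._row = None

    def handle_data(self, data):
        # Whitespace between tags carries no information about emphasis.
        if self._cell is None or not data.strip():
            return
        key = "bold_text" if self._bold_depth else "plain_text"
        self._cell[key] = True


def looks_like_header(cell, row_by_class):
    """True for <th>, for any cell in a class-marked row, or for all-bold <td>."""
    if cell["tag"] == "th":
        return True
    all_bold = cell["bold_text"] and not cell["plain_text"]
    return row_by_class or all_bold


def header_rows(html):
    """Return one boolean per <tr>: does every cell in it look like a header?"""
    sniffer = HeaderRowSniffer()
    sniffer.feed(html)
    sniffer.close()
    return [
        bool(row["cells"])
        and all(looks_like_header(c, row["by_class"]) for c in row["cells"])
        for row in sniffer.rows
    ]


# Made-up markup: a class-marked row, an all-bold row, and two data rows.
SAMPLE = """
<table>
<tr class="colhead"><td>Player</td><td>Games</td></tr>
<tr><td><b>Name</b></td><td><strong>Total</strong></td></tr>
<tr><td><b>Bold</b> and plain</td><td>12</td></tr>
<tr><td><a href="#">A. Player</a></td><td>34</td></tr>
</table>
"""

if __name__ == "__main__":
    print(header_rows(SAMPLE))
```

The class check is what catches the first sample row, where the <td><b> test alone would miss it; that is the same gap described for the ESPN fielding table. Requiring every cell in a row to qualify is a deliberately conservative choice, so a data row with one bold name is not promoted.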
Yeah, you could spend all day every day for months investigating the tables on ESPN's site. :-)

Eurosport's website became a part of Yahoo! this year:
<http://eurosport.yahoo.com/>

Tables on Eurosport are a very mixed bag. A sample of them:

1. <http://eurosport.yahoo.com/mo/standing/500/index.html>
Headers use <th> and are in regular positions but summary="" contains garbage.

2. <http://eurosport.yahoo.com/football/fapremiership/calendar/regular/2007_08.html>
<th> for headers in regular positions. There is a <tr> with a single <td> which spans the whole width, splitting the data table into groups, but it only contains &nbsp;. No practical benefit to imply <tbody> for things like that? A hyphen (-) fills some empty data cells near the bottom. <td>&nbsp; and <td>- perhaps mean the same as an empty <td>? At the top, there is a layout table which contains a calendar which is marked up as a data table with day names using <th>. Being able to determine layout versus data table in a nested context would be necessary here.

3. <http://eurosport.yahoo.com/football/fapremiership/standing/full_standing.html>
The table summary="" makes sense but seems more like a <caption>. Regular position for headers marked up as <th>. The immediately preceding <h2> ("Table - Full Standing") could be implied as the <caption> for the table?

4. <http://eurosport.yahoo.com/cr/sc/13550.html>
Pairs of tables presented side by side without layout tables (!). <caption> is supplied for each table and contains sensible text. Column headers use <th scope="col"> and are in regular positions. Row headers use <th scope="row"> and are in regular positions. Some data legitimately uses <td rowspan> across the entire data area. One <ul> is used to supplement each Bowling table even though its data seems columnar. (The first number doesn't make much sense without a "Ball" header.) Lots of unexpanded abbreviations.

5. <http://uk.messages.eurosport.yahoo.com/yahoo/Cricket/Teams/index.html>
Category listing for a message board.
The summary="" merely repeats the preceding <h2>. <th> used for column headers, positioned in regular locations. (Message boards are a rich vein of layout tables and borderline data tables which could have a whole study to themselves.)

One particularly interesting collection is the set of tables on the official site of the Intercontinental Rally Challenge (IRC):
<http://www.ircseries.com/>

1. <http://www.ircseries.com/html/Standings_Results.asp>
Column headers use <td align="center">. Perhaps this could be an alias for <th>? Main column headers use rotated text embedded in images with no alt="" text. Column sub-headers use flag icons with no alt="" text. Note that the main column headers do not span the sub column headers. This is rare, in my experience, but evidently it does exist. If a <th> is immediately preceded by a <th> of the same colspan="" value, they must be added together to support tables like this. Tables have actually been nested inside each other and placed immediately after each other to produce some of the data tables on this page. I have never seen that before now. The data for some tables is available as Excel spreadsheets. Maybe TV Raman could use those in Emacspeak! :-P

2. <http://www.ircseries.com/html/Calendar.asp>
Column headers in regular positions using <td><strong>. Perhaps that should be an alias for <th>? (If you consider <strong> to be an alias for <b> and <td><b> to be an alias for <th>, <td><strong> as an alias for <th> follows. It also stands alone thanks to this use case, imho.)

I am not the first to investigate the accessibility of sports sites. The American Foundation for the Blind (AFB) checked ESPN among others in the 2005/2006 season. They mentioned problems with data tables on the NFL site:
<http://www.afb.org/blog/blog_comments.asp?TopicID=1154>

Sports are very mainstream, with sporting events being some of the biggest in the world. For example, the Olympics.
Or the 10,000+ spectators in many sports stadiums around the world every weekend for American football, baseball, motor racing (especially NASCAR in the USA and Formula One around the world), soccer (especially in Europe), etc.

AFB's website proves vision-impaired people are interested in mainstream things like sports (and why wouldn't they be?). These tables could be more useful to them if they were more accessible.

I wonder how many tables can be made natively accessible? How many will need to be retrofitted by authors with <th> or scope="" or headers="", and how likely is that?

I guess more studying (like Philip and I and others have done), prototyping of implementations (like James Graham might do), testing in screen readers (like Steve Faulkner and others have done) and talking with content authors (like I've done, and several are here themselves in HTMLWG) will help answer those questions over the coming years.

Anyone can do research and testing into this and other things whenever they can find the time and motivation [1]. :-)

[1] <http://lists.w3.org/Archives/Public/public-html/2007Aug/0968.html>

--
Ben 'Cerbera' Millard
Collections of Interesting Data Tables
<http://sitesurgeon.co.uk/tables/readme.html>
Received on Friday, 24 August 2007 18:21:19 UTC