- From: Ian Hickson <ian@hixie.ch>
- Date: Sun, 5 Jul 2009 11:14:21 +0000 (UTC)
- To: public-html@w3.org
(Murray asked me to start a new thread about this today, outlining my
thoughts. Hopefully this will help.)

HTML has a feature that allows multidimensional data to be marked up and
presented in a primarily two-dimensional fashion, namely the <table>
element. The element also has a few features to express more complex
data, such as <th> vs <td>, headers="", scope="",
<thead>/<tbody>/<tfoot>, and colspan=""/rowspan="".

Users of screen readers are able to navigate straightforward
two-dimensional tables reasonably easily; screen readers have developed
a set of navigation features that allows users to quickly skim cells
horizontally and vertically and also enables users to easily determine
their current position. A simple table with a series of data cells with
the top row and left column containing headers can therefore be read
relatively simply by screen-reader users, by skimming the first row to
get an idea of the fields in the data, skimming the first column to get
an idea of the various options that the table covers, and then walking
through to the relevant cells to get whatever information is desired,
potentially walking a series of cells in a row or column to get
information relating to the range of the data.

Users of visual user agents [1] interact with such tables in a
remarkably similar way, first reading the headers in the first row of
the table, then reading the headers of the rows, and then using this
information to pin down the cell or series of cells in which they are
interested. However, it is typically a much more instinctive behaviour
than the more belaboured and interactive experience of a screen-reader
user.

   ([1] For the purpose of this discussion, I shall consider
   screen-reader/browser combinations as being non-visual user agents,
   even though they are, strictly speaking, visual user agents also.)

In addition, screen readers would be most helpful to their users if they
could summarise table structures automatically.
Indeed, many do report basic table information such as the number of
rows and columns; going forward, it seems likely that this can and
should be improved to describe basic table types, so that even simpler
tables or tables that might lack necessary descriptive text can be
explained.

However, things get more difficult with complicated tables such as some
of the ones studied by Ben a few years ago. [2][3]

   [2] http://projectcerbera.com/web/study/2007/tables
   [3] http://projectcerbera.com/web/study/2008/tables

For these, users -- both users of visual user agents and users of screen
readers -- would benefit greatly from some human-written explanatory or
introductory text. Screen reader users are especially in need of such
text, since they cannot see the patterns that visual users might see.

Explanatory text could be put in several places:

 - Before the table in the prose:

     <p>...</p>
     <table>...</table>

 - After the table in the prose:

     <table>...</table>
     <p>...</p>

   In the two cases above, ARIA attributes could be used to more tightly
   couple the two to enable screen readers to provide a link between
   them.

 - As part of a <figure> with the table:

     <figure>
      <p>...</p>
      <table>...</table>
     </figure>

 - As part of a caption:

     <table>
      <caption>
       ...
       <p>...</p>
      </caption>
      ...
     </table>

All of the examples above are about equivalent; different authors might
prefer different options in different cases. (The spec encourages the
fourth, with the caption, because it links the explanatory text to the
table in a clear way for screen readers, has the preferred behaviour in
existing screen readers, and doesn't require the use of a separate
<figure> element, which is not always desirable.)

 - Introducing a new element inside <table>, e.g.:

     <table>
      <summary> ... </summary>
      ...
     </table>

   Unfortunately there are parsing issues with this.

 - Introducing a new element inside <caption>, e.g.:

     <table>
      <caption>
       ...
       <summary>...</summary>
      </caption>
      ...
     </table>

 - Introducing a new element inside <figure>, e.g.:

     <figure>
      <summary>...</summary>
      <table>...</table>
     </figure>

   This would make sense if the summary content was rendered very
   differently than other content in specific media, but in practice in
   ATs the summary content is just read out like caption content, so it
   wouldn't add much here, and in other UAs the author would be able to
   just style it using CSS. (Media queries can also be used to hide
   content specifically from particular media, e.g. having text not
   appear on screen.)

 - Reusing <details>:

     <table>
      <caption>
       ...
       <details>
        <legend> Help... </legend>
        ...
       </details>
      </caption>
      ...
     </table>

   This, while rather complicated, and thus not likely to be widely used
   by authors (especially not used correctly by authors) if we were to
   suggest it as the primary mechanism, is still reasonable, and the
   spec does allow this, so it could be used if desired.

 - Using the summary="" attribute from HTML4:

     <table summary="...">
      ...
     </table>

This last option has a number of drawbacks. It only allows simple,
un-marked-up text; it isn't visible to non-screen-reader users in legacy
user agents; and visual media browsers would not want to show this
content inline in legacy content because it would cause legacy content
to change rendering in a non-backwards-compatible manner. I'm skeptical
that this is an effective way to actually solve the problem.

Naturally, supporting legacy content that already uses the summary=""
attribute should not be prevented; to this end, HTML5 in fact encourages
user agents (such as screen readers) to expose the contents of
summary="" attributes, even though the attribute isn't part of the
language.

US government advice on how to include explanatory text suggests using
the <caption> or putting content adjacent to the table, as in the first
four solutions above:

| [...]
| web developers who are interested in summarizing their tables
| should consider placing their descriptions either adjacent to their
| tables or in the body of the table, using such tags as the CAPTION tag.
 -- http://www.access-board.gov/sec508/guide/1194.22.htm#(g)

Some have argued that the summary="" attribute is a better solution to
the problem described above than the other solutions suggested above.
Here is some empirical data that suggests otherwise.

   http://www.youtube.com/watch?v=xMGBX8gAM6g#t=0m30s

Usability study. A blind user, using JAWS, upon being introduced to a
sample table with the summary="" attribute, says, unprompted: "Now it
gave a little summary information there. And I'm wondering, how
necessary is that. [...] I'm thinking it's too much. [...] I think
you'll find that information yourself anyway by just exploring the
table." He then goes on to say that other people might disagree, but
adds "but for me, they're annoying". He also notes that he believes he
has the feature disabled in his installation, though this contradicts
statements by Steven saying that summaries can't be disabled in JAWS.
[4]

   [4] http://lists.w3.org/Archives/Public/public-html/2009Jun/0282.html

   http://www.paciellogroup.com/blog/misc/summary.html

A manual crawl of government pages with a summary="" attribute. I went
through this in detail in a contemporary e-mail [5], and controversially
concluded that "summary="" hurts users who don't have access to it,
hiding information that they could use, hurts users who DO have access
to it, encouraging people to consider layout tables acceptable; and
hurts the authors writing these tables, wasting their time writing
summaries when their time would be better spent making pages accessible
to _everyone_". Leif questioned some of my comments [6], but I believe
my conclusion stands up to his close scrutiny.
   [5] http://lists.w3.org/Archives/Public/public-html/2009Feb/0601.html
   [6] http://lists.w3.org/Archives/Public/public-html/2009Jun/0285.html

   http://canvex.lazyilluminati.com/misc/summary.html
   http://canvex.lazyilluminati.com/misc/summary-20090226.html
   http://philip.html5.org/data/table-summary-values-dotbot.html

Automated crawls through two different corpuses. These show actual
values of summary="", unfiltered for layout tables. Simon went through
the last (and biggest) list one at a time, and reported finding only one
page (out of 425,000) with a summary="" value that actually fit the
recommended guidelines, and pointed out that for that table, the summary
was in fact redundant and didn't help accessibility. [7] Of the other
values, almost all are outright bogus ("pid991460"), but some have
values that appear to be well-meaning but of questionable practical use,
such as "Calendar".

   [7] http://lists.w3.org/Archives/Public/public-html/2009Jun/0698.html

I've previously gone through this data in more detail, e.g. in:

   http://lists.w3.org/Archives/Public/public-html/2008Dec/0175.html
   http://lists.w3.org/Archives/Public/public-html/2009Feb/0601.html
   http://lists.w3.org/Archives/Public/public-html/2009Feb/0690.html
   http://lists.w3.org/Archives/Public/public-html/2009Feb/0735.html
   http://lists.w3.org/Archives/Public/public-html/2009Jun/0173.html

Overall I think the data pretty clearly speaks to the problems that
summary="" has today. After ten years of evangelisation and education
efforts, authors *who intend to help users with accessibility needs*
still do not use the attribute in a useful manner. That these
well-meaning authors so fundamentally don't understand how to make table
explanations useful is IMHO an indication that we need to change how we
are going about the problem.

This is why I suggest telling them to include explanatory text in an
immediately visible manner.
This would force them to see the text even if they do only the most
primitive of QA (as apparently many do). If the authors see the text,
then they are more likely to make it sensible. This would then help the
users they want to help, and the users for which we want to make the Web
a better place.

I think that if we are to find a new solution (other than those listed
above), or if we are to decide to use summary="" despite the flaws
described above, we need more information. Specifically, to support
summary="" I think the following would be useful:

 * Data showing whether screen reader users actually use summary=""
   attributes in their day-to-day life. Usability studies are the most
   reliable and effective way to find this out. (Note that asking users
   is not a good way to find this kind of information out. Users are
   notoriously incapable of accurately describing their behaviour.)

 * Data showing whether the values that are seen by users are actually
   useful or not on the aggregate (it has been argued that this is
   different than the values that are seen on the Web e.g. as in the
   data cited above, because ATs apparently filter that data). A random
   crawl that applies the same filter as the ATs is probably the method
   that would get us the most data for this, but it may be impractical
   depending on what filter the ATs use. Examining a small set of URLs
   manually with an AT based on a previous crawl to find potential
   candidate pages randomly may be more practical.

 * If the values that appear in the data collected for the previous
   bullet point include some of the more questionable values, rather
   than only unambiguously good values, then an explanation of why such
   values are useful, or even better, data showing that such values are
   indeed useful, e.g. from a usability study looking at such pages
   specifically.
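
The "crawl that applies the same filter as the ATs" idea in the second
bullet above could be prototyped roughly as follows. This is only a
sketch, in Python with the standard html.parser module. The names
SummaryCollector, candidate_summaries and looks_like_data_table are
hypothetical, and the "table contains a <th>" test is a stand-in
assumption: what filter any given AT really applies is exactly the open
question, so the predicate is left pluggable.

```python
from html.parser import HTMLParser


class SummaryCollector(HTMLParser):
    """Record the summary="" value of each <table>, and whether it has a <th>."""

    def __init__(self):
        super().__init__()
        self.tables = []  # closed tables, in the order they were closed
        self._open = []   # stack of tables still open (handles nesting)

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            summary = dict(attrs).get("summary")
            self._open.append({"summary": summary, "has_th": False})
        elif tag == "th" and self._open:
            # Credit the header cell to the innermost open table only.
            self._open[-1]["has_th"] = True

    def handle_endtag(self, tag):
        if tag == "table" and self._open:
            self.tables.append(self._open.pop())


def candidate_summaries(html, looks_like_data_table=lambda t: t["has_th"]):
    """Return non-empty summary values of tables that pass the filter.

    ASSUMPTION: the default filter ("has at least one <th>") is a guess
    made for illustration, not the behaviour of any real screen reader.
    """
    parser = SummaryCollector()
    parser.feed(html)
    parser.close()
    # Include tables that were never closed before the input ended.
    tables = parser.tables + parser._open
    return [
        t["summary"].strip()
        for t in tables
        if t["summary"] and t["summary"].strip() and looks_like_data_table(t)
    ]


# Made-up sample page: a layout table wrapping a data table, plus a
# table whose summary is one of the bogus values seen in the crawls.
sample = """
<table summary="layout"><tr><td>
  <table summary="Sales by region; rows are regions, columns are quarters">
    <tr><th>Region</th><th>Q1</th></tr>
    <tr><td>North</td><td>12</td></tr>
  </table>
</td></tr></table>
<table summary="pid991460"><tr><td>x</td></tr></table>
"""

if __name__ == "__main__":
    print(candidate_summaries(sample))
    print(candidate_summaries(sample, looks_like_data_table=lambda t: True))
```

Passing a different predicate (or one that always returns True, as in
the last line) makes it possible to compare the filtered values against
the unfiltered ones from the same pages, which is the comparison the
bullet asks for.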

To support <summary>, the following would be useful:

 * Data showing that certain tools, user agents, authors, or users treat
   explanatory text about tables in a substantially different way than
   caption text or surrounding prose.

In the absence of this data, I don't think we have enough grounds to
continue supporting summary="" or to introduce a new element.

Clearly, others disagree. I feel I must point out that we have used the
exact same data-driven process for every single feature in HTML5. In
some cases, we don't have much data to go on; in others, we have a lot.
But we have used the same methodology for every feature in the language.
This is no exception.

I would welcome input from the chairs regarding how to resolve this
issue. Personally I don't think this is a difficult issue; it seems that
there is a clearly technically inferior solution being proposed
(summary="") that has been demonstrated to not actually solve the
problem described at the top of this e-mail. So to me, it seems that if
we are basing HTML5's development on purely technical grounds and
arguments, and not listening to the volume of the discourse, the way
forward is clear: we should adopt one or more of the solutions proposed
that do not suffer from the same design problems as the summary=""
attribute.

If the chairs disagree, and believe that this is a non-technical issue,
or believe that technical issues should be resolved by vote, then I
would recommend having something like the following options:

   ( ) I support the design of the HTML4 working group.
       (Including the summary="" attribute on tables.)

   ( ) I support the design currently in Ian's HTML5 proposal.
       (Suggesting that tables should be described in captions.)

   ( ) I support the design currently in Rob's HTML5 proposal.
       (Allowing summary="", but saying it doesn't work.)

   ( ) I have another proposal. Describe it below.

Cheers,
-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Sunday, 5 July 2009 11:14:59 UTC