- From: Ian Hickson <ian@hixie.ch>
- Date: Sun, 5 Jul 2009 11:14:21 +0000 (UTC)
- To: public-html@w3.org
(Murray asked me to start a new thread about this today, outlining my
thoughts. Hopefully this will help.)
HTML has a feature that allows multidimensional data to be marked up and
presented in a primarily two-dimensional fashion, namely the <table>
element. It also has a few features to express more complex
data, such as <th> vs <td>, headers="", scope="", <thead>/<tbody>/<tfoot>,
and colspan=""/rowspan="".
Users of screen readers are able to navigate straightforward
two-dimensional tables reasonably easily; screen readers have developed a
set of navigation features that allows users to quickly skim cells
horizontally and vertically and also enables users to easily determine
their current position. A simple table with a series of data cells with
the top row and left column containing headers can therefore be read
relatively simply by screen-reader users, by skimming the first row to get
an idea of the fields in the data, skimming the first column to get an
idea of the various options that the table covers, and then walking
through to the relevant cells to get whatever information is desired,
potentially walking a series of cells in a row or column to get
information relating to the range of the data.
Users of visual user agents [1] interact with such tables in a remarkably
similar way, first reading the headers in the first row of the table, then
reading the headers of the rows, and then using this information to pin
down the cell or series of cells in which they are interested. However, it
is typically a much more instinctive behaviour than the more belaboured
and interactive experience of a screen-reader user.
([1] For the purpose of this discussion, I shall consider screen-reader/
browser combinations as being non-visual user agents, even though they
are, strictly speaking, visual user agents also.)
In addition, screen readers would be most helpful to their users if they
could programmatically summarise table structures automatically. Indeed,
many do report basic table information such as the number of rows and
columns; going forward, it seems likely that this can and should be
improved to describe basic table types, so that even simpler tables or
tables that might lack necessary descriptive text can be explained.
However, things get more difficult with complicated tables such as some of
the ones studied by Ben a few years ago. [2][3]
[2] http://projectcerbera.com/web/study/2007/tables
[3] http://projectcerbera.com/web/study/2008/tables
For these, users -- both users of visual user agents and users of screen
readers -- would benefit greatly from some human-written explanatory or
introductory text. Screen reader users are especially in need of such
text, since they cannot see the patterns that visual users might see.
Explanatory text could be put in several places:
- Before the table in the prose:
<p>...</p>
<table>...</table>
- After the table in the prose:
<table>...</table>
<p>...</p>
In the two cases above, ARIA attributes could be used to more tightly
couple the two to enable screen readers to provide a link between them.
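A minimal sketch of such coupling, assuming the WAI-ARIA aria-describedby
attribute is the one used (the id value and the table data are invented
for illustration):

```html
<p id="fares-help">Each row is a destination and each column is a ticket
type. Prices are in euros; the cheapest fare in each row is in the
"Advance" column.</p>
<table aria-describedby="fares-help">
  <tr><td></td><th>Advance</th><th>Standard</th><th>Flexible</th></tr>
  <tr><th>Lyon</th><td>25</td><td>40</td><td>60</td></tr>
  <tr><th>Nice</th><td>35</td><td>55</td><td>80</td></tr>
</table>
```

The markup only records the relationship; how it is presented to the
user is up to the screen reader.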
- As part of a <figure> with the table:
<figure>
<p>...</p>
<table>...</table>
</figure>
- As part of a caption:
<table>
<caption>
...
<p>...</p>
</caption>
...
</table>
All of the examples above are about equivalent; different authors might
prefer different options in different cases. (The spec encourages the
fourth, with the caption, because it links the explanatory text to the
table in a clear way for screen readers, has the preferred behaviour in
existing screen-readers, and doesn't require the use of a separate
<figure> element, which is not always desirable.)
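For concreteness, the caption approach might look like this for a small
table (the contents are invented for illustration; only the skeleton
comes from the option above):

```html
<table>
  <caption>
    Library opening hours
    <p>Rows are branches and columns are days. A cell holds the opening
    and closing times, or "closed".</p>
  </caption>
  <tr><td></td><th>Saturday</th><th>Sunday</th></tr>
  <tr><th>Central</th><td>09:00-17:00</td><td>closed</td></tr>
  <tr><th>Harbour</th><td>10:00-14:00</td><td>10:00-14:00</td></tr>
</table>
```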
- Introducing a new element inside <table>, e.g.:
<table>
<summary> ... </summary>
...
</table>
Unfortunately there are parsing issues with this.
- Introducing a new element inside <caption>, e.g.:
<table>
<caption>
...
<summary>...</summary>
</caption>
...
</table>
- Introducing a new element inside <figure>, e.g.:
<figure>
<summary>...</summary>
<table>...</table>
</figure>
This would make sense if the summary content was rendered very differently
than other content in specific media, but in practice in ATs the summary
content is just read out like caption content, so it wouldn't add much
here, and in other UAs the author would be able to just style it using
CSS. (Media queries can also be used to hide content specifically from
particular media, e.g. having text not appear on screen.)
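As a rough sketch of the CSS route, assuming a hypothetical class name
table-help on the explanatory paragraph: the first rule restyles the
text, and the media query hides it in one medium (print is used here
purely as an example of a particular medium).

```html
<style>
  /* table-help is a hypothetical class name, not something from the spec. */
  caption .table-help { font-size: smaller; font-style: italic; }
  /* Hide the explanation in print only; other media still render it. */
  @media print {
    caption .table-help { display: none; }
  }
</style>
<table>
  <caption>
    Exam results by class
    <p class="table-help">Rows are classes and columns are subjects. Each
    cell is the average mark out of 100.</p>
  </caption>
  <tr><td></td><th>Maths</th><th>History</th></tr>
  <tr><th>Class A</th><td>71</td><td>64</td></tr>
  <tr><th>Class B</th><td>68</td><td>73</td></tr>
</table>
```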
- Reusing <details>:
<table>
<caption>
...
<details>
<legend> Help... </legend>
...
</details>
</caption>
...
</table>
This, while rather complicated, and thus not likely to be widely used by
authors (especially not used correctly by authors) if we were to suggest
it as the primary mechanism, is still reasonable, and the spec does allow
this, so it could be used if desired.
- Using the summary="" attribute from HTML4:
<table summary="...">
...
</table>
This last option has a number of drawbacks. It only allows simple,
un-marked-up text; it isn't visible to non-screen-reader users in legacy
user agents; and visual media browsers would not want to show this content
inline in legacy content because it would cause legacy content to change
rendering in a non-backwards-compatible manner. I'm skeptical that this
is an effective way to actually solve the problem.
Naturally, supporting legacy content that already uses the summary=""
attribute should not be prevented; to this end, HTML5 in fact encourages
user agents (such as screen readers) to expose the contents of summary=""
attributes, even though the attribute isn't part of the language.
US government advice on how to include explanatory text suggests using the
<caption> or putting content adjacent to the table, as in the first four
solutions above:
| [...] web developers who are interested in summarizing their tables
| should consider placing their descriptions either adjacent to their
| tables or in the body of the table, using such tags as the CAPTION tag.
-- http://www.access-board.gov/sec508/guide/1194.22.htm#(g)
Some have argued that the summary="" attribute is a better solution to the
problem described above than the other solutions suggested above.
Here is some empirical data that suggests otherwise.
http://www.youtube.com/watch?v=xMGBX8gAM6g#t=0m30s
Usability study. A blind user, using JAWS, upon being introduced to
a sample table with the summary="" attribute, says, unprompted:
"Now it gave a little summary information there. And I'm wondering,
how necessary is that. [...] I'm thinking it's too much. [...] I
think you'll find that information yourself anyway by just
exploring the table." He then goes on to say that other people
might disagree, but adds "but for me, they're annoying". He also
notes that he believes he has the feature disabled in his
installation, though this contradicts statements by Steven saying
that summaries aren't disablable in JAWS. [4]
[4] http://lists.w3.org/Archives/Public/public-html/2009Jun/0282.html
http://www.paciellogroup.com/blog/misc/summary.html
A manual crawl of government pages with a summary="". I went
through this in detail in a contemporary e-mail [5], and
controversially concluded that "summary="" hurts users who don't
have access to it, hiding information that they could use, hurts
users who DO have access to it, encouraging people to consider
layout tables acceptable; and hurts the authors writing these
tables, wasting their time writing summaries when their time would
be better spent making pages accessible to _everyone_". Leif
questioned some of my comments [6], but I believe my conclusion
stands up to his close scrutiny.
[5] http://lists.w3.org/Archives/Public/public-html/2009Feb/0601.html
[6] http://lists.w3.org/Archives/Public/public-html/2009Jun/0285.html
http://canvex.lazyilluminati.com/misc/summary.html
http://canvex.lazyilluminati.com/misc/summary-20090226.html
http://philip.html5.org/data/table-summary-values-dotbot.html
Automated crawls through two different corpuses. These show actual
values of summary="", unfiltered for layout tables. Simon went
through the last (and biggest) list one at a time, and reported
finding only one page (out of 425,000) with a summary="" value that
actually fit the recommended guidelines, and pointed out that for
that table, the summary was in fact redundant and didn't help
accessibility. [7]
Of the other values, almost all are outright bogus ("pid991460"),
but some have values that appear to be well-meaning but of
questionable practical use, such as "Calendar".
[7] http://lists.w3.org/Archives/Public/public-html/2009Jun/0698.html
I've previously gone through this data in more detail, e.g. in:
http://lists.w3.org/Archives/Public/public-html/2008Dec/0175.html
http://lists.w3.org/Archives/Public/public-html/2009Feb/0601.html
http://lists.w3.org/Archives/Public/public-html/2009Feb/0690.html
http://lists.w3.org/Archives/Public/public-html/2009Feb/0735.html
http://lists.w3.org/Archives/Public/public-html/2009Jun/0173.html
Overall I think the data pretty clearly speaks to the problems that
summary="" has today. After ten years of evangelisation and education
efforts, authors *who intend to help users with accessibility needs* still
do not use the attribute in a useful manner. That these well-meaning
authors so fundamentally don't understand how to make table explanations
useful IMHO is an indication that we need to change how we are going about
the problem. This is why I suggest telling them to include explanatory
text in an immediately visible manner. This would force them to see the
text even if they do only the most primitive of QA (as apparently many
do). If the authors see the text, then they are more likely to make it
sensible. This would then help the users they want to help, and the users
for which we want to make the Web a better place.
I think that if we are to find a new solution (other than those listed
above), or if we are to decide to use summary="" despite the flaws
described above, we need more information.
Specifically, to support summary="" I think the following would be useful:
* Data showing whether screen reader users actually use summary=""
attributes in their day-to-day life. Usability studies are the most
reliable and effective way to find this out. (Note that asking users is
not a good way to find this kind of information out. Users are
notoriously incapable of accurately describing their behaviour.)
* Data showing whether the values that are seen by users are actually
useful or not on the aggregate (it has been argued that this is
different than the values that are seen on the Web e.g. as in the
data cited above, because ATs apparently filter that data). A
random crawl that applies the same filter as the ATs is probably the
method that would get us the most data for this, but it may be
impractical depending on what filter the ATs use. Examining a small set
of URLs manually with an AT based on a previous crawl to find potential
candidate pages randomly may be more practical.
* If the values that appear in the data collected for the previous bullet
point include some of the more questionable values, rather than only
unambiguously good values, then an explanation of why such values are
useful, or even better, data showing that such values are indeed
useful, e.g. from a usability study looking at such pages specifically.
To support <summary>, the following would be useful:
* Data showing that certain tools, user agents, authors, or users treat
explanatory text about tables in a substantially different way than
caption text or surrounding prose.
In the absence of this data, I don't think we have enough grounds to
continue supporting summary="" or to introduce a new element. Clearly,
others disagree.
I feel I must point out that we have used the exact same data-driven
process for every single feature in HTML5. In some cases, we don't have
much data to go on; in others, we have a lot. But we have used the same
methodology for every feature in the language. This is no exception.
I would welcome input from the chairs regarding how to resolve this issue.
Personally I don't think this is a difficult issue; it seems that there is
a clearly technically inferior solution being proposed (summary="") that
has been demonstrated to not actually solve the problem described at the
top of this e-mail. So to me, it seems that if we are basing HTML5's
development on purely technical grounds and arguments, and not listening
to the volume of the discourse, the way forward is clear; we should
adopt one or more of the solutions proposed that do not suffer from the
same design problems as the summary="" attribute.
If the chairs disagree, and believe that this is a non-technical issue, or
believe that technical issues should be resolved by vote, then I would
recommend having something like the following options:
( ) I support the design of the HTML4 working group.
(Including the summary="" attribute on tables.)
( ) I support the design currently in Ian's HTML5 proposal.
(Suggesting that tables should be described in captions.)
( ) I support the design currently in Rob's HTML5 proposal.
(Allowing summary="", but saying it doesn't work.)
( ) I have another proposal. Describe it below.
Cheers,
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Received on Sunday, 5 July 2009 11:14:59 UTC