Change proposal for ISSUE-32

ISSUE-32
========

SUMMARY
Drop the summary="" attribute.

RATIONALE

HTML has a feature that allows multidimensional data to be marked up
and presented in a primarily two-dimensional fashion, namely the
<table> element. This element also has a few features to express more
complex data, such as <th> vs <td>, headers="", scope="",
<thead>/<tbody>/<tfoot>, and colspan=""/rowspan="".

Users of screen readers are able to navigate straightforward
two-dimensional tables reasonably easily; screen readers have
developed a set of navigation features that allows users to quickly
skim cells horizontally and vertically and also enables users to
easily determine their current position. A simple table with a series
of data cells with the top row and left column containing headers can
therefore be read relatively simply by screen-reader users, by
skimming the first row to get an idea of the fields in the data,
skimming the first column to get an idea of the various options that
the table covers, and then walking through to the relevant cells to
get whatever information is desired, potentially walking a series of
cells in a row or column to get information relating to the range of
the data.

Users of visual user agents [1] interact with such tables in a
remarkably similar way, first reading the headers in the first row of
the table, then reading the headers of the rows, and then using this
information to pin down the cell or series of cells in which they are
interested. However, it is typically a much more instinctive behaviour
than the more laborious and interactive experience of a screen-reader
user.

([1] For the purpose of this discussion, I shall consider
screen-reader/browser combinations as being non-visual user agents,
even though they are, strictly speaking, visual user agents as
well.)

In addition, screen readers would be most helpful to their users if
they could summarise table structures automatically. Indeed, many
already report basic table information such as the number of rows and
columns; going forward, it seems likely that this can and should be
improved to describe basic table types, so that even simpler tables,
or tables that lack the necessary descriptive text, can be explained.

However, things get more difficult with complicated tables such as
some of the ones studied by Ben a few years ago:

   http://projectcerbera.com/web/study/2007/tables
   http://projectcerbera.com/web/study/2008/tables

For these, users -- both users of visual user agents and users of
screen readers -- would benefit greatly from some human-written
explanatory or introductory text. Screen reader users are especially
in need of such text, since they cannot see the patterns that visual
users might see.

Explanatory text could be put in several places:

 - Before the table in the prose:

     <p>...</p>
     <table>...</table>

 - After the table in the prose:

     <table>...</table>
     <p>...</p>

In the two cases above, ARIA attributes could be used to more tightly
couple the two to enable screen readers to provide a link between
them.
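
For example, a minimal sketch of such coupling using the
aria-describedby="" attribute (the id value "rates-help" and the table
content are made up for illustration):

     <p id="rates-help">Each row is a region and each column is a
     year. Values are percentages.</p>
     <table aria-describedby="rates-help">
      <tr> <th> Region <th> 2008 <th> 2009
      <tr> <th> North <td> 4.1 <td> 4.6
     </table>

How a given screen reader announces the association is up to that
product; the markup only exposes the relationship.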

 - As part of a <figure> with the table:

     <figure>
      <p>...</p>
      <table>...</table>
     </figure>

 - As part of a caption:

     <table>
      <caption>
       ...
       <p>...</p>
      </caption>
      ...
     </table>

All of the examples above are about equivalent; different authors
might prefer different options in different cases. (The spec
encourages the fourth, with the caption, because it links the
explanatory text to the table in a clear way for screen readers, has
the preferred behaviour in existing screen readers, and doesn't
require the use of a separate <figure> element, which is not always
desirable.)
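
As a concrete illustration of the caption approach, here is a minimal
sketch; the table and its wording are invented for this example and
are not taken from the spec or from any of the pages discussed here:

     <table>
      <caption>
       Ticket prices by age group
       <p>Age groups are listed down the first column, with weekday
       and weekend prices in the two columns to the right. Prices
       are in dollars.</p>
      </caption>
      <tr> <th> Age group <th> Weekdays <th> Weekends
      <tr> <th> Under 12 <td> 5 <td> 7
      <tr> <th> 12 and over <td> 9 <td> 12
     </table>

The explanation is visible to every user by default, so the author
sees it whenever the page is checked.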

US government advice on how to include explanatory text suggests using
the <caption> or putting content adjacent to the table, as in the
first four solutions above:

| [...] web developers who are interested in summarizing their tables 
| should consider placing their descriptions either adjacent to their 
| tables or in the body of the table, using such tags as the CAPTION tag.
 -- http://www.access-board.gov/sec508/guide/1194.22.htm#(g)

We are therefore in good company in recommending these techniques.

 - Introducing a new element inside <table>, e.g.:

     <table>
      <summary> ... </summary>
      ...
     </table>

Unfortunately there are parsing issues with this.

 - Introducing a new element inside <caption>, e.g.:

     <table>
      <caption>
       ...
       <summary>...</summary>
      </caption>
      ...
     </table>

 - Introducing a new element inside <figure>, e.g.:

     <figure>
      <summary>...</summary>
      <table>...</table>
     </figure>

A new element would make sense if the summary content were rendered
very differently from other content in specific media, but in practice
ATs just read the summary content out like caption content, so it
wouldn't add much there, and in other UAs the author would be able to
just style it using CSS. (Media queries can also be used to hide
content specifically from particular media, e.g. having text not
appear on screen.)

 - Reusing <details>:

     <table>
      <caption>
       ...
       <details>
        <legend> Help... </legend>
        ...
       </details>
      </caption>
      ...
     </table>

This is again reasonable, and the spec does allow this, so it could be
used if desired.

 - Using the summary="" attribute from HTML4:

     <table summary="...">
      ...
     </table>

This last option, which is currently allowed (though discouraged) in
the HTML5 spec, has a number of drawbacks. It only allows simple,
un-marked-up text; it isn't visible to non-screen-reader users in
legacy user agents; and visual media browsers would not want to show
this content inline in legacy content because it would cause legacy
content to change rendering in a non-backwards-compatible manner.

However, some have argued that the summary="" attribute is a better
solution to the problem described above than the alternatives listed
here.

Here is some empirical data that suggests otherwise.

   http://www.paciellogroup.com/blog/misc/summary.html

   A manual crawl of government pages with summary="" attributes. Let us
   examine these tables. First, it's worth noting that this data
   represents the best of the best that we might expect from Web
   authors -- the authors of these tables were legally bound to make
   them as accessible as possible, so it's unlikely that we will see
   any better results anywhere.

   http://www.fbi.gov/ucr/cius2007/data/table_01.html

      The summary="" value would be useful to all users, yet users
      that don't see the attribute's value have no way to find out the
      information.

   http://www.cdc.gov/asthma/nhis/04/table1-1.htm

      The summary="" value duplicates information in the page header
      and the page title, and could have been put in the caption
      without harming visual users. In addition, the summary="" once
      again has information that is not available anywhere else,
      leaving non-AT users out in the cold.

   http://www.eia.doe.gov/cneaf/electricity/epm/table1_13_a.html

      This entire table is non-conforming (it's a layout table), so it
      doesn't matter if we allow summary="" for it or not. The table
      shouldn't exist. No summary required. (This summary="" is
      inserted by script, no less.)

   http://www.vrg.org/journal/vj2006issue2/vj2006issue2mealplans.htm

      There are a number of tables here, and none of them have useful
      summary="" attributes. A number of them duplicate existing
      captions, leading to a suboptimal experience for users of AT
      products. Several of them have one-word summaries that are more
      vague than the first cell of the table, and therefore do nothing
      to help the user. The remainder are layout tables that would be
      better done using <dl>, and the summary="" attributes would be
      better as captions if they are needed at all.

   http://www.irs.gov/formspubs/article/0,,id=164272,00.html

      These summary="" attributes once again give information that I
      would find useful as a non-AT user. They also duplicate some of
      the information in the headers before the table. They would be
      better as captions.

   http://www.hrsa.gov/vaccinecompensation/table.htm

      The summary says something that is not about the table and that
      isn't provided anywhere else, and it also repeats information in
      the caption. (The summary is "National Childhood Vaccine Injury
      Act Vaccine Injury Table", whereas the page is titled "National
      Vaccine Injury Compensation Program" and the table is captioned
      "Vaccine Injury Table".) I can't tell if this is because the
      summary is out of date, or because the page header is out of
      date, but in either case, _someone_ would be better off if the
      summary hadn't been there, and the AT user would not have been
      worse off.

   http://www.cslib.org/finespenalt.htm

      The summary is a word-for-word repeat of the first row's table
      header cells, which doesn't seem helpful since the user would
      get the same information either way.

   http://aspe.hhs.gov/poverty/faq.shtml

      Both tables are non-conforming. The AT user would be better off
      with neither table nor summary.

   http://www.ssa.gov/OACT/STATS/table4c6.html

      The table with the summary="" is non-conforming; the AT user
      would again be better off without either it or its summary.

   http://www.nhlbi.nih.gov/guidelines/obesity/bmi_tbl.htm

      The summary is a useless repetition of the header before it, and
      it is the caption that includes information on how to use the
      table!

   The conclusion I draw from this data is that summary="" hurts users
   who don't have access to it, hiding information that they could
   use; hurts users who DO have access to it, encouraging people to
   consider layout tables acceptable; and hurts the authors writing
   these tables, wasting their time writing summaries when their time
   would be better spent making pages accessible to _everyone_.

   It's worth noting again that this data is representative of the
   very best that we can expect from Web authors.


   http://www.youtube.com/watch?v=xMGBX8gAM6g#t=0m30s

   Usability study. A blind user, using JAWS, upon being introduced to
   a sample table with the summary="" attribute, says, unprompted:
   "Now it gave a little summary information there. And I'm wondering,
   how necessary is that. [...] I'm thinking it's too much. [...] I
   think you'll find that information yourself anyway by just
   exploring the table." He then goes on to say that other people
   might disagree, but adds "but for me, they're annoying". He also
   notes that he believes he has the feature disabled in his
   installation, though this contradicts statements by Steven saying
   that summaries can't be disabled in JAWS:

      http://lists.w3.org/Archives/Public/public-html/2009Jun/0282.html


   http://canvex.lazyilluminati.com/misc/summary.html
   http://canvex.lazyilluminati.com/misc/summary-20090226.html
   http://philip.html5.org/data/table-summary-values-dotbot.html

   Automated crawls through two different corpuses. These show actual
   values of summary="", unfiltered for layout tables. Simon went
   through the last (and biggest) list one at a time, and reported
   finding only one page (out of 425,000) with a summary="" value that
   actually fit the recommended guidelines, and pointed out that for
   that table, the summary was in fact redundant and didn't help
   accessibility:

      http://lists.w3.org/Archives/Public/public-html/2009Jun/0698.html

   Of the other values, almost all are outright bogus ("pid991460"),
   but some have values that appear to be well-meaning but of
   questionable practical use, such as "Calendar". In fact, when I
   myself went and looked at this set of pages in more detail, I
   concluded that even summary="" values that look like they have a
   chance at being useful aren't actually useful:

      http://lists.w3.org/Archives/Public/public-html-a11y/2009Dec/0104.html


Overall I think the data pretty clearly speaks to the problems that
summary="" has today. After ten years of evangelisation and education
efforts, authors *who intend to help users with accessibility needs*
still do not use the attribute in a useful manner. That these
well-meaning authors so fundamentally don't understand how to make
table explanations useful IMHO is an indication that we need to change
how we are going about the problem. This suggests that it is better to
recommend that authors include explanatory text in an immediately
visible manner. This would force them to see the text even if they do
only the most primitive of QA (as apparently many do). If the authors
see the text, then they are more likely to make it sensible. This
would then help the users they want to help, and the users for whom we
want to make the Web a better place.

Furthermore, the summary="" attribute is intended only for non-sighted
users, which runs contrary to our design principles of universal
access. Hiding information from sighted users even when the
information would be useful to them is not good design. Consider the
exact opposite case: <canvas> and <img> are intended only for sighted
users. Does that mean that it's ok for the content in those elements
to be hidden from non-sighted users? No! We have to convey the
information from those elements to _all_ users, hence <canvas>
fallback and alt="".

Naturally, supporting legacy content that already uses the summary=""
attribute should not be prevented; to this end, HTML5 in fact
encourages user agents (such as screen readers) to expose the contents
of summary="" attributes, even though the attribute isn't part of the
language. That should not be changed.


In conclusion, there's been no evidence presented that there are
authors who:

 * have tables complicated enough that non-visual users need a
   description, and

 * are able to write a description, and

 * are not willing to expose this description to all users, and

 * are not willing to use CSS techniques or <details> to hide the
   information from the default visual presentation, and

 * will remember to update the attribute when the table changes.

There is, however, ample evidence that authors who are convinced (by
advocacy) that they fall into the above situation in fact fail to fall
into it, and end up creating harmful content (see above for
examples). There is also ample evidence that having the attribute
present encourages authors to include descriptions when they are not
necessary, wasting their time and the time of their AT-using readers.

Therefore, having the attribute causes more harm than not having it,
and we should remove it.


DETAILS
Make the summary="" attribute entirely obsolete.

IMPACT

POSITIVE EFFECTS
* Improved overall accessibility of the Web.
* Reduced waste of authoring time.

NEGATIVE EFFECTS
None.

CONFORMANCE CLASS CHANGES
Authors: The attribute is made entirely obsolete.

RISKS
There may be cases where there is some tabular data of a nature
complicated enough to warrant a table explanation for screen reader
users (and presumably for other users also), but for which the author
will refuse to include visible explanatory text yet is amenable to
including hidden explanatory text. It seems highly unlikely that such
a case exists, and indeed no examples of such a case have ever been
put forward, but if such a case exists then there could be an argument
for leaving the summary="" attribute.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Thursday, 15 July 2010 18:00:47 UTC