Tables accessibility thoughts from Harvey Bingham on 1998-05-13 (w3c-wai-au@w3.org from April to June 1998)

From: Harvey Bingham <hbingham@ACM.org>
Date: Wed, 13 May 1998 14:48:11 -0400
To: w3c-wai-au@w3.org
Message-Id: <3.0.3.32.19980513144811.00701e9c@pop.tiac.net>
Tables provide a two-dimensional space for organizing information.
Data content cells generally have header cells to provide the semantics
for the data cells.

HTML4.0 and CSS2 have addressed means to provide and exploit that 
organization.

For accessibility to tabular data where the tabular grid is missing
to the user, that user should be able to:

1. Recognize the table as such, and its use (layout or data)
2. Learn its structure, through metainformation or analysis
3. Navigate within the table with selective amount of generated coordinate 
   identification
4. Request full cell identification by applicable means, including
   parentage identification, current captions, applicable header 
   information (possibly shortened), and IDREFs to IDs of pertinent headers.

Tabular data may match the static paper publishing model, or may 
be the generated result of a query against a collection of data.

A "static table" is one that has "all information included" so the user
can select only that part of interest. A static table may be much 
larger than necessary. Its visual organization is self-describing.
Metainformation was not important, though it can be exploited if 
included.

A "generated table" is one that delivers current content. It may be in
response to a query. The table description and content may be reduced 
to contain only that which is pertinent to that query. It needn't have 
the same dimensionality as a static table that might contain it.

A style description for a static table may differ from that for a
generated table. A static stylesheet may need means to be specialized for
generated tables. 

I am currently checking the coverage of these issues by CSS2 and HTML4.0
Neither are complete.

---------------------------------------

Date: Sat, 03 May 1997 10:29:10 -0400
From: Harvey Bingham <hbingham@ACM.org>
Subject: Accessibility of Tables

I am collecting ideas for making tables more nearly accessible. 
This is an early task of the Web Accessibility Initiative (WAI) 
of the World Wide Web Consortium. I thank you for sharing your 
suggestions, experiences, and frustrations. The time is right 
for us to make positive contributions.

I have worked to specify SGML tables since 1989, for various SGML
activities,including the CALS table model, and more recently as 
chair or co-chair of the SGML Open technical tables subcommittee. 
We made no accommodation to accessibility in those efforts. I 
hope we can rectify that omission.

The International Committee for Accessible Document Design (ICADD)
approach for use with SGML-tagged tables is to allow  each table 
axis to be named, and each cell would contain the names of the set 
of axes that apply to it. This works well for simple tables with 
single initial row heads and initial column stubs. It can get messy if
spanning heads or stubs apply. By default the cell content of 
the initial row and  column could provide the axis information.  

The HTML table model -- is a  list of rows (TR). It distinguishes 
table cell kinds in a row as either head (TH) or  data (TD). For 
simplicity TH and TD may be arbitrarily intermixed in any row or 
column of any row. A clean table has TD for column heads in the 
initial row and  row stubs in the initial column. All rows have 
every cell filled.

ICADD support -- HTML 2.0 supported it. HTML 3.2 abandoned it,
as the principal browser vendors did not support it. The WAI 
will identify means to improve accessibility to web documents 
for future versions of HTML.

Table cell content -- may be simple, fitting into a single "line".
It may wrap onto several lines. A user agent should be able to 
detect the wrapping and deliver the entire cell content.

Table cell content -- may be complex, with substructure. Desirable 
for a user agent is to treat a cell as a "mini-document" so its 
structure would be handled as if it were a document to itself.

Axis/Axes identification -- one possible way to identify the axes 
appropriate to any cell would be to have two distinct axis lists 
to represent the two coordinates of a cell. At any time the 
axis lists could be used individually as a selector of row or 
column, or in combination to select a cell. Sequential reading 
of cells along a row would not necessarily access the axis lists, though the
reverse linking from cell to its axes should be possible. 

Spanning TH cells -- Table complexity results from several rows with 
some TH cells that horizontally span several columns. Presumably 
the next row would contain more TH that add additional subsidiary 
axis information. Analogously, several columns of initial TH cells 
some with vertical spanning to apply that axis information to the 
subsidiary row stubs.

Spanning TD cells -- normally have content in only the initial cell 
of the spanned rectangular region of cells. What axes information applies
from non-initial cell spaces? If some TH has the same
span, and there are distinct subsidiary TH, omit axis info from
the latter? If no TH has the same span, make composite of all
axis info for any spanned? Not clear at all.

TH anywhere -- the HTML table model permits TH cells anywhere
in a row, not just at the beginning (or end). There is no
obvious way to identify which axis or axes that cell might 
override for further use in subsequent cells of that row 
and/or column.

Table linearization -- for serial delivery most assume that the 
table is formed as an ordered list of rows, with each having 
cells in columns. With that ordering, it is awkward to read 
skipping down a column.

Non-linear positioning -- a user agent might take advantage of 
searching and non-linear positioning. A TD found this way
would like the option to "show the axes". The alternative 
"hide the axes information" unless requested is also useful.

Table as list -- One example from Recording for the Blind and 
Dyslexic is their text version of  "Getting Results with Microsoft 
Office for Windows 95". It only had to deal with simple 2 or at 
most 3-column tables, with an initial row of column headers TH, 
and no row stub contributing to subsequent cells of the row. 
The linear text of the translation included for each cell of 
each row the TH information from the initial row as prefix to 
the cell TD content. In that redundant form, much more than half 
the tables content is cell identification. Cell content that 
wrapped would just appear in linear order, so there was no need 
to recognize that condition. There were no more complex structures used in
cells.

Sequential presentation -- the cost of any non-linear positioning 
is high for an audio tape. Thus the redundant axis identifications 
might be desirable. In that case, there is some advantage to being 
able to enable or disable the axes information presented under 
user control. There are no marks in the text string to facilitate 
this recognition and skipping.

Coordinate enumeration along axis -- An effective technique 
borrowed from spreadsheets is to use ordered coordinate values 
to identify cells. The initial coordinate values of an axis are 
useful to retrieve the implicit axis information from those column 
heads or row stubs.

Partial table extraction -- with adequate table-reading tools, sub-tables
could be extracted, suppressing uninteresting data:

1)  compressed to show only selected columns (not necessarily adjacent)
2)  conditional selection  based on cell values
3)  transposed ordering among axes (possibly more than two)
4)  projections onto 2 dimensions from higher-dimensional data
      (possibly by selecting N-2 among N axes as fixed)

Table cell content -- may have substructure. The user agent should 
handle the substructure similarly to how that same structure 
would be handled if it weren't in a cell.

Cell boundaries --  may be deliberately visible or invisible to 
the sighted. When invisible, their presence is presumably evident 
by the regularity of the display grid for the cells. Alternative 
means to bound the cells are required for different representations. 

"Layout tables" -- The tables discussed above are "data tables", 
where axes provide identification for the cell content. HTML 
allows a cell to contain another table. HTML tables are currently 
being used as a "layout" hack in place of other styling tools.
The axes of these layout tables is "ancestry" that may or may 
not need to contribute to the global axes information for the 
cell in which there is a "data" table. The issue is similar if 
frames are used for different tiled regions: should the frame 
identification be part of the ancestry. 

Multi-axis complications -- Not all cells need have the same 
number of axes. For example, some TH columns may have spanning 
TH, others need not. Cell axes needn't describe Euclidean 
coordinate space, even though all cells have the same number 
of axes.

Limit to complexity -- at what point does table complexity in 
web documents move into a different form for their expression, 
such as spreadsheet or data base?

Regards/Harvey Bingham
mailto:hbingham@acm.org

for the Yuri Rubinsky Insight Foundation
Received on Wednesday, 13 May 1998 14:50:15 UTC