Re: Scientific Studies of the Summary Attribute (was: Why I don't attend the weekly teleconference) from Lachlan Hunt on 2009-06-30 (public-html@w3.org from June 2009)

From: Lachlan Hunt <lachlan.hunt@lachy.id.au>
Date: Tue, 30 Jun 2009 20:16:55 +0200
To: Shelley Powers <shelley.just@gmail.com>
Cc: Ian Hickson <ian@hixie.ch>, public-html@w3.org
Message-ID: <4A4A5697.4060606@lachy.id.au>
Shelley Powers wrote:
>> There have been a number of public studies also. I believe in fact that in
>> the case of summary="" the only numbers that we have published were based
>> on publicly verifiable studies...
>
> Actually, those are not "studies", Ian. You and Philip accessed some
> publicly accessing information found online, and ran some queries and
> look at the data, and then formed your conclusions.

In the interest of moving this discussion forward, rather than 
continuing to endlessly debate the usefulness of the observational data 
that has so far been gathered and presented, it would be useful to 
instead focus on how we can perform proper scientific studies of data 
that we can all agree on.

The problem so far seems to be that different groups of people look at 
the current data we have and interpret it in vastly different ways.  It 
is my hope that we can discuss and come up with an agreed-upon 
methodology for collecting, studying and analysing the data and 
ultimately agreeing upon how to interpret the results.

To get started, let's begin with the following hypothesis, that we can 
then break down into testable assertions, collect and analyse data, and 
then evaluate the result:

   Table summaries provided for non-layout tables using the summary
   attribute are useful in practice for users of assistive technology
   on a significant proportion of web sites, and such summaries would
   not significantly benefit users without assistive technology.

We could just as well start with the hypothesis that it's not useful, 
but I went with the affirmative viewpoint so as to not appear as though 
I'm trying to sway the outcome with my own personal bias.  Either way, 
the hypothesis will be either confirmed or rejected based on the study. 
  If necessary, we can of course refine the hypothesis.

 From that, there are a number of individual testable assertions that 
can be derived and investigated:

1. It is possible to algorithmically detect layout tables as a means of
    filtering out useless summary attributes that can be automatically
    ignored by assistive technology.

2. In practice, a significant proportion of pages that provide
    summaries on non-layout tables do so with values that are generally
    useful to users of assistive technology that exposes the values.

3. Such summaries are generally not useful and do not contain essential
    information for users without assistive technology, and thus little
    would be gained by moving the summary from the attribute into the
    surrounding prose where it's available to everyone.

(There may be other testable assertions that I missed.)
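
As a rough illustration of what testing assertion 1 might involve, here
is a minimal sketch of a layout-table heuristic. The rules it uses (a
table counts as layout if it contains a nested table, has no th or
caption, or has at most one cell) are assumptions made up for the sake
of discussion, not an agreed or validated method, and the names
TableScanner, looks_like_layout and scan are hypothetical.

```python
from html.parser import HTMLParser


class TableScanner(HTMLParser):
    """Collects simple structural facts about each <table> in a document."""

    def __init__(self):
        super().__init__()
        self.tables = []   # finished tables, in the order they were closed
        self._stack = []   # currently open tables, to cope with nesting

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self._stack:
                # The enclosing table contains another table.
                self._stack[-1]["nested"] = True
            self._stack.append({
                "summary": dict(attrs).get("summary"),
                "th": 0, "td": 0, "caption": False, "nested": False,
            })
        elif self._stack:
            top = self._stack[-1]
            if tag == "th":
                top["th"] += 1
            elif tag == "td":
                top["td"] += 1
            elif tag == "caption":
                top["caption"] = True

    def handle_endtag(self, tag):
        if tag == "table" and self._stack:
            self.tables.append(self._stack.pop())


def looks_like_layout(table):
    """Assumed heuristic only: guess whether a table is used for layout."""
    has_data_markup = table["th"] > 0 or table["caption"]
    cell_count = table["th"] + table["td"]
    return table["nested"] or not has_data_markup or cell_count <= 1


def scan(html):
    """Return (summary value, layout guess) for every table in the markup."""
    scanner = TableScanner()
    scanner.feed(html)
    scanner.close()
    return [(t["summary"], looks_like_layout(t)) for t in scanner.tables]
```

Running scan() over a sample and comparing its guesses against manual
classification would give an error rate for the heuristic, which is the
number assertion 1 actually depends on.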

To determine which of those assertions are true and which are false, we 
need to gather data and analyse it.  But first, we need to agree on how 
to collect a representative sample of tables to study, and there are 
sure to be disagreements about which pages the sample should include.

For example, should we cover a broad spectrum of sites randomly selected 
from any and all publicly accessible web sites; or should we restrict 
the study to specific genres of sites, such as educational websites, 
government websites, environmental reporting sites (e.g. meteorological, 
geological, etc.), economic sites, or others?

Each of the genres I listed is likely to publish various forms of 
data tables, and I suspect it would be beneficial to look at those in 
isolation from other sites that are less likely to include data tables, 
like general news and blogs, social networking sites, etc.  The question 
is which groups of sites we should study.
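
Whichever genres we settle on, drawing the sample per genre could look
something like the sketch below. It assumes we already have a list of
candidate URLs for each genre (how to compile those lists is a separate
question), and the function name, the per-genre quota and the seed are
all hypothetical. Fixing the seed is only there so that anyone can
reproduce exactly the same sample.

```python
import random


def stratified_sample(urls_by_genre, per_genre, seed=2009):
    """Draw up to `per_genre` URLs from each genre, reproducibly."""
    rng = random.Random(seed)
    sample = {}
    for genre in sorted(urls_by_genre):
        # De-duplicate and sort so the result does not depend on input order.
        urls = sorted(set(urls_by_genre[genre]))
        k = min(per_genre, len(urls))
        sample[genre] = rng.sample(urls, k)
    return sample
```

Sampling each genre separately keeps the genres that publish many data
tables from being swamped by the much larger number of pages that
contain none.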

Once we have come to an agreement on that, and collected the data, we 
will then need to sort out how exactly to analyse the raw data.  This 
could involve, for instance, manually going through each of the pages 
and rating the usefulness of their summaries for various user groups, 
perhaps using some kind of Likert scale.  Or it could involve performing 
usability studies with groups of users with and without assistive 
technology.
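
If we went the manual rating route, summarising the ratings per user
group could be as simple as the sketch below. It assumes a 1 to 5 scale
and treats a score of 4 or more as "useful"; both of those, along with
the function name and the group labels, are placeholders rather than
anything agreed. The median is used because Likert ratings are ordinal.

```python
from statistics import median


def summarise_ratings(ratings, useful_threshold=4):
    """Summarise (user_group, score) pairs, with score on a 1-5 scale."""
    by_group = {}
    for group, score in ratings:
        if not 1 <= score <= 5:
            raise ValueError(f"score out of range: {score}")
        by_group.setdefault(group, []).append(score)
    return {
        group: {
            "n": len(scores),
            "median": median(scores),
            # Share of ratings at or above the assumed "useful" cut-off.
            "share_useful": sum(s >= useful_threshold for s in scores)
                            / len(scores),
        }
        for group, scores in by_group.items()
    }
```

Comparing the two groups' summaries side by side would speak to
assertions 2 and 3 respectively, though what counts as a "significant
proportion" would still need to be agreed before looking at the results.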

But rather than me trying to sort out all the details myself, I'd like 
to see if others are interested in this and encourage others to 
contribute constructively.

-- 
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Received on Tuesday, 30 June 2009 18:17:40 UTC