Re: How to make complex data tables more accessible to screen-reader users from Shelley Powers on 2009-07-06 (public-html@w3.org from July 2009)

From: Shelley Powers <shelley.just@gmail.com>
Date: Mon, 6 Jul 2009 14:15:36 -0500
To: Justin James <j_james@mindspring.com>
Cc: public-html@w3.org
Message-ID: <643cc0270907061215p69d05087y541e37f7dca54d93@mail.gmail.com>
On Mon, Jul 6, 2009 at 1:08 PM, Justin James<j_james@mindspring.com> wrote:
>> -----Original Message-----
>> From: public-html-request@w3.org [mailto:public-html-request@w3.org] On
>> Behalf Of Shelley Powers
>> Sent: Monday, July 06, 2009 1:26 PM
>> To: Lachlan Hunt
>> Cc: Sam Ruby; public-html@w3.org; www-archive
>> Subject: Re: How to make complex data tables more accessible to screen-
>> reader users
>>
>> So my own hypothesis is that when it comes to the authors, focusing
>> purely on the markup is ineffective because none of the solutions
>> addresses the real problems: confusion about what needs to be
>> provided, a lack of good, clear-cut examples, and the need for
>> encouragement from the web community at large and increased
>> encouragement from the accessibility community.
>>
>> And again, the only way to really test this, objectively, is to get
>> volunteers from the web community at large, develop tests for each of
>> the approaches, have the people perform the tasks in controlled
>> environments, observe the behavior, the results, and also provide
>> questionnaires to test web authors' preferences, understanding, etc.
>
> Usability expert Jakob Nielsen has said on quite a number of occasions that
> after doing tests on about 5 people, further tests refine the data, but the
> overall direction is already well known. I believe that tests that you
> describe do not need to be nearly as in depth, expensive, or as rigorous as
> you describe. Remember, we're studying human behavior, and "in the lab"
> behavior is much less important than actual "in the wild" behavior for our
> goals.
>

With all due respect to Jakob Nielsen, five people is too small a
sample, unless you're just trying to confirm your own beliefs rather
than find the truth.

> Here is how I would go about a proper test of the "usability" of the HTML 5
> spec regarding the @summary issue:
>
> 1. Put together a number of different spec texts regarding @summary, each
> representing one of the many viewpoints on this subject. Put this text
> within the context of a subsection of the HTML 5 table specs. So the only
> difference between the table specs is the @summary information.
>
> 2. Gather a group of 5 - 10 volunteers for each of the proposed specs. Each
> volunteer should be experienced with HTML 4, but have no connection with
> this group (in other words, have none of the "context" around the @summary
> discussion).
>
> 3. Present each volunteer with a set of 3 - 5 tasks, involving table
> creation. Give them an HTML template already made, so that all they need to
> do is add the table. Each task should have the data that needs to go into
> the table, and guidelines regarding captions/labels (but not explicitly say,
> "use this caption"). For example: "Make a table showing the sales for each
> of four different fruits. For each fruit, you know the name, country of
> origin, number of pounds sold, price per pound, and total dollar sales
> amount." That's it. No mention of accessibility, etc. Each volunteer may use
> whatever tools they prefer (WYSIWYG editors, plain text editors, etc.).
>
> 4. Examine the HTML created by each group, and see if the users created a
> table that truly meets the accessibility needs. Was @summary provided? Was
> it truly useful? What about other basic table elements, like <th>? Were they
> used appropriately?
>
> 5. Follow up with specific questions, such as "after reading the spec
> provided, why did you choose to use @summary in the way that you did?" or
> "what benefit do you expect @summary to provide to what kinds of users,
> based on the spec that you read?"
>
> I think that this would take no more than 1 hour of each volunteer's time,
> could be conducted remotely (via a screen share, Web conference, phone call,
> etc.) and therefore not require special usability labs, circumstances, etc.
>

Again, too small a sample, self-selecting audience, no controls.

A usability test is typically iterative: you develop a hypothesis,
create tests, obtain a random sample, run the tests, discuss the
results, fine-tune the hypothesis, run the tests again, and so on.

You don't have to take my word, you can read the government guidelines
on usability testing at http://www.usability.gov/refine/learnusa.html.

I'm waiting to see if the W3C may have such a lab.

If they do, then we have to establish which parameters we're
testing, create tests for each, determine how to get a good,
representative sample of the audience, including both authors
and users, and then derive appropriately unbiased questionnaires. And
then be prepared to iterate through the tests multiple times.

If this were easy, usability experts would not be paid the big bucks.

>> Accessing existing web pages is good for developing hypotheses, makes
>> good anecdotal information, and can help determine where problems
>> might arise. But it can't usefully be used to provide a definitive
>> conclusion about what exactly is the root cause of the problems. Why?
>> Because the data arises in the wild, not within a carefully controlled
>> environment, where external factors can be filtered.
>
> At the same time, for our purposes, "in the wild" results are the only ones
> that matter in the end. Ian's "dry science fiction" phrase, and all of that.

I disagree.

>
>> Sure, but I can't see this group being able to develop truly effective
>> usability studies within the short time we have, and neither do I see
>> any indication that we have the resources to conduct the proper tests.
>
> I think the test I outlined above serves as a perfectly fine example. It
> took me less than 10 minutes to devise and type up.

And probably would only take about 10 minutes to tear the results apart.

>
>> I may be wrong: chairs, does the W3C have a usability lab?
>
> You really do not need one for this circumstance. It is not like we are
> doing eye tracking studies, need to record user interaction, time the tasks,
> or any of the other items that a usability lab is used for.

A lab, in this case, is more of an experienced organization that is
willing to devise and operate tests independently of members of this
group and others who have a bias in the results.

Remember, we're talking about the future of the web, and the
accessibility of the web. I'm not willing to trust both to something
slapped quickly together. I don't think anyone really is. I would hope
not.


>
>> But you can't filter out all of the extraneous environmental factors
>> when you're working with data in the wild.
>
> I agree that you need to control as many factors as possible (which the test
> I outline above does). At the same time, you need to study the way people
> *actually* work, which means at their own desk on their own equipment, using
> the workflow that they feel comfortable with. That's how a "listening lab"
> works: instead of giving the user precise instructions ("use the search
> feature to find out the company's yearly revenues"), you provide the users
> with a goal ("find the company's yearly revenues") and study how they go
> about the task, listening to their narrative as they explain why they are
> doing what they are doing.
>

I agree, especially when doing user testing.

>> Even if we create a few web pages and ask people to try them out or
>> try editing something and look at the results, we aren't conducting
>> the study in a controlled environment, nor are we including safeguards
>> to ensure bias doesn't creep into the result.
>
> You are setting the bar not only far higher than it needs to be, but at a
> level that actually invalidates the results.
>

Again, see above: the future of the web, and so on.

How am I invalidating the results?


>> Normally, I don't think we would need this level of rigor, but when
>> you have such a strong disagreement in the group, you have to go the
>> extra distance to ensure truly comprehensive and inclusive results.
>
> I agree, if for nothing else than to make sure that if the debate is not
> actually ended, there is a set of data which we can at least all agree is
> valid for discussion.
>

Sure.

>> But again--and I'm not sure I'm saying this clearly, so inviting
>> others to rephrase or add to what I'm saying--we don't fully
>> understand the root causes of the data that we do see.
>
> I think that this is a fair assessment. The test I describe above makes sure
> that the test subjects are fully informed regarding @summary, by providing
> them with the relevant subset of the proposed HTML 5 spec, without
> highlighting @summary to the point where subjects know what it is we are
> looking for. At the same time, "all else is equal" (other than their work
> environments), so we will be able to easily determine why @summary was or
> was not used as needed.
>

Again, though, a test slapped together by members of this group, and
interpreted by same, is not going to cut it. We all have our biases
we're bringing to the table. We would need an impartial, capable
organization or group of people who not only know how to put together
the proper tests, but how to interpret the results in as unbiased a
manner as possible.

Again, I'm waiting to see whether the W3C has access to this type
of group or organization. It does have ties to universities, where
such tests are typically run.

> J.Ja
>
>

Shelley

>
Received on Monday, 6 July 2009 19:16:16 UTC