Re: How to make complex data tables more accessible to screen-reader users

>> With all due respect to Jakob Nielsen, five people is too small a
>> sample, unless you're just out to prove your own beliefs rather
>> than trying to find the truth.
>
> From the usability.gov Web site that you quote:
> "A typical range is from 8 to 16 (per user group) each test." Nielsen says 5
> is really the minimum (which is the point I conveyed, I hope). Regardless,
> the point is, this does not need to be a huge deal.

And if Dr. Nielsen designed and conducted the test, perhaps 5 authors
and 5 users would be sufficient. Perhaps.

But none of us has Dr. Nielsen's experience.

>
>> > Here is how I would go about a proper test of the "usability" of the
>> > HTML 5 spec regarding the @summary issue:
>> >
>> > 1. Put together a number of different spec texts regarding @summary,
>> > each representing one of the many viewpoints on this subject. Put
>> > this text within the context of a subsection of the HTML 5 table
>> > specs. So the only difference between the table specs is the @summary
>> > information.
>> >
>> > 2. Gather a group of 5 - 10 volunteers for each of the proposed
>> > specs. Each volunteer should be experienced with HTML 4, but have no
>> > connection with this group (in other words, have none of the
>> > "context" around the @summary discussion).
>> >
>> > 3. Present each volunteer with a set of 3 - 5 tasks involving table
>> > creation. Give them an HTML template already made, so that all they
>> > need to do is add the table. Each task should have the data that
>> > needs to go into the table, and guidelines regarding captions/labels
>> > (but do not explicitly say, "use this caption"). For example: "Make a
>> > table showing the sales for each of four different fruits. For each
>> > fruit, you know the name, country of origin, number of pounds sold,
>> > price per pound, and total dollar sales amount." That's it. No
>> > mention of accessibility, etc. Each volunteer may use whatever tools
>> > they prefer (WYSIWYG editors, plain text editors, etc.).
>> >
>> > 4. Examine the HTML created by each group, and see if the users
>> > created a table that truly meets the accessibility needs. Was
>> > @summary provided? Was it truly useful? What about other basic table
>> > elements, like <th>? Were they used appropriately?
>> >
>> > 5. Follow up with specific questions, such as "after reading the spec
>> > provided, why did you choose to use @summary in the way that you
>> > did?" or "what benefit do you expect @summary to provide to what
>> > kinds of users, based on the spec that you read?"
>> >
>> > I think that this would take no more than 1 hour of each volunteer's
>> > time, could be conducted remotely (via a screen share, Web
>> > conference, phone call, etc.) and therefore not require special
>> > usability labs, circumstances, etc.
>> >
>>
>> Again, too small a sample, self-selecting audience, no controls.
>
> There are plenty of controls (in this case, the information provided varies
> only with what is being studied and the tasks to be performed), the audience
> is going to be self-selecting no matter what (unless we somehow force people
> to participate), and the sample can be 10 people or 20 people or however
> many people.
>

If the people were recruited based on their membership in other
efforts, the way students get recruited at universities all the time,
and they're only asked if they would be willing to participate in _a_
study or test, without prior knowledge of what the test was about,
then the sampling becomes less self-selecting.


>> A usability test typically is iterative, in that you develop a
>> hypothesis, create tests, obtain random sampling, run tests, discuss
>> results, fine-tune the hypothesis, run tests again, and so on.
>
> And the test I propose is merely iterative in a parallel fashion; it tests a
> number of different hypotheses ("this version of the text is the best")
> simultaneously.
>

That's what I meant by iterative.

>> I'm waiting to see if the W3C may have such a lab.
>
> I highly doubt it. From what I can tell, the W3C does not have that level of
> budget or facilities. Remember, they can't even pay someone to work on HTML
> 5 full time, and it is their "crown jewel" spec.
>

The W3C has responded: it doesn't have the facilities.

>> >> Accessing existing web pages is good for developing hypotheses,
>> >> makes for good anecdotal information, and can help determine where
>> >> problems might arise. But it can't usefully be used to provide a
>> >> definitive conclusion about what exactly is the root cause of the
>> >> problems. Why? Because the data arises in the wild, not within a
>> >> carefully controlled environment, where external factors can be
>> >> filtered.
>> >
>> > At the same time, for our purposes, "in the wild" results are the
>> > only ones that matter in the end. Ian's "dry science fiction" phrase,
>> > and all of that.
>>
>> I disagree.
>
> I think you misunderstood me. When I said "in the wild results" there, I
> meant that the only thing that matters in the end is what authors do when
> the spec is released. If that was your understanding of my statement, and
> you still disagree, I would be fairly baffled. A spec which is not adhered
> to is pretty useless, regardless of the reason it is not used.
>

No, that's not the only thing that matters.

The results have to be usable by the audience at which they're
targeted. It doesn't matter whether authors like or dislike something
if no one benefits from their work.

There are two constituencies for HTML5 -- producers and consumers --
and there are variations within both groups.

>> >> Sure, but I can't see this group being able to develop truly
>> >> effective usability studies within the short time we have, and
>> >> neither do I see any indication that we have the resources to
>> >> conduct the proper tests.
>> >
>> > I think the test I outlined above serves as a perfectly fine
>> > example. It took me less than 10 minutes to devise and type up.
>>
>> And probably would only take about 10 minutes to tear the results
>> apart.
>
> Fine, then tear them apart here. Show why (and please be specific)
> this test is not good.
>
> In fact, do better. Present your own test. You keep saying "not good enough,
> not good enough" to everything presented on this topic. You have stated
> multiple times what standards you would hold a valid (in your mind) test to.
> Put your money where your mouth is. Present a test (assuming that W3C has a
> usability lab, and assuming that they do not have a usability lab) that
> meets your standards. Until you are willing to do so, I am not sure where
> else this conversation can go.
>

I believe I have proposed how tests could be run in such a way that
there's no doubt about the validity of the results. I also believe it
will take more than ten minutes to devise such a test.

>> >
>> >> I may be wrong: chairs, does the W3C have a usability lab?
>> >
> You really do not need one for this circumstance. It is not like we
> are doing eye tracking studies, need to record user interaction, time
> the tasks, or any of the other items that a usability lab is used for.
>>
>> A lab, in this case, is more of an experienced organization that is
>> willing to devise and run tests independently of members of this
>> group and others who have a bias in the results.
>
> In which case, the W3C most likely does *not* have one. They should have
> people with usability experience participating in some of their efforts, but
> I highly doubt that they are organized into a group focused on usability.
>
>> Remember, we're talking about the future of the web, and the
>> accessibility of the web. I'm not willing to trust both to something
>> slapped quickly together. I don't think anyone really is. I would hope
>> not.
>
> The history of the Web consists almost entirely of duct tape, super glue,
> and coat hanger engineering, total seat of the pants stuff. That's the
> reason why we are in this bind to begin with. I am in complete agreement
> that we need something better than what we have. But if you really want to
> wait until we have something perfect, we will never have anything. I would
> rather see us do what *can* be done in the few months remaining until Last
> Call, than to do nothing.
>

I think that you've gone off on a tangent here. I don't think anyone
thinks summary is the 'perfect' solution. I believe the PF group and
others have asked that it be retained until something better is
proposed. I don't want to speak for those folks, but I don't think
anyone is talking perfection here, just what we can manage
considering Last Call is this fall.

From my reading, the accessibility folks do not believe making summary
non-conforming and telling people to put what was summary into caption
is a better approach than the two separate text containers we have
now. I concur, specifically because it limits people's choices, and it
redefines caption. I think it is a wrong approach from two different
points of view: the loss of choice, and the redefinition of an
existing, long-standing element.
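
To make the distinction concrete, here is a rough sketch of the two
containers, borrowing the fruit-sales task quoted earlier (invented
data and my own markup, not the PF group's proposed text):

  <table summary="Sales figures for four fruits. Each row is one
    fruit; columns give country of origin, pounds sold, price per
    pound, and total dollar sales.">
    <caption>Fruit Sales, First Half of 2009</caption>
    <tr>
      <th scope="col">Fruit</th>
      <th scope="col">Origin</th>
      <th scope="col">Pounds sold</th>
      <th scope="col">Price per pound</th>
      <th scope="col">Total</th>
    </tr>
    <tr>
      <td>Apples</td>
      <td>USA</td>
      <td>1,200</td>
      <td>$0.89</td>
      <td>$1,068.00</td>
    </tr>
    <!-- ...one row for each remaining fruit... -->
  </table>

The caption is rendered for every reader; the summary is voiced only
by assistive technology, and describes the table's structure. Fold
the second into the first, and sighted users get the structural
description whether they want it or not. That is the loss of choice
I'm talking about.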

Ian has documented several different approaches in the HTML 5
document. If we were to add @summary to that list as a conforming
attribute, and ensure proper documentation of all approaches (and a
couple of good examples), we could, I believe, reach consensus. We
would be giving authors options, and we wouldn't be adding any new
burden to any user agent, I don't believe.
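
As a sketch of what those options might look like side by side (my
own illustration, not wording from Ian's document or the PF group's
text), the same structural description could live in surrounding
prose or in the attribute, as the author judges best:

  <!-- Option A: describe the table in the prose that precedes it -->
  <p>The table below lists each fruit's origin, pounds sold, and
  total sales, one row per fruit.</p>
  <table>
    <caption>Fruit Sales</caption>
    <!-- header and data rows as in the example above -->
  </table>

  <!-- Option B: put the same description in @summary, where only
       assistive technology will announce it -->
  <table summary="One row per fruit; columns give origin, pounds
    sold, and total sales.">
    <caption>Fruit Sales</caption>
    <!-- header and data rows as in the example above -->
  </table>

Different tables and different authors call for different choices;
the spec's job would be to document when each fits, not to take one
of them away.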

I believe that the PF group even provided the text to document
@summary in the spec, and Josh provided an excellent example from
Juicy Studio.

So, no, I'm not advocating doing nothing.


>> >> Even if we create a few web pages and ask people to try them out or
>> >> try editing something and look at the results, we aren't conducting
>> >> the study in a controlled environment, nor are we including
>> >> safeguards to ensure bias doesn't creep into the result.
>> >
>> > You are setting the bar not only far higher than it needs to be, but
>> > at a level that actually invalidates the results.
>> >
>>
>> Again, reference above, future of the web, etc. etc.
>>
>> How am I invalidating the results?
>
> Without seeing your proposal for a test that you would accept, the
> appearance is that you want a test in an environment so "sterile" (free of
> outside factors) that it resembles in no way what actual HTML authors
> actually do. One of the oddities of testing is that when you make it clear
> what is being tested, people change their behaviors. This is the fundamental
> problem with self-reporting ("how many times in the last year have you taken
> illegal drugs?") as well. In this case, if you make it clear that what you
> are studying is the usage of @summary, users will be using it differently. Or
> if you control factors so much that the users are not comfortable, they won't
> work the way they normally would.
>

I don't believe I advocated a sterile environment, only an environment
free from biasing influences.

>> >> But again--and I'm not sure I'm saying this clearly, so inviting
>> >> others to rephrase or add to what I'm saying--we don't fully
>> >> understand the root causes of the data that we do see.
>> >
>> > I think that this is a fair assessment. The test I describe above
>> > makes sure that the test subjects are fully informed regarding
>> > @summary, by providing them with the relevant subset of the proposed
>> > HTML 5 spec, without highlighting @summary to the point where
>> > subjects know what it is we are looking for. At the same time, "all
>> > else is equal" (other than their work environments), so we will be
>> > able to easily determine why @summary was or was not used as needed.
>> >
>>
>> Again, though, a test slapped together by members of this group, and
>> interpreted by same, is not going to cut it. We all have our biases
>> we're bringing to the table. We would need an impartial, capable
>> organization or group of people who not only know how to put together
>> the proper tests, but also how to interpret the results in as unbiased a
>> manner as possible.
>
> I would love to see the same. The time constraints make this a bit tough.
> The budget constraints (as in, "what budget?") make this an impossibility,
> unless we can find a usability lab to donate their services. Until then, the
> best you will be able to see is a usability test generated by this group and
> performed on friends/colleagues of people on this list.
>

I believe I have stated this earlier. And no, I will not accept an
informal test conducted by people in this group.


>> Again, I'm waiting to see whether the W3C has access to this type
>> of group or organization. It does have ties to universities, where
>> such tests are typically run.
>
> The kind of testing you see in university settings is more of the eye
> tracking type of study, which has applications and interest outside of
> product improvement, such as psychology and cognitive science. I may be
> mistaken, of course. Even then, it would still require funding, which
> this organization does not have.

I've been involved in these tests at universities, and they go beyond
just tracking eye movement.

Agreed, this organization does not have funding. I believe, then, that
we can put aside discussions about testing and focus, instead, on a
compromise that we can all agree to.


>
> J.Ja
>
>

Shelley
