- From: Justin James <j_james@mindspring.com>
- Date: Mon, 6 Jul 2009 16:00:36 -0400
- To: "'Shelley Powers'" <shelley.just@gmail.com>
- Cc: <public-html@w3.org>
> -----Original Message-----
> From: Shelley Powers [mailto:shelley.just@gmail.com]
> Sent: Monday, July 06, 2009 3:16 PM
> To: Justin James
> Cc: public-html@w3.org
> Subject: Re: How to make complex data tables more accessible to screen-reader users
>
> With all due respect to Jakob Nielsen, five people is too small a
> sample. Not unless you're just proving your own beliefs, not
> necessarily trying to find truth.

From the usability.gov Web site that you quote: "A typical range is from
8 to 16 (per user group) each test." Nielsen says 5 is really the minimum
(which is the point I conveyed, I hope). Regardless, the point is, this
does not need to be a huge deal.

> > Here is how I would go about a proper test of the "usability" of the
> > HTML 5 spec regarding the @summary issue:
> >
> > 1. Put together a number of different spec texts regarding @summary,
> > each representing one of the many viewpoints on this subject. Put
> > this text within the context of a subsection of the HTML 5 table
> > specs. So the only difference between the table specs is the
> > @summary information.
> >
> > 2. Gather a group of 5 - 10 volunteers for each of the proposed
> > specs. Each volunteer should be experienced with HTML 4, but have no
> > connection with this group (in other words, have none of the
> > "context" around the @summary discussion).
> >
> > 3. Present each volunteer with a set of 3 - 5 tasks involving table
> > creation. Give them an HTML template already made, so that all they
> > need to do is add the table. Each task should have the data that
> > needs to go into the table, and guidelines regarding captions/labels
> > (but not explicitly say, "use this caption"). For example: "Make a
> > table showing the sales for each of four different fruits. For each
> > fruit, you know the name, country of origin, number of pounds sold,
> > price per pound, and total dollar sales amount." That's it. No
> > mention of accessibility, etc. Each volunteer may use whatever tools
> > they prefer (WYSIWYG editors, plain text editors, etc.).
> >
> > 4. Examine the HTML created by each group, and see if the users
> > created a table that truly meets the accessibility needs. Was
> > @summary provided? Was it truly useful? What about other basic table
> > elements, like <th>? Were they used appropriately?
> >
> > 5. Follow up with specific questions, such as "after reading the
> > spec provided, why did you choose to use @summary in the way that
> > you did?" or "what benefit do you expect @summary to provide to what
> > kinds of users, based on the spec that you read?"
> >
> > I think that this would take no more than 1 hour of each volunteer's
> > time, could be conducted remotely (via a screen share, Web
> > conference, phone call, etc.) and therefore not require special
> > usability labs, circumstances, etc.
>
> Again, too small a sample, self-selecting audience, no controls.

There are plenty of controls (in this case, the information provided
varies only with what is being studied and the tasks to be performed),
the audience is going to be self-selecting no matter what (unless we
somehow force people to participate), and the sample can be 10 people or
20 people or however many people.
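To make step 4 concrete, here is a minimal sketch of the kind of markup
we would hope to see from a volunteer given the fruit-sales task in
step 3. The fruit names, figures, and @summary wording below are invented
purely for illustration; the point is only that @summary, <caption>, and
scoped <th> cells are all present and actually say something useful:

<!-- Illustrative sample only: invented data, two of the four fruit rows
     shown. One plausible volunteer response to the step 3 task. -->
<table summary="Sales figures for four fruits. Each row gives one fruit's
                country of origin, pounds sold, price per pound, and
                total dollar sales.">
  <caption>Fruit sales</caption>
  <tr>
    <th scope="col">Fruit</th>
    <th scope="col">Country of origin</th>
    <th scope="col">Pounds sold</th>
    <th scope="col">Price per pound</th>
    <th scope="col">Total sales</th>
  </tr>
  <tr>
    <td>Apples</td>
    <td>USA</td>
    <td>1,200</td>
    <td>$0.89</td>
    <td>$1,068.00</td>
  </tr>
  <tr>
    <td>Bananas</td>
    <td>Ecuador</td>
    <td>2,500</td>
    <td>$0.49</td>
    <td>$1,225.00</td>
  </tr>
  <!-- ...remaining two fruit rows omitted... -->
</table>

In step 4, a table along these lines would pass; a table with no <th>
cells, or a @summary that merely repeats the caption, would count as
"provided but not truly useful."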
> A usability test typically is iterative, in that you develop a
> hypothesis, create tests, obtain random sampling, run tests, discuss
> results, fine tune hypothesis, run tests again, and so on.

And the test I propose is merely iterative in a parallel fashion; it
tests a number of different hypotheses ("this version of the text is the
best") simultaneously.

> I'm waiting to see if the W3C may have such a lab.

I highly doubt it. From what I can tell, the W3C does not have that
level of budget or facilities. Remember, they can't even pay someone to
work on HTML 5 full time, and it is their "crown jewel" spec.

> >> Accessing existing web pages is good for developing hypotheses,
> >> makes good anecdotal information, and can help determine where
> >> problems might arise. But it can't usefully be used to provide a
> >> definitive conclusion about what exactly is the root cause of the
> >> problems. Why? Because the data arises in the wild, not within a
> >> carefully controlled environment, where external factors can be
> >> filtered.
> >
> > At the same time, for our purposes, "in the wild" results are the
> > only ones that matter in the end. Ian's "dry science fiction"
> > phrase, and all of that.
>
> I disagree.

I think you misunderstood me. When I said "in the wild" results there, I
meant that the only thing that matters in the end is what authors do
when the spec is released. If that was your understanding of my
statement, and you still disagree, I would be fairly baffled. A spec
which is not adhered to is pretty useless, regardless of the reason it
is not used.

> >> Sure, but I can't see this group being able to develop truly
> >> effective usability studies within the short time we have, and
> >> neither do I see any indication that we have the resources to
> >> conduct the proper tests.
> >
> > I think the test I outlined above serves as a perfectly fine
> > example. It took me less than 10 minutes to devise and type up.
>
> And probably would only take about 10 minutes to tear the results
> apart.

Fine, then tear them apart here. Show why (and please be specific) this
test is not good. In fact, do better. Present your own test. You keep
saying "not good enough, not good enough" to everything presented on
this topic. You have stated multiple times what standards you would hold
a valid (in your mind) test to. Put your money where your mouth is.
Present a test (assuming that the W3C has a usability lab, and assuming
that they do not have a usability lab) that meets your standards. Until
you are willing to do so, I am not sure where else this conversation can
go.

> >> I may be wrong: chairs, does the W3C have a usability lab?
> >
> > You really do not need one for this circumstance. It is not like we
> > are doing eye tracking studies, need to record user interaction,
> > time the tasks, or any of the other items that a usability lab is
> > used for.
>
> A lab, in this case, is more of an experienced organization that is
> willing to devise and operate tests outside of members of this group
> and others who have a bias in the results.

In which case, the W3C most likely does *not* have one. They should have
people with usability experience participating in some of their efforts,
but I highly doubt that they are organized into a group focused on
usability.

> Remember, we're talking about the future of the web, and the
> accessibility of the web. I'm not willing to trust both to something
> slapped quickly together. I don't think anyone really is. I would hope
> not.

The history of the Web consists almost entirely of duct tape, super
glue, and coat hanger engineering, total seat-of-the-pants stuff. That's
the reason why we are in this bind to begin with.
I am in complete agreement that we need something better than what we
have. But if you really want to wait until we have something perfect, we
will never have anything. I would rather see us do what *can* be done in
the few months remaining until Last Call than do nothing.

> >> Even if we create a few web pages and ask people to try them out or
> >> try editing something and look at the results, we aren't conducting
> >> the study in a controlled environment, nor are we including
> >> safeguards to ensure bias doesn't creep into the result.
> >
> > You are setting the bar not only far higher than it needs to be, but
> > at a level that actually invalidates the results.
> >
> Again, reference above, future of the web, etc. etc.
>
> How am I invalidating the results?

Without seeing your proposal for a test that you would accept, the
appearance is that you want a test in an environment so "sterile" (free
of outside factors) that it in no way resembles what HTML authors
actually do. One of the oddities of testing is that when you make it
clear what is being tested, people change their behaviors. This is the
fundamental problem with self-reporting ("how many times in the last
year have you taken illegal drugs?") as well. In this case, if you make
it clear that what you are studying is the usage of @summary, users will
be using it differently. Or if you control factors so much that the
users are not comfortable, they won't work the way they normally would.

> >> But again--and I'm not sure I'm saying this clearly, so inviting
> >> others to rephrase or add to what I'm saying--we don't fully
> >> understand the root causes of the data that we do see.
> >
> > I think that this is a fair assessment. The test I describe above
> > makes sure that the test subjects are fully informed regarding
> > @summary, by providing them with the relevant subset of the proposed
> > HTML 5 spec, without highlighting @summary to the point where
> > subjects know what it is we are looking for. At the same time, "all
> > else is equal" (other than their work environments), so we will be
> > able to easily determine why @summary was or was not used as needed.
> >
> Again, though, a test slapped together by members of this group, and
> interpreted by same, is not going to cut it. We all have our biases
> we're bringing to the table. We would need an impartial, capable
> organization or group of people who not only know how to put together
> the proper tests, but how to interpret the results in as unbiased a
> manner as possible.

I would love to see the same. The time constraints make this a bit
tough. The budget constraints (as in, "what budget?") make this an
impossibility, unless we can find a usability lab to donate their
services. Until then, the best you will be able to see is a usability
test generated by this group and performed on friends/colleagues of
people on this list.

> Again, I'm waiting to see if the W3C doesn't have access to this type
> of group or organization. It does have ties to universities, where
> such tests are typically run.

The kind of testing you see in university settings is more of the
eye-tracking type of study, which has applications and interest outside
of product improvement, such as in psychology and cognitive science. I
may be mistaken, of course. Even then, it would still require funding
which this organization does not have.

J.Ja
Received on Monday, 6 July 2009 20:01:52 UTC