- From: Gregg Vanderheiden <GV@TRACE.WISC.EDU>
- Date: Tue, 4 Dec 2001 17:21:29 -0600
- To: "'Charles McCathieNevile'" <charles@w3.org>
- Cc: "'GLWAI Guidelines WG (GL - WAI Guidelines WG)'" <w3c-wai-gl@w3.org>
Absolutely. Test cases (both selected and random) need to be a key part of our evaluation process. In fact, the procedure I think you are suggesting is just what has been discussed, though not formalized. So let's take this opportunity to begin that process. Let me pose the following to begin discussion.

1 - create a collection of representative (as much as there is such a thing) pages or sites that sample the RANGE of different pages, approaches and technologies on the Web.

2 - look at the items (particularly success criteria) - identify any additional sample pages or sites needed to explore each item (if the existing sample is not good enough to do so).

3 - run quick tests by team members with these stimuli to see if there is agreement. If the team agrees that an item fails, work on it. If it passes the team review or is ambiguous, then move on to testing with an external sample of people while fixing any problems identified in the internal screening test.

4 - proceed in this manner to keep improving items and learning about objectivity or agreement as we move toward the final version and final testing.

5 - in parallel with the above, keep looking at the items with the knowledge we acquire and work to make the items stronger.

The key to this is the Test Case Page Collection. We have talked about this, but no one has stepped forward to help build it. Can we form a side team to work on this?

NOTE: the above is a VERY rough description of a procedure, written as I run to a meeting, but I would like to see if we can get this ball rolling. Comments and suggestions welcome.

Gregg

--
------------------------------
Gregg C Vanderheiden Ph.D.
Professor - Human Factors
Dept of Ind. Engr. - U of Wis.
Director - Trace R & D Center
Gv@trace.wisc.edu <mailto:Gv@trace.wisc.edu>, <http://trace.wisc.edu/>
FAX 608/262-8848
For a list of our listserves send “lists” to listproc@trace.wisc.edu <mailto:listproc@trace.wisc.edu>

-----Original Message-----
From: Charles McCathieNevile [mailto:charles@w3.org]
Subject: Re: "objective" clarified

<snip>
I think that for an initial assessment the threshold of 80% is fine, and I think that as we get closer to making this a final version we should be lifting that requirement to about 90 or 95%.

However, I don't think that it is very useful to think about whether people would agree in the absence of test cases. There are some things where it is easy to describe the test in operational terms. There are others where it is difficult to describe the test in operational terms, but it is easy to get substantial agreement. (The famous "I don't know how to define illustration, but I recognise it when I see it" explanation.)

It seems to me that the time spent in trying to imagine whether we would agree on a test would be more usefully spent in generating test cases, which we can then use to very quickly find out if we agree or not. The added value is that we then have those available as examples to show people - when it comes to people being knowledgeable of the tests and techniques, they will have the head start of having seen real examples and what the working group thought about them as an extra guide.
<snip>
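For concreteness, here is a minimal sketch of how the agreement threshold Charles mentions (80% for an initial assessment, roughly 90-95% closer to a final version) might be tallied for a single success criterion across a set of test pages, assuming each reviewer records a simple pass/fail verdict per page. The data layout, evaluator names, and page identifiers are illustrative assumptions, not anything the working group has defined.

```python
from collections import Counter

# Hypothetical verdicts for ONE success criterion:
# verdicts[page][evaluator] = "pass" or "fail"
verdicts = {
    "test-page-01": {"eval-a": "pass", "eval-b": "pass", "eval-c": "fail"},
    "test-page-02": {"eval-a": "fail", "eval-b": "fail", "eval-c": "fail"},
}

def agreement_rate(page_verdicts):
    """Fraction of evaluators who share the majority verdict for one page."""
    counts = Counter(page_verdicts.values())
    majority = counts.most_common(1)[0][1]
    return majority / len(page_verdicts)

# 0.80 for initial screening; could be raised toward 0.90-0.95 as an item
# nears its final form, per the quoted message.
THRESHOLD = 0.80

for page, page_verdicts in verdicts.items():
    rate = agreement_rate(page_verdicts)
    status = "ok" if rate >= THRESHOLD else "needs work"
    print(f"{page}: agreement {rate:.0%} -> {status}")
```

Tallying per page (rather than pooling all verdicts) keeps the ambiguous test cases visible, which is the point of step 3: items whose pages fall below the threshold go back for rework before any external testing.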
Received on Tuesday, 4 December 2001 18:28:48 UTC