- From: Gregg Vanderheiden <gv@trace.wisc.edu>
- Date: Mon, 3 May 2004 10:51:20 -0500
- To: <w3c-wai-gl@w3.org>
Hi John, all,

Two points.

The question isn't whether it is hard to get agreement at the far ends. It would be easy to get people to agree on really bad and really good text. But the majority of content falls into the category in the middle, and there I think we would get lots of disagreement.

Also, it is hard to determine for 'whom' it is supposed to be easy to read. It would be hard to get agreement even on this, I'm afraid. If this is accessibility, one would presume it meant easy to read for people with disabilities. But which ones do we include and which do we exclude?

This is, I think, part of our problem. It is so hard for any of us to draw a line -- so we want to say "Do as much as you can," but that isn't testable. The best we have come up with is "Consider all these strategies," which is basically the same thing.

We need to talk in specifics, I think, since most of us want to endorse generalities that end up being contradictory when we get to the specifics.

Gregg

-- ------------------------------
Gregg C Vanderheiden Ph.D.
Professor - Ind. Engr. & BioMed Engr.
Director - Trace R & D Center
University of Wisconsin-Madison


-----Original Message-----
From: w3c-wai-gl-request@w3.org [mailto:w3c-wai-gl-request@w3.org] On Behalf Of John M Slatin
Sent: Monday, May 03, 2004 8:57 AM
To: Charles McCathieNevile; Gregg Vanderheiden
Cc: w3c-wai-gl@w3.org
Subject: RE: Definition of human testability

I think Gregg's assertion that it would be impossible for a group of users to agree whether a particular text was "written clearly" or not is incorrect. It might be difficult to get 100% agreement among a large group of reviewers, but it might well be possible to get 80% or even 90% agreement in some situations.

There are various ways of doing such things. For example, a group of readers might be brought together and asked to rate a series of documents for clarity, on a scale of 1 (very unclear) to 5 (very clear). In the first round, I would expect substantial disagreement, especially concerning longer and more complex documents. Over time, and after discussion in which the raters talked with each other about what they were looking for, I would expect increasing agreement to emerge, even about the longer and more complex documents.

There are a number of situations in which this sort of rating takes place. Portfolio-based assessment of student learning, for example, depends heavily on the ability of multiple readers to agree on evidence of learning demonstrated across a portfolio of student work. My colleague, Peg Syverson, maintains an excellent Web site about the Learning Record Online [1] that includes the complex scales used to represent learning across five dimensions, as well as a very comprehensive bibliography.

In the US, students seeking admission to university take a standardized examination that includes a written component. The essays that students produce in response to exam prompts are read by people who've been trained in the "holistic" scoring techniques I mentioned.

John

[1] http://www.cwrl.utexas.edu/~syverson/olr/evaluation.html

"Good design is accessible design."
Please note our new name and URL!

John Slatin, Ph.D.
Director, Accessibility Institute
University of Texas at Austin
FAC 248C
1 University Station G9600
Austin, TX 78712
ph 512-495-4288, f 512-495-4524
email jslatin@mail.utexas.edu
web http://www.utexas.edu/research/accessibility/


-----Original Message-----
From: w3c-wai-gl-request@w3.org [mailto:w3c-wai-gl-request@w3.org] On Behalf Of Charles McCathieNevile
Sent: Sunday, May 02, 2004 1:56 am
To: Gregg Vanderheiden
Cc: w3c-wai-gl@w3.org
Subject: RE: Definition of human testability

Sorry, my point was more subtle. I understand the general principle of agreeing on the results of tests. (This is why I participate in EuroAccessibility under my Sidar hat - Sidar thinks it is very important that we clarify and agree on how to test WCAG 1, at least across Europe.) But we often seem to be saying "for things that aren't machine testable, we want to get consistent results from people", and I just wanted to clarify that we expect all tests to be testable by people, with consistent results, including those that are often automated.

For example, validation of HTML and XHTML code is something that machines do more efficiently than people, and on average more accurately. But there are bugs from time to time in validators, which a group of people who know the relevant specification can all identify. In such a case, I believe we want to say that the people are right and the machine test is wrong. Otherwise it will be necessary to identify the particular machine tests we trust, which I think will add about 2 years to the timeline...

cheers

Chaals

On Sat, 1 May 2004, Gregg Vanderheiden wrote:

>I'm not following you, Charles.
>
>What this says is that all success criteria must be reliably testable. That is, we can't have success criteria like "Write clearly", since 10 users would differ on what constituted 'clearly'. The test cannot be more specific than the guideline, so all the testers would have to go on is their own training for what constituted 'clearly'.
>
>NOTE: it is not yet clear whether all of the SC we have are specific enough to be reliably testable without referring to technology-specific checklists. But that is another discussion. Hopefully we can make them specific enough in the doc. The consensus, though, is just that we should not have anything listed in the SC category that an author cannot reliably determine (or have determined) that they have met.
>
>Make sense now? If not, then we need to figure out how to word it better.
>
>Gregg
>
> -- ------------------------------
>Gregg C Vanderheiden Ph.D.
>Professor - Ind. Engr. & BioMed Engr.
>Director - Trace R & D Center
>University of Wisconsin-Madison
>
>
>-----Original Message-----
>From: Charles McCathieNevile [mailto:charles@w3.org]
>Sent: Saturday, May 01, 2004 12:41 PM
>To: Gregg Vanderheiden
>Cc: w3c-wai-gl@w3.org
>Subject: RE: Definition of human testability
>
>This seems backwards. Presumably we believe that all tests will produce consistent results when done by reasonably knowledgeable people, with some of them also being sufficiently simple to automate completely.
>
>Otherwise we have no basis for deciding whether a particular test that a tool does is in fact a valid one or not, and in the case of two conflicting results from tools we would not have any way of declaring which was accurate...
>
>cheers
>
>Chaals
>
>On Thu, 29 Apr 2004, Gregg Vanderheiden wrote:
>
>>Yes
>>
>>That is what is intended, I believe.
>>
>>Your alternative wording #1 is closest. The word "certain" isn't quite right, since it would apply to all of the non-machine-testable items, so it would become:
>>
>>1. In the judgment of the working group members, the success criteria that are not machine testable can be tested by humans in a manner that is capable of yielding consistent results among multiple knowledgeable testers.
>>
>>Gregg
>>
>> -- ------------------------------
>>Gregg C Vanderheiden Ph.D.
>>Professor - Ind. Engr. & BioMed Engr.
>>Director - Trace R & D Center
>>University of Wisconsin-Madison
>>
>> _____
>>
>>From: w3c-wai-gl-request@w3.org [mailto:w3c-wai-gl-request@w3.org] On Behalf Of Sailesh Panchang
>>Sent: Thursday, April 29, 2004 11:22 AM
>>To: w3c-wai-gl@w3.org
>>Subject: Definition of human testability
>>
>>Present draft: "Success criteria for all levels would be testable. Some success criteria may be machine-testable. Others may require human judgment. Success criteria that require human testing would, in the judgment of the working group members, yield consistent results among multiple knowledgeable testers."
>>
>>Comment:
>>Wording of the last sentence is confusing. I believe what is meant is: "judgment of the working group members" applies to identification of criteria that can be tested with consistency and reliability by humans. Right?
>>Do we intend to list these tests?
>>
>>Consider the following alternatives:
>>
>>1. In the judgment of the working group members, certain success criteria can be tested by humans in a manner that is capable of yielding consistent results among multiple knowledgeable testers.
>>
>>2. Claims of conformance to success criteria can be based on human testing if such testing has yielded, or is capable of yielding, consistent results among multiple knowledgeable testers.
>>
>>Sailesh Panchang
>>Senior Accessibility Engineer
>>Deque Systems, 11180 Sunrise Valley Drive,
>>4th Floor, Reston VA 20191
>>Tel: 703-225-0380 Extension 105
>>E-mail: sailesh.panchang@deque.com
>>Fax: 703-225-0387
>>* Look up <http://www.deque.com> *
>>
>
>Charles McCathieNevile  http://www.w3.org/People/Charles  tel: +61 409 134 136
>SWAD-E http://www.w3.org/2001/sw/Europe  fax(france): +33 4 92 38 78 22
> Post: 21 Mitchell street, FOOTSCRAY Vic 3011, Australia or
> W3C, 2004 Route des Lucioles, 06902 Sophia Antipolis Cedex, France


Charles McCathieNevile  http://www.w3.org/People/Charles  tel: +61 409 134 136
SWAD-E http://www.w3.org/2001/sw/Europe  fax(france): +33 4 92 38 78 22
 Post: 21 Mitchell street, FOOTSCRAY Vic 3011, Australia or
 W3C, 2004 Route des Lucioles, 06902 Sophia Antipolis Cedex, France
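The rating exercise John describes can be put into numbers with a simple agreement measure. The sketch below is a minimal illustration only, not anything the working group specified: it assumes each document is scored by the same set of raters on the 1-5 clarity scale he mentions, uses made-up ratings, and reports plain pairwise percent agreement rather than a chance-corrected statistic such as Cohen's or Fleiss' kappa.

```python
from itertools import combinations

def pairwise_agreement(ratings):
    """Average fraction of rater pairs that give identical scores.

    `ratings` is a list of per-document score lists, one score per rater,
    on a 1 (very unclear) to 5 (very clear) clarity scale.
    """
    per_document = []
    for scores in ratings:
        pairs = list(combinations(scores, 2))
        matches = sum(1 for a, b in pairs if a == b)
        per_document.append(matches / len(pairs))
    return sum(per_document) / len(per_document)

# Hypothetical scores: three documents, five raters each, before and after
# the raters discuss with each other what they are looking for.
round_one = [[2, 4, 3, 5, 2], [1, 2, 2, 3, 1], [4, 5, 3, 5, 4]]
round_two = [[3, 4, 3, 4, 3], [1, 2, 2, 2, 1], [4, 5, 4, 5, 4]]

print(f"Round 1 agreement: {pairwise_agreement(round_one):.0%}")
print(f"Round 2 agreement: {pairwise_agreement(round_two):.0%}")
```

With hypothetical scores like these, the measure rises between rounds, which is the pattern John predicts once raters have discussed their criteria; whether real raters reach 80% or 90% would have to be established empirically.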
Received on Monday, 3 May 2004 11:54:19 UTC