RE: Definition of human testability

From: John M Slatin <john_slatin@austin.utexas.edu>
Date: Mon, 3 May 2004 08:57:28 -0500
Message-ID: <C46A1118E0262B47BD5C202DA2490D1A1E3122@MAIL02.austin.utexas.edu>
To: "Charles McCathieNevile" <charles@w3.org>, "Gregg Vanderheiden" <gv@trace.wisc.edu>
Cc: <w3c-wai-gl@w3.org>

I think Gregg's assertion that it would be impossible for a group of users
to agree whether a particular text was "written clearly" or not is

It might be difficult to get 100% agreement among a large group of
reviewers.  But it might well be possible to get 80% or even 90%
agreement in some situations.

There are various ways of doing such things.  For example, a group of
readers might be brought together and asked to rate a series of
documents for clarity, on a scale of 1 (very unclear) to 5 (very clear).
In the first round, I would expect substantial disagreement, especially
concerning longer and more complex documents.  Over time, and after
discussion in which the raters talked with each other about what they
were looking for, I would expect increasing agreement to emerge, even
about the longer and more complex documents.
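
To make the "80% or even 90% agreement" idea concrete, here is a minimal
sketch of one way such agreement might be measured for a single document.
Everything in it is an assumption for illustration: the function name, the
sample scores, and the choice of pairwise agreement as the measure are
hypothetical and are not part of any process the working group has adopted.

```python
from itertools import combinations

def pairwise_agreement(ratings, tolerance=0):
    """Fraction of rater pairs whose scores differ by at most `tolerance`.

    `ratings` holds one score per rater for the same document,
    on the 1 (very unclear) to 5 (very clear) scale.
    """
    pairs = list(combinations(ratings, 2))
    if not pairs:
        raise ValueError("need at least two ratings")
    agreeing = sum(1 for a, b in pairs if abs(a - b) <= tolerance)
    return agreeing / len(pairs)

# Hypothetical scores from five raters, before and after discussion.
round_one = [2, 4, 3, 5, 3]
round_two = [3, 4, 4, 4, 4]

print(pairwise_agreement(round_one))               # 0.1
print(pairwise_agreement(round_two))               # 0.6
print(pairwise_agreement(round_two, tolerance=1))  # 1.0
```

Whether "agreement" should mean identical scores or scores within one
point of each other is itself a judgment call; the tolerance parameter
only makes that choice explicit.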

There are a number of situations in which this sort of rating takes
place.  Portfolio-based assessment of student learning, for example,
depends heavily on the ability of multiple readers to agree on evidence
of learning demonstrated across a portfolio of student work. My
colleague, Peg Syverson, maintains an excellent Web site about the
Learning Record Online [1] that includes the complex scales used to
represent learning across five dimensions, as well as a very
comprehensive bibliography.

In the US, students seeking admission to university take a standardized
examination that includes a written component. The essays that students
produce in response to exam prompts are read by people who've been
trained in the "holistic" scoring techniques I mentioned.

[1] http://www.cwrl.utexas.edu/~syverson/olr/evaluation.html

"Good design is accessible design." 
Please note our new name and URL!
John Slatin, Ph.D.
Director, Accessibility Institute
University of Texas at Austin
FAC 248C
1 University Station G9600
Austin, TX 78712
ph 512-495-4288, f 512-495-4524
email jslatin@mail.utexas.edu
web http://www.utexas.edu/research/accessibility/


-----Original Message-----
From: w3c-wai-gl-request@w3.org [mailto:w3c-wai-gl-request@w3.org] On
Behalf Of Charles McCathieNevile
Sent: Sunday, May 02, 2004 1:56 am
To: Gregg Vanderheiden
Cc: w3c-wai-gl@w3.org
Subject: RE: Definition of human testability

Sorry, my point was more subtle. I understand the general principle of
agreeing on the results of tests. (This is why I participate in
EuroAccessibility under my Sidar hat - Sidar thinks it is very important
that we clarify and agree on how to test WCAG 1, at least across

But we often seem to be saying "for things that aren't machine testable,
we want to get consistent results from people". And I just wanted to
clarify that we expect all tests to be testable by people, with
consistent results, including those that are often automated.

For example, validation of HTML and XHTML code is something that
machines do more efficiently than people, and on average more
accurately. But there are bugs from time to time in validators, which a
group of people who know the relevant specification can all identify. In
such a case, I believe we want to say that the people are right and the
machine test is wrong. Otherwise it will be necessary to identify the
particular machine tests we trust, which I think will add about 2 years
to the timeline...



On Sat, 1 May 2004, Gregg Vanderheiden wrote:

>I'm not following you Charles.
>What this says - is that all success criteria must be reliably
>testable. That is, we can't have success criteria like "Write clearly"
>since 10 users would differ on what constituted 'clearly'.  The test
>cannot be more specific than the guideline, so all the testers could
>go on was their own training for what constituted 'clearly'.
>NOTE: that it is not yet clear whether all of the SC we have are
>specific enough to be reliably testable without referring to technology
>specific checklists.  But that is another discussion.  Hopefully we can
>make them specific enough in the doc.  What the consensus is though -
>is just that we should not have anything listed in the SC category that
>an author cannot reliably determine (or have determined) that they have
>Make sense now?  If not - then we need to figure out how to word it
> -- ------------------------------
>Gregg C Vanderheiden Ph.D.
>Professor - Ind. Engr. & BioMed Engr.
>Director - Trace R & D Center
>University of Wisconsin-Madison
>-----Original Message-----
>From: Charles McCathieNevile [mailto:charles@w3.org]
>Sent: Saturday, May 01, 2004 12:41 PM
>To: Gregg Vanderheiden
>Cc: w3c-wai-gl@w3.org
>Subject: RE: Definition of human testability
>This seems backwards. Presumably we believe that all tests will produce
>consistent results when done by reasonably knowledgeable people, with
>some of them also being sufficiently simple to automate completely.
>Otherwise we have no basis for deciding whether a particular test that 
>a tool does is in fact a valid one or not, and in the case of two 
>conflicting results from tools we would not have any way of declaring 
>which was accurate...
>On Thu, 29 Apr 2004, Gregg Vanderheiden wrote:
>>That is what is intended I believe.
>>Your alternative wording #1 is closest.   The word "certain" isn't
>>right since it would apply to all of the non-machine testable items
>>so it would become
>>1. In the judgment of the working group members, the success criteria
>>that are not machine testable can be tested by humans in a manner that
>>is capable of yielding consistent results among multiple knowledgeable
>>testers.
>> -- ------------------------------
>>Gregg C Vanderheiden Ph.D.
>>Professor - Ind. Engr. & BioMed Engr.
>>Director - Trace R & D Center
>>University of Wisconsin-Madison
>>  _____
>>From: w3c-wai-gl-request@w3.org [mailto:w3c-wai-gl-request@w3.org] On
>>Behalf Of Sailesh Panchang
>>Sent: Thursday, April 29, 2004 11:22 AM
>>To: w3c-wai-gl@w3.org
>>Subject: Definition of human testability
>>    Present draft: "Success criteria for all levels would be testable.
>>Some success criteria may be machine-testable. Others may require
>>human judgment. Success criteria that require human testing would, in 
>>the judgment of the working group members,  yield consistent results 
>>among multiple knowledgeable testers."
>>Wording of the last sentence is confusing. I believe what is meant is:
>>"Judgment of the working group members" applies to identification of
>>criteria that can be tested with  consistency  and reliability  by
>>Do we intend to list these tests?
>>Consider following alternatives:
>>1. In the judgment of the working group members, certain success
>>criteria can be tested by humans in a manner that is capable of yielding
>>consistent results among multiple knowledgeable testers.
>>2. Claims of conformance  to success criteria can be based on human 
>>testing if such testing has yielded or is capable of yielding
>>consistent results among multiple knowledgeable testers.
>>Sailesh Panchang
>>Senior Accessibility Engineer
>>Deque Systems,11180  Sunrise Valley Drive,
>>4th Floor, Reston VA 20191
>>Tel: 703-225-0380 Extension 105
>>E-mail: sailesh.panchang@deque.com
>>Fax: 703-225-0387
>>* Look up <http://www.deque.com> *
>Charles McCathieNevile  http://www.w3.org/People/Charles  tel: +61 409 134 136
>SWAD-E http://www.w3.org/2001/sw/Europe         fax(france): +33 4 92 38 78
> Post:   21 Mitchell street, FOOTSCRAY Vic 3011, Australia    or
> W3C, 2004 Route des Lucioles, 06902 Sophia Antipolis Cedex, France

Charles McCathieNevile  http://www.w3.org/People/Charles  tel: +61 409 134 136
SWAD-E http://www.w3.org/2001/sw/Europe         fax(france): +33 4 92 38 78 22
 Post:   21 Mitchell street, FOOTSCRAY Vic 3011, Australia    or
 W3C, 2004 Route des Lucioles, 06902 Sophia Antipolis Cedex, France
Received on Monday, 3 May 2004 09:57:37 UTC
