- From: John M Slatin <john_slatin@austin.utexas.edu>
- Date: Wed, 27 Apr 2005 14:03:37 -0500
- To: "John M Slatin" <john_slatin@austin.utexas.edu>, <w3c-wai-gl@w3.org>
- Message-ID: <6EED8F7006A883459D4818686BCE3B3B7AE322@MAIL01.austin.utexas.edu>
Correction: I wrote:

<blockquote>
The literature seems to support the 80 per cent figure. In fact, inter-rater reliability (percentage of agreement among multiple people rating the same items) is considered "adequate," but 85% is considered better. I think we're safe in using the 80 percent figure. If we go lower than that it will be difficult to claim reliability.
</blockquote>

Sorry, that should have been:

<corrected>
The literature seems to support the 80 per cent figure. In fact, inter-rater reliability (percentage of agreement among multiple people rating the same items) of 80% is considered "adequate," but 85% is considered better. I think we're safe in using the 80 percent figure. If we go lower than that it will be difficult to claim reliability.
</corrected>

"Good design is accessible design."
John Slatin, Ph.D.
Director, Accessibility Institute
University of Texas at Austin
FAC 248C
1 University Station G9600
Austin, TX 78712
ph 512-495-4288, f 512-495-4524
email jslatin@mail.utexas.edu
web http://www.utexas.edu/research/accessibility/

-----Original Message-----
From: w3c-wai-gl-request@w3.org [mailto:w3c-wai-gl-request@w3.org] On Behalf Of John M Slatin
Sent: Wednesday, April 27, 2005 1:54 pm
To: w3c-wai-gl@w3.org
Subject: [Techs] Definition of "Reliably human testable"

On the Techniques call today we discussed the proposed definition of the term "reliably human testable":

<proposed>
[Definition: Reliably Human Testable: The technique can be tested by human inspection and it is believed that at least 80% of knowledgeable human evaluators would agree on the conclusion. Tests done by people who understand the guidelines should get the same results testing the same content for the same success criteria. The use of probabilistic machine algorithms may facilitate the human testing process but this does not make it machine testable.]
</proposed>

Someone on the call asked whether the 80 percent figure represented an arbitrary number. I took an action item to find out and report back. With terrific help from David Macdonald, I've got an answer:

The literature seems to support the 80 per cent figure. In fact, inter-rater reliability (percentage of agreement among multiple people rating the same items) is considered "adequate," but 85% is considered better. I think we're safe in using the 80 percent figure. If we go lower than that it will be difficult to claim reliability.

John
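To make the 80 percent threshold concrete, here is a minimal sketch of one way the agreement figure could be computed for a single test. It assumes that "agreement" means the share of evaluators who reach the most common conclusion; the message does not specify a formula, so that reading, the function names, and the sample verdicts are all illustrative assumptions rather than anything the working group defined.

```python
from collections import Counter


def agreement_share(verdicts):
    """Return the fraction of evaluators who gave the most common verdict.

    Assumption: agreement is measured against the majority conclusion
    for one test applied to one piece of content.
    """
    if not verdicts:
        raise ValueError("at least one verdict is required")
    most_common_count = Counter(verdicts).most_common(1)[0][1]
    return most_common_count / len(verdicts)


def is_reliably_human_testable(verdicts, threshold=0.80):
    """Check the agreement share against the proposed 80% threshold."""
    return agreement_share(verdicts) >= threshold


# Hypothetical data: ten evaluators test the same content
# against the same success criterion.
verdicts = ["pass"] * 8 + ["fail"] * 2
print(agreement_share(verdicts))             # 0.8
print(is_reliably_human_testable(verdicts))  # True
```

Under this reading, eight matching verdicts out of ten just meet the proposed threshold, while seven out of ten would fall short.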
Received on Wednesday, 27 April 2005 19:03:43 UTC