- From: Laura Carlson <laura.lee.carlson@gmail.com>
- Date: Thu, 2 Jun 2016 13:16:24 -0500
- To: Andrew Kirkpatrick <akirkpat@adobe.com>, "White, Jason J" <jjwhite@ets.org>
- Cc: David MacDonald <david100@sympatico.ca>, "josh@interaccess.ie" <josh@interaccess.ie>, "lisa.seeman" <lisa.seeman@zoho.com>, "Patrick H. Lauke" <redux@splintered.co.uk>, "w3c-wai-gl@w3.org" <w3c-wai-gl@w3.org>
Hi Andrew, Jason, and All,

I checked some of the comment, minutes, and email archives for "High Inter-Rater Reliability" and "8 out of 10" or "80%". The following is what I found. Hope it is helpful.

<quote>
John M Slatin, 27 Apr 2005: "The literature seems to support the 80 per cent figure. In fact, inter-rater reliability (percentage of agreement among multiple people rating the same items) of 80% is considered "adequate," but 85% is considered better. I think we're safe in using the 80 percent figure. If we go lower than that it will be difficult to claim reliability."
</quote>
https://lists.w3.org/Archives/Public/w3c-wai-gl/2005AprJun/0273.html

27 April 2006 Working Draft:
<quote>
"All WCAG 2.0 success criteria are testable. While some can be tested by computer programs, others must be tested by qualified human testers. Sometimes, a combination of computer programs and qualified human testers may be used. When people who understand WCAG 2.0 test the same content using the same success criteria, the same results should be obtained with high inter-rater reliability."
</quote>
https://www.w3.org/TR/2006/WD-WCAG20-20060427/conformance.html

Comment LC-879 from Christophe Strobbe:
<quote>
"Please define or point to criteria for 'high inter-rater reliability'..."

Resolution: "Inter-rater reliability is a tougher standard than test-retest. We no longer use this term in WCAG 2.0. Instead, we have revised this section to say "The same results should be obtained with a high level of confidence when people who understand how people with different types of disabilities use the Web test the same content."
</quote>
Source: https://www.w3.org/2006/02/lc-comments-tracker/35422/wcag20-lc/879

Comment LC-1267 from Andrew Arch:
<quote>
para 4 - "When people who understand WCAG 2.0 test the same content using the same success criteria, the same results should be obtained with high inter-rater reliability".
More than just an understanding of WCAG 2.0 is required - these people also need an understanding of how PWD interact with the web, with or without assistive technologies.

Proposed Change: add something extra to the qualifications that WCAG 2.0 testers are required to have to obtain the same results. Also suggest changing "high inter-rater reliability" to "high inter-tester reliability"

Resolution: "The qualifications to be a WCAG 2.0 tester are not formalized, and the quantification of knowledge skills and abilities to do so is beyond the scope of this document. We do agree that a qualified individual should have background in disability and not just the web. We have revised the conformance section significantly since the April 2006 working draft. The sentence related to your comment, in http://www.w3.org/TR/2007/WD-WCAG20-20070517/#overview-sc, now reads: "The same results should be obtained with a high level of confidence when people who understand how people with different types of disabilities use the Web test the same content."
</quote>
Source: https://www.w3.org/2006/02/lc-comments-tracker/35422/wcag20-lc/1267

More references where 8 out of 10 and 80% are discussed:
https://www.w3.org/WAI/GL/2001/11/29-minutes.html
https://lists.w3.org/Archives/Public/w3c-wai-gl/2001OctDec/0439.html
https://lists.w3.org/Archives/Public/w3c-wai-gl/2001OctDec/0440.html
https://www.w3.org/WAI/GL/2004/04/22-minutes.html
https://www.w3.org/WAI/GL/2003/10/24-minutes.html
https://lists.w3.org/Archives/Public/w3c-wai-gl/2005AprJun/0293.html
https://lists.w3.org/Archives/Public/w3c-wai-gl/2004AprJun/0533.html

Kindest Regards,
Laura

On 6/2/16, White, Jason J <jjwhite@ets.org> wrote:
>
> From: Andrew Kirkpatrick [mailto:akirkpat@adobe.com]
>
> Jason,
> Can you show me where the 8 of 10 is documented as official policy of the
> group for the 2.0 document?
>
> I can’t remember, but others might.
> The acceptance requirements for success criteria were agreed upon, and high
> inter-rater reliability was fundamental to them. I’m not sure where “8 out
> of 10” comes from and I don’t remember those numbers specifically. More
> important than any numbers is the idea that in order to be accepted into the
> specification, a success criterion must have “high inter-rater reliability”,
> in the best informed judgment of the working group. I’m less concerned about
> any numerical values here.
>
> ________________________________
>
> This e-mail and any files transmitted with it may contain privileged or
> confidential information. It is solely for use by the individual for whom it
> is intended, even if addressed incorrectly. If you received this e-mail in
> error, please notify the sender; do not disclose, copy, distribute, or take
> any action in reliance on the contents of this information; and delete it
> from your system. Any other use of this e-mail is prohibited.
>
> Thank you for your compliance.
>
> ________________________________

--
Laura L. Carlson
Received on Thursday, 2 June 2016 18:16:58 UTC