Re: Re[2]: acceptance criteria for new success criteria

Hi Andrew, Jason, and All,

I checked some of the comment, minutes, and email archives for "High
Inter-Rater Reliability" and "8 out of 10" or "80%". The following is
what I found. I hope it is helpful.

John M Slatin, 27 Apr 2005:
<quote>
"The literature seems to support the 80 per cent figure.  In fact,
inter-rater reliability (percentage of agreement among multiple people
rating the same items) of 80% is considered "adequate," but 85% is
considered better. I think we're safe in using the 80 percent figure.
If we go lower than that it will be difficult to claim reliability."
</quote>
https://lists.w3.org/Archives/Public/w3c-wai-gl/2005AprJun/0273.html
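
To make the arithmetic behind that figure concrete, here is a minimal
sketch of "percentage of agreement among multiple people rating the same
items". The archives I checked do not spell out a formula, so averaging
agreement over every pair of raters is my assumption, and the function
name, the "pass"/"fail" labels, and the 0.80 default threshold are
hypothetical choices for illustration only.

```python
from itertools import combinations


def percent_agreement(ratings):
    """Average pairwise agreement across raters.

    `ratings` is a list with one inner list per rater; each inner list
    holds that rater's verdict for the same items, in the same order.
    Pairwise averaging is an assumption; the thread gives no formula.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two raters")
    n_items = len(ratings[0])
    if n_items == 0 or any(len(r) != n_items for r in ratings):
        raise ValueError("every rater must rate the same non-empty item set")

    pairs = list(combinations(ratings, 2))
    # Count matching verdicts item by item for every pair of raters.
    matches = sum(a == b for r1, r2 in pairs for a, b in zip(r1, r2))
    return matches / (len(pairs) * n_items)


def meets_threshold(ratings, threshold=0.80):
    """True when agreement reaches the (assumed) 80% cut-off."""
    return percent_agreement(ratings) >= threshold


if __name__ == "__main__":
    # Two hypothetical testers who agree on 8 of 10 items.
    rater_a = ["pass"] * 8 + ["fail"] * 2
    rater_b = ["pass"] * 10
    print(percent_agreement([rater_a, rater_b]))  # 0.8
    print(meets_threshold([rater_a, rater_b]))    # True
```

Raw percentage agreement does not correct for agreement that happens by
chance, so please read this only as an illustration of the 80% and 85%
numbers quoted above, not as a method the working group adopted.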

27 April 2006 Working Draft:
<quote>
"All WCAG 2.0 success criteria are testable. While some can be tested
by computer programs, others must be tested by qualified human
testers. Sometimes, a combination of computer programs and qualified
human testers may be used. When people who understand WCAG 2.0 test
the same content using the same success criteria, the same results
should be obtained with high inter-rater reliability."
</quote>
https://www.w3.org/TR/2006/WD-WCAG20-20060427/conformance.html

Comment LC-879 from Christophe Strobbe:
<quote>
"Please define or point to criteria for 'high inter-rater reliability'..."

Resolution:
"Inter-rater reliability is a tougher standard than test-retest.

We no longer use this term in WCAG 2.0. Instead, we have revised this
section to say "The same results should be obtained with a high level
of confidence when people who understand how people with different
types of disabilities use the Web test the same content."
</quote>
Source:
https://www.w3.org/2006/02/lc-comments-tracker/35422/wcag20-lc/879

Comment LC-1267 from Andrew Arch:
<quote>
para 4 - "When people who understand WCAG 2.0 test the same content
using the same success criteria, the same results should be obtained
with high inter-rater reliability". More than just an understanding of
WCAG 2.0 is required - these people also need an understanding of how
PWD interact with the web, with or without assistive technologies.

Proposed Change:

add something extra to the qualifications that WCAG 2.0 testers are
required to have to obtain the same results.

Also suggest changing "high inter-rater reliability" to "high
inter-tester reliability"

Resolution:
"The qualifications to be a WCAG 2.0 tester are not formalized, and
the quantification of knowledge skills and abilities to do so is
beyond the scope of this document.  We do agree that a qualified
individual should have background in disability and not just the web.

We have revised the conformance section significantly since the April
2006 working draft. The sentence related to your comment, in
http://www.w3.org/TR/2007/WD-WCAG20-20070517/#overview-sc, now reads:

"The same results should be obtained with a high level of confidence
when people who understand how people with different types of
disabilities use the Web test the same content."
</quote>
Source:
https://www.w3.org/2006/02/lc-comments-tracker/35422/wcag20-lc/1267

More references where "8 out of 10" and "80%" are discussed:

https://www.w3.org/WAI/GL/2001/11/29-minutes.html
https://lists.w3.org/Archives/Public/w3c-wai-gl/2001OctDec/0439.html
https://lists.w3.org/Archives/Public/w3c-wai-gl/2001OctDec/0440.html
https://www.w3.org/WAI/GL/2004/04/22-minutes.html
https://www.w3.org/WAI/GL/2003/10/24-minutes.html
https://lists.w3.org/Archives/Public/w3c-wai-gl/2005AprJun/0293.html
https://lists.w3.org/Archives/Public/w3c-wai-gl/2004AprJun/0533.html

Kindest Regards,
Laura

On 6/2/16, White, Jason J <jjwhite@ets.org> wrote:
>
>
> From: Andrew Kirkpatrick [mailto:akirkpat@adobe.com]
>
> Jason,
> Can you show me where the 8 of 10 is documented as official policy of the
> group for the 2.0 document?
>
> I can’t remember, but others might. The acceptance requirements for success
> criteria were agreed upon, and high inter-rater reliability was fundamental
> to them. I’m not sure where “8 out of 10” comes from and I don’t remember
> those numbers specifically. More important than any numbers is the idea that
> in order to be accepted into the specification, a success criterion must
> have “high inter-rater reliability”, in the best informed judgment of the
> working group. I’m less concerned about any numerical values here.
>


-- 
Laura L. Carlson

Received on Thursday, 2 June 2016 18:16:58 UTC