RE: Another comment about confidence value. from Paul Walsh on 2005-04-19 (public-wai-ert@w3.org from April 2005)

From: Paul Walsh <paul.walsh@segalamtest.com>
Date: Tue, 19 Apr 2005 10:32:01 +0100
To: "'Nils Ulltveit-Moe'" <nils@u-moe.no>
Cc: "'Charles McCathieNevile'" <charles@sidar.org>, "'Giorgio Brajnik'" <giorgio@dimi.uniud.it>, <public-wai-ert@w3.org>
Message-ID: <000301c544c2$a467fdf0$0200a8c0@PaulLaptop>

-----Original Message-----
From: public-wai-ert-request@w3.org
[mailto:public-wai-ert-request@w3.org] On Behalf Of Nils Ulltveit-Moe
Sent: 19 April 2005 10:16
To: Paul Walsh
Cc: 'Charles McCathieNevile'; 'Giorgio Brajnik'; public-wai-ert@w3.org
Subject: RE: Another comment about confidence value.

I appreciate that. With such a profile your testers would most probably
be quite confident in their decisions, and if you are 100% confident
that an accessibility issue is real, then the extra confidence value is
not needed. (i.e. the default value for confidence, if it is left out,
is 1).
[PW] Every 'validation' company needs to follow the same process
irrespective of experience. That way, the output of the 'team' will be
100% confident in their interpretation of the checkpoint passing or
failing. If they are not, then you have an issue with that company's
capabilities and/or understanding of the checkpoints.

> We use both manual and automated testing methods where the former
> outweighs the latter by a long way. If someone is less than certain
> about the output of their test they will always seek a second opinion
> from their colleagues. This is why it's absolutely necessary to have a
> team of auditors on any project. Each person's interpretation of an
> outcome is debated until they come to an agreement. The combined
> interpretation may not be 100% accurate if compared to that of a
> disabled user (or even someone outside the company), but at least they
> are 100% confident in the recorded defect.  Anything less than this is
> not good enough.

Yes, and this describes why you do not need the confidence parameter,
since it defaults to a probability of 1 (or 100%).

We are doing quite different measurements. We will be trying to do
automatic assessments of a large number of sites (several thousand)
regularly. We will need to do some manual testing, and will base our
tests largely on automatic assessments. In our case we need to base
ourselves on probability theory and best practices in statistics to reach
numbers that approximate the perceived accessibility over a large number
of assessments, to make it feasible.
[PW] This will not be accurate and I would question the process itself
of using automation for the majority of your validation.
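
To make the two positions above concrete, here is a minimal sketch of how a
confidence value that defaults to 1 could feed an aggregate over many
assessments. It is an illustration only: the class and function names are
hypothetical and not part of EARL, and reading the confidence as the
probability that a reported failure is real is an assumption, not something
the thread specifies.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Assertion:
    """One test outcome. Hypothetical structure, not an EARL class."""
    failed: bool
    # If the confidence is left out it defaults to 1 (i.e. 100%),
    # which matches the manual-audit case described in the thread.
    confidence: float = 1.0


def expected_failures(assertions: Iterable[Assertion]) -> float:
    """Expected number of real failures.

    Assumption: each confidence is the probability that the reported
    failure is real, so the expected count is the sum of confidences.
    """
    return sum(a.confidence for a in assertions if a.failed)


def expected_failure_rate(assertions: Iterable[Assertion]) -> float:
    """Expected failures divided by the number of assertions made."""
    items = list(assertions)
    if not items:
        return 0.0
    return expected_failures(items) / len(items)


if __name__ == "__main__":
    results = [
        Assertion(failed=True),                  # manual audit, confidence omitted
        Assertion(failed=True, confidence=0.5),  # automatic check, uncertain
        Assertion(failed=True, confidence=0.5),  # automatic check, uncertain
        Assertion(failed=False),                 # passed
    ]
    print(expected_failures(results))
    print(expected_failure_rate(results))
```

With every confidence left at the default, the expected count is simply the
number of recorded failures, so a fully manual audit is unaffected; the value
only changes the result when automatic checks report less than certainty.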

Regards,
-- 
Nils Ulltveit-Moe <nils@u-moe.no>

Received on Tuesday, 19 April 2005 09:32:16 UTC