Re: Confidences in accessibility evaluation from Charles McCathieNevile on 2005-07-27 (public-wai-ert@w3.org from July 2005)

From: Charles McCathieNevile <chaals@opera.com>
Date: Wed, 27 Jul 2005 17:18:14 +0200
To: "Giorgio Brajnik" <giorgio@dimi.uniud.it>
Cc: public-wai-ert@w3.org
Message-ID: <op.sukvgooswxe0ny@pc099.qadoc.oslo.opera.com>

On Mon, 18 Jul 2005 17:36:34 +0200, Giorgio Brajnik  
<giorgio.brajnik@gmail.com> wrote:

> And would like to add simply that even choosing Low/Med/Hi requires
> that you devise a method for assigning those values to test outcomes;
> and also these values will depend on the tests themselves. I don't
> think that hiding the complexity under the Low/Med/Hi hood is
> compatible with the development of a sound model.

Agreed.

> Benchmarking tools using public test collections is one way to go, as
> is the use of private collections, as is the use of a sample of live
> website, in my opinion.

I think there are a number of different ways to choose confidence. Some  
will depend on the test being done, some others can be applied to a range  
of tests.

For example, Sidar might take Hera, and run its automated testing over  
1000 pages. We then ask a panel of 10 people to manually test the same  
points. We express the confidence as a number from 1 to 10, based on how  
many people agreed with Hera for a given test.
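As a rough sketch of that panel approach (the function name and the verdict strings are made up for illustration, they are not part of Hera or EARL), the confidence for one test on one page could be computed by counting agreement. Note that a plain count can come out as 0 if nobody agrees, so the scale is really 0 to 10 unless you decide otherwise.

```python
def panel_confidence(tool_verdict, panel_verdicts):
    """Return how many panel members reached the same verdict as the tool.

    tool_verdict: the automated result for one test on one page, e.g. "fail".
    panel_verdicts: one manual verdict per panel member for the same test.
    """
    return sum(1 for verdict in panel_verdicts if verdict == tool_verdict)


# Hypothetical data: the tool says "fail", 7 of 10 reviewers agree.
reviewers = ["fail"] * 7 + ["pass"] * 3
confidence = panel_confidence("fail", reviewers)
```

How the per-page counts would be aggregated over the 1000 pages (mean, median, worst case) is a separate choice that this sketch leaves open.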

Or you take the type of analysis Nils talks about. Or look at the work  
Giorgio has published in this area in the past. Or you have Nick and  
Chris' work (producing high/medium/low, as a rule).

It seems to me that the current H/M/L are not well-enough defined to  
provide any interoperability in general, and probably not even in  
particular cases.

I suggest, again, that we say that confidence values should include an  
identifier for how they were derived. Typically I think this is best  
achieved by a datatype, although it can also be done by people creating  
their own instance classes. The datatype approach has the benefit that  
brain-dead processors can just ignore the datatype and match  
high/medium/low, or a numerical scale (assuming it is the same size), on  
the basis that at a simplistic level that is probably useful. Extremely  
smart processors, meanwhile, can find a service that converts results from  
one datatype to another according to some well-defined rule.

And that we deprecate the EARL instance classes for high/medium/low.
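
To make the datatype idea a bit more concrete, here is a minimal sketch of the two kinds of processor. Everything in it is an assumption for illustration: the two datatype URIs are invented (neither is defined by EARL), the in-memory registry merely stands in for a conversion service, and the thresholds in the conversion rule are arbitrary.

```python
from dataclasses import dataclass

# Hypothetical datatype identifiers, invented for this example.
HML = "http://example.org/confidence#high-medium-low"
PANEL10 = "http://example.org/confidence#panel-agreement-10"


@dataclass(frozen=True)
class Confidence:
    value: str     # lexical form, e.g. "high" or "7"
    datatype: str  # identifies how the value was derived


def naive_match(a, b):
    """Brain-dead processor: ignore the datatype, compare lexical values."""
    return a.value == b.value


def panel_to_hml(c):
    """Assumed conversion rule; the thresholds are invented, not agreed."""
    n = int(c.value)
    level = "high" if n >= 8 else "medium" if n >= 4 else "low"
    return Confidence(level, HML)


# Stand-in for a conversion service, keyed by (source, target) datatype.
CONVERTERS = {(PANEL10, HML): panel_to_hml}


def smart_match(a, b):
    """Smart processor: convert a to b's datatype when a rule is known."""
    if a.datatype != b.datatype:
        convert = CONVERTERS.get((a.datatype, b.datatype))
        if convert is None:
            return False  # no known rule, so the values are not comparable
        a = convert(a)
    return a.value == b.value


panel_result = Confidence("9", PANEL10)
hml_result = Confidence("high", HML)
```

With these definitions, `naive_match(panel_result, hml_result)` is false because "9" and "high" differ lexically, while `smart_match(panel_result, hml_result)` is true once the panel count is converted under the assumed rule.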

cheers

Chaals

-- 
Charles McCathieNevile                              chaals@opera.com
          hablo español - je parle français - jeg lærer norsk
   Here's one we prepared earlier:   http://www.opera.com/download

Received on Wednesday, 27 July 2005 15:18:49 UTC