- From: Charles McCathieNevile <chaals@opera.com>
- Date: Wed, 27 Jul 2005 17:18:14 +0200
- To: "Giorgio Brajnik" <giorgio@dimi.uniud.it>
- Cc: public-wai-ert@w3.org
On Mon, 18 Jul 2005 17:36:34 +0200, Giorgio Brajnik
<giorgio.brajnik@gmail.com> wrote:
> And would like to add simply that even choosing Low/Med/Hi requires
> that you devise a method for assigning those values to test outcomes;
> and also these values will depend on the tests themselves. I don't
> think that hiding the complexity under the Low/Med/Hi hood is
> compatible with the development of a sound model.
Agreed.
> Benchmarking tools using public test collections is one way to go, as
> is the use of private collections, as is the use of a sample of live
> website, in my opinion.
I think there are a number of different ways to choose confidence. Some
will depend on the test being done, some others can be applied to a range
of tests.
For example, Sidar might take Hera, and run its automated testing over
1000 pages. We then ask a panel of 10 people to manually test the same
points. We express the confidence as a number from 1 to 10, based on how
many people agreed with Hera for a given test.
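
As a rough sketch of the panel idea above: the function name, the outcome strings and the sample data are illustrative assumptions, not anything taken from Hera or EARL. The confidence for one test on one page is simply the number of panel members whose manual result matches the tool's result.

```python
def panel_confidence(tool_result, panel_results):
    """Count how many panel members reached the same outcome as the tool.

    tool_result   -- outcome reported by the automated tool (hypothetical labels)
    panel_results -- list of outcomes, one per human evaluator
    """
    return sum(1 for result in panel_results if result == tool_result)


# Made-up data for a single test on a single page.
tool = "fail"
panel = ["fail", "fail", "pass", "fail", "fail",
         "fail", "cannot tell", "fail", "fail", "fail"]

print(panel_confidence(tool, panel))  # 8
```

A raw count like this can be 0 when nobody agrees with the tool, so a scale that really starts at 1 would need a slightly different mapping.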
Or you take the type of analysis Nils talks about. Or look at the work
Giorgio has published in this area in the past. Or you have Nick and
Chris' work (producing high/medium/low, as a rule).
It seems to me that the current H/M/L are not well-enough defined to
provide any interoperability in general, and probably not even in
particular cases.
I suggest, again, that we say that confidence values should include an
identifier for how they were derived. Typically I think this is best
achieved by a datatype, although it can also be done by people creating
their own instance classes. The datatype approach has the benefit that
brain-dead processors can just ignore the datatype and match
high/medium/low, or a numerical scale (assuming it is the same size) on
the basis that at a simplistic level that is probably useful, while
extremely smart processes can find a service that converts results from
one datatype to another according to some well-defined rule.
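
A minimal sketch of how the datatype approach might behave, under stated assumptions: the datatype URIs, the class and function names, and the conversion thresholds are all hypothetical and are not part of EARL. Each confidence value carries an identifier for how it was derived; a simple processor ignores that identifier, while a smarter one looks up a conversion rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confidence:
    value: str     # lexical form, e.g. "high" or "8"
    datatype: str  # identifier for how the value was derived


# Hypothetical datatype identifiers, for illustration only.
HML = "http://example.org/confidence#highMediumLow"
PANEL10 = "http://example.org/confidence#panelOfTen"


def naive_match(a, b):
    """Simplistic processor: ignore the datatype, compare lexical values."""
    return a.value == b.value


def panel_to_hml(value):
    """Hypothetical rule mapping a 0-10 agreement count to high/medium/low.

    The thresholds are arbitrary and only show the shape of such a rule.
    """
    n = int(value)
    if n >= 8:
        return "high"
    if n >= 4:
        return "medium"
    return "low"


# Registry of conversion rules, keyed by (source datatype, target datatype).
CONVERTERS = {(PANEL10, HML): panel_to_hml}


def convert(conf, target):
    """Smarter processor: convert between datatypes if a rule is known."""
    if conf.datatype == target:
        return conf
    rule = CONVERTERS.get((conf.datatype, target))
    if rule is None:
        raise LookupError("no conversion rule for this pair of datatypes")
    return Confidence(rule(conf.value), target)


print(convert(Confidence("8", PANEL10), HML).value)  # high
```

The point of the registry is that the conversion rule is named and replaceable; two tools that disagree about the thresholds would publish different datatypes rather than silently sharing the label "high".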
And that we deprecate the EARL instance classes for high/medium/low.
cheers
Chaals
--
Charles McCathieNevile chaals@opera.com
hablo español - je parle français - jeg lærer norsk
Here's one we prepared earlier: http://www.opera.com/download
Received on Wednesday, 27 July 2005 15:18:49 UTC