- From: Charles McCathieNevile <chaals@opera.com>
- Date: Wed, 27 Jul 2005 17:18:14 +0200
- To: "Giorgio Brajnik" <giorgio@dimi.uniud.it>
- Cc: public-wai-ert@w3.org
On Mon, 18 Jul 2005 17:36:34 +0200, Giorgio Brajnik <giorgio.brajnik@gmail.com> wrote:

> And would like to add simply that even choosing Low/Med/Hi requires
> that you devise a method for assigning those values to test outcomes;
> and also these values will depend on the tests themselves. I don't
> think that hiding the complexity under the Low/Med/Hi hood is
> compatible with the development of a sound model.

Agreed.

> Benchmarking tools using public test collections is one way to go, as
> is the use of private collections, as is the use of a sample of live
> website, in my opinion.

I think there are a number of different ways to choose confidence. Some will depend on the test being done; others can be applied to a range of tests.

For example, Sidar might take Hera and run its automated testing over 1000 pages. We then ask a panel of 10 people to manually test the same points. We express the confidence as a number from 1 to 10, based on how many people agreed with Hera for a given test.

Or you take the type of analysis Nils talks about. Or look at the work Giorgio has published in this area in the past. Or you have Nick and Chris' work (producing high/medium/low, as a rule).

It seems to me that the current H/M/L are not defined well enough to provide any interoperability in general, and probably not even in particular cases.

I suggest, again, that we say that confidence values should include an identifier for how they were derived. Typically I think this is best achieved by a datatype, although it can also be done by people creating their own instance classes. The datatype approach has the benefit that brain-dead processors can just ignore the datatype and match high/medium/low, or a numerical scale (assuming it is the same size), on the basis that at a simplistic level that is probably useful, while extremely smart processors can find a service that converts results from one datatype to another according to some well-defined rule.
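
To make the datatype idea concrete, here is a minimal sketch in Python rather than in RDF. Everything named in it is an assumption for illustration only: the two example.org identifiers, the Confidence class, and the thresholds in the conversion rule are hypothetical and not part of EARL. It counts panel agreement as in the Hera example (note that a raw count can also be 0), tags the value with an identifier for how it was derived, and shows a processor that ignores the datatype next to one that converts by an explicit rule.

```python
from dataclasses import dataclass
from typing import Sequence, Union

# Hypothetical identifiers for how a confidence value was derived.
# These URIs are made up for this sketch; EARL does not define them.
PANEL_AGREEMENT_10 = "http://example.org/confidence#panelAgreement10"
HIGH_MEDIUM_LOW = "http://example.org/confidence#highMediumLow"


@dataclass(frozen=True)
class Confidence:
    value: Union[int, str]  # e.g. 7, or "medium"
    datatype: str           # identifies the derivation method


def panel_confidence(tool_result: bool, panel_results: Sequence[bool]) -> Confidence:
    """Count how many panel members agreed with the tool on one test."""
    agreed = sum(1 for r in panel_results if r == tool_result)
    return Confidence(agreed, PANEL_AGREEMENT_10)


def naive_value(c: Confidence) -> Union[int, str]:
    """A simple processor ignores the datatype and uses the raw value."""
    return c.value


def to_high_medium_low(c: Confidence) -> Confidence:
    """A smarter processor converts between datatypes by a stated rule.

    The thresholds (8 and 4) are arbitrary and only illustrate the idea.
    """
    if c.datatype == HIGH_MEDIUM_LOW:
        return c
    if c.datatype == PANEL_AGREEMENT_10:
        if c.value >= 8:
            label = "high"
        elif c.value >= 4:
            label = "medium"
        else:
            label = "low"
        return Confidence(label, HIGH_MEDIUM_LOW)
    raise ValueError(f"no conversion rule for datatype {c.datatype}")


# Usage: 7 of 10 panel members agree with the tool's "pass" result.
panel = [True] * 7 + [False] * 3
score = panel_confidence(True, panel)
converted = to_high_medium_low(score)
```

The point of carrying the datatype alongside the value is that two tools reporting "7" or "medium" can be compared only when their derivation identifiers match, or when a conversion rule between them is known.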

I also suggest that we deprecate the EARL instance classes for high/medium/low.

cheers

Chaals

-- 
Charles McCathieNevile chaals@opera.com
hablo español - je parle français - jeg lærer norsk
Here's one we prepared earlier: http://www.opera.com/download

Received on Wednesday, 27 July 2005 15:18:49 UTC