Confidence claims in EARL assertions from Shadi Abou-Zahra on 2005-05-25 (public-wai-ert@w3.org from May 2005)

From: Shadi Abou-Zahra <shadi@w3.org>
Date: Wed, 25 May 2005 12:07:53 +0200
To: public-wai-ert@w3.org
Message-ID: <42944E79.9040709@w3.org>

Dear Group,

There have been several discussions about confidence claims in EARL, most recently on yesterday's call. Please find below a summary of the problem and some of the approaches that have been discussed previously. I hope this helps focus the discussion and leads to a resolution soon:

* Current Schema
Currently EARL defines an earl:confidence property that can be one of earl:low, earl:medium, or earl:high. The rationale for this approach was to introduce some form of certainty to say "this test passes for sure, while that one is probably more a pass than a fail".

* The Problem
Because there is no precise definition of how to calculate the low, medium, or high values, several tools did not implement this property, and those that did usually implemented it inconsistently.

* Requirements
The following are two scenarios that represent cases in which it makes sense to be able to express confidence in a result (but not necessarily using the current property definition).

- Without Probability Value
An evaluation tool allows users to define "word lists" to detect bad alt text. For example, an alt text that is "Insert description of image" automatically triggers a fail result. However, this is not a guarantee; the text may in fact be appropriate. It is not possible to calculate a probability value for the success of the test in such cases.
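
A minimal sketch of such a word-list check, in Python. The list contents, the function name, and the normalization step are assumptions made for illustration; they are not taken from any particular tool:

```python
# Hypothetical word list; a real tool would let the user define this.
PLACEHOLDER_ALT_TEXTS = {
    "insert description of image",
    "image",
    "picture",
}


def check_alt_text(alt_text, word_list=PLACEHOLDER_ALT_TEXTS):
    """Return "fail" if the alt text matches a word-list entry, else "pass".

    The match is exact after lowercasing and collapsing whitespace
    (an assumed normalization). The outcome is a heuristic only: no
    probability can be attached to it, which is the point of this scenario.
    """
    normalized = " ".join(alt_text.lower().split())
    return "fail" if normalized in word_list else "pass"


for text in ("Insert description of image", "W3C logo"):
    print(text, "->", check_alt_text(text))
```

The check yields a bare pass or fail with no number behind it, so any confidence statement for this kind of test would have to be qualitative.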

- Probability Value Exists
An algorithm can detect layout tables with a precision of 85%. This means that for WCAG 1.0 CP 5.1 (row and column headers), a result is accurate with a probability of 85%. The vendor calculated this probability using test files and by comparing against real Web sites.
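
As a rough illustration of how a vendor might arrive at such a figure, the sketch below computes precision as the share of correct detections among all detections. The counts are invented to reproduce the 85% in the scenario, and the function name is an assumption:

```python
def precision(true_positives, false_positives):
    """Fraction of detections that were correct: TP / (TP + FP)."""
    detected = true_positives + false_positives
    if detected == 0:
        raise ValueError("no detections, so precision is undefined")
    return true_positives / detected


# Invented counts: of 100 tables flagged as layout tables in a test
# corpus, 85 really were layout tables and 15 were data tables.
p = precision(true_positives=85, false_positives=15)
print(f"precision = {p:.0%}")
```

A value obtained this way describes the test procedure as a whole, measured against a particular corpus, rather than any single result.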

* Approaches
There seem to be three main approaches to addressing confidence claims, though I may be missing other solutions.

- Support Probability Value
Much of the previous discussion seems to support adding a mechanism to express probability values (or intervals, etc.) when they exist. This is only applicable to some tests.

- Extend Validity Values
For tests that are known to be applicable and are conducted on a subject, only three values remain for the earl:validity property: earl:Pass, earl:Fail, or earl:CannotTell. It may be necessary to extend these values to express different levels of Pass or Fail. Some tools already do that.
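
One way to picture this approach is sketched below in Python. The intermediate values PROBABLE_PASS and PROBABLE_FAIL are hypothetical names chosen for illustration; they are not part of EARL. The sketch also assumes that consumers which only know the three existing values would treat the intermediate ones as CannotTell, which is only one possible choice:

```python
from enum import Enum


class Validity(Enum):
    """The three values named above for applicable, conducted tests."""
    PASS = "Pass"
    FAIL = "Fail"
    CANNOT_TELL = "CannotTell"


class ExtendedValidity(Enum):
    """Hypothetical extension with intermediate levels (not part of EARL)."""
    PASS = "Pass"
    PROBABLE_PASS = "ProbablePass"  # hypothetical name
    CANNOT_TELL = "CannotTell"
    PROBABLE_FAIL = "ProbableFail"  # hypothetical name
    FAIL = "Fail"


# Assumed fallback: intermediate levels collapse to CannotTell.
_COLLAPSE = {
    ExtendedValidity.PASS: Validity.PASS,
    ExtendedValidity.PROBABLE_PASS: Validity.CANNOT_TELL,
    ExtendedValidity.CANNOT_TELL: Validity.CANNOT_TELL,
    ExtendedValidity.PROBABLE_FAIL: Validity.CANNOT_TELL,
    ExtendedValidity.FAIL: Validity.FAIL,
}


def collapse(value):
    """Map an extended value back to one of the three existing values."""
    return _COLLAPSE[value]


print(collapse(ExtendedValidity.PROBABLE_FAIL).value)
```

Whatever the actual names would be, any extension of this kind needs an agreed mapping back to the existing values so that results from different tools remain comparable.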

- Remove Confidence Claims
To some extent, "confidence" belongs more to the test than to the result. We could drop confidence claims from EARL and leave them to a description language for the tests/test procedures. There are certainly cases that are out of scope; for example, what is the confidence level of manual results (i.e. confidence in an evaluator)?

Looking forward to your reactions and comments.

Regards,
Shadi

--
Shadi Abou-Zahra, Web Accessibility Specialist for Europe
World Wide Web Consortium (W3C), http://www.w3.org/
Web Accessibility Initiative (WAI), http://www.w3.org/WAI/
IST WAI-TIES Project (WAI-TIES) http://www.w3.org/WAI/TIES/
Evaluation and Repair Tools (ERT WG), http://www.w3.org/WAI/ER/
2004, Route des Lucioles BP93 - 06560 Sophia-Antipolis - France
Voice: +33(0)4 92 38 50 64 Fax: +33(0)4 92 38 78 22

Received on Wednesday, 25 May 2005 10:07:48 UTC