Re: Confidence claims in EARL assertions

Hi Shadi,

On Wed, 25.05.2005 at 12:07 +0200, Shadi Abou-Zahra wrote:
> * The Problem
> Due to a lack of precise definition of how to calculate the low, medium, or high values, 
> this property was not implemented by several tools and if implemented then usually inconsistently.

In our use case, low, medium and high would probably have limited value,
since the scale is too coarse.

> 
> * Requirements
> The following are two scenarios that represent cases in which it makes sense to be able to express 
> confidence in a result (but not necessarily using the current property definition).
> 
>  - Without Probability Value
> An evaluation tool allows users to define "word lists" to detect bad alt text. For example, an alt text 
> that is "Insert description of image" initiates a fail result automatically. However, this is not a 
> guarantee, it may be in fact an appropriate text. It is not possible to calculate a probability value 
> for the success of the test in such cases.

In principle, you could find a probability for a barrier as the number
of results that caused a barrier divided by the total number of alt
texts taken from a sufficiently large sample. You could not know in
advance what the result would be for a given test.

However, there are other reasons not to go down this route: this
probability would only be valid for the given language, and it may not be
trivial to specify a translation. Accessibility assessments should be
language independent if possible. Also, it is not viable to calculate
these probabilities for all possible "bad text" cases...
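
As a rough illustration of the frequency estimate described above (the
function name and the sample data are made up for this sketch and do not
come from any tool):

```python
def barrier_probability(caused_barrier):
    """Estimate P(barrier) as the share of sampled alt texts that caused a barrier."""
    if not caused_barrier:
        raise ValueError("sample must not be empty")
    # True counts as 1 and False as 0, so sum() gives the number of barriers.
    return sum(caused_barrier) / len(caused_barrier)

# Hypothetical sample: True means the flagged alt text really was a barrier.
sample = [True, True, False, True]
estimate = barrier_probability(sample)
```

A real estimate would need a far larger sample, and it would only hold for
the language the sample was taken from.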

>  - Probability Value Exists
> An algorithm can detect layout tables with a precision of 85%. This means for WCAG 1.0 CP 5.1 
> (row and column headers), the results are 85% probably accurate. The probability of 85% has been 
> calculated by the vendor using test files and comparing to real Web sites.

This could then again be combined with the probability for a barrier
given that the outcome was not a layout table, which can be found in the
same way as the alt example above.
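
One possible reading of this combination, as a sketch only: it assumes the
two probabilities can simply be multiplied, and that a table wrongly
classified as a data table never counts as a barrier. The function name
and the value 0.6 are hypothetical; only the 85% comes from the text above.

```python
def combined_barrier_probability(detector_precision, p_barrier_given_data_table):
    """Multiply the detector precision by P(barrier | table is not a layout table)."""
    for p in (detector_precision, p_barrier_given_data_table):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return detector_precision * p_barrier_given_data_table

# 0.85 is the precision quoted above; 0.6 is a purely hypothetical value.
combined = combined_barrier_probability(0.85, 0.6)
```

With these numbers the combined value comes out at roughly 0.51.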

> * Approaches
> There seem to be three main approaches to address confidence claims, I may be missing other solutions.
> 
>  - Support Probability Value
> Much of the previous discussion seems to be supportive of adding a mechanism to express probability 
> values (aka intervals etc) when they exist. This is only applicable to some tests.

To me it seems that probability is involved to quite a large extent.
However, there are cases where a barrier is absolute, and the disabled
user is not able to complete their task on the website due to this
accessibility problem.

>  - Extend Validity Values
> For tests that are known to be applicable and are conducted on a subject, only three further values 
> remain for the earl:validity property: earl:Pass, earl:Fail, or earl:CannotTell. It may be necessary 
> to extend these values to express different levels of Pass or Fail. Some tools already do that.

Tools that need to do this can subclass the earl:validity, so that is fine.

>  - Remove Confidence Claims
> To some extent, "confidence" belongs more to the test than the result. We can drop confidence claims 
> from EARL and leave it to a description language for the tests/test procedures. There are certainly 
> cases that are out of scope, for example what is the confidence level of manual results (ie confidence 
> in an evaluator)?

I agree with this. Maybe the confidence claims should rather be in a
description language for the tests, since they will be dependent on a
large number of tests, and may need verification by e.g. user testing,
which would have to be repeated regularly to catch changes in the
confidence level.

Best regards,
-- 
Nils Ulltveit-Moe <nils@u-moe.no>

Received on Wednesday, 25 May 2005 12:58:53 UTC