Re: Confidence Claims - more discussion

Hi Nick,

Nick Kew wrote:
>>1. Even though the confidence claims are more related to test case
>>descriptions than to the results, expressing them as part of the result
>>is still an important aspect. Several tools are already using this
>>property and would like to continue doing so.
> 
> 
> Indeed.
> 
> Whether confidence is a property of the test or the result is up to the
> tool.  For example, take the classic case of ALT attributes.
> 
> Tool 1 has one test for alt attributes.  It reports a violation with
> high confidence if there is no ALT, or if the ALT is from a list of
> bogosity-detector keywords like "bullet" or "spacer".  It reports
> with a lower confidence if the ALT contains one of those words in
> a phrase (parsing "small red bullet" vs "inserting the bullet"
> is a bit too ambitious), or if the ALT ends with a suspicious
> string like ".gif" or ".jpe?g".  In this tool, the confidence is
> a property of the result.
> 
> Tool 2 has a series of different tests for alts.  Overall it tests
> the same things as Tool 1, but each test has only a single yes/no
> result.  Here confidence is a property of the test.  The test that
> flags "bullet" has a higher confidence than the test that flags
> "small red bullet".  But although confidence is defined as a
> property of the test, it can also be expressed as a property
> of the result.
> 
> My own Valet tool works with small and simple tests, as described
> for Tool 2.  I'm not sure where Chris's tool fits.  But we should
> be able to accommodate either case in EARL.  Making confidence a
> property of the result works for both cases; making it a property
> of testcase could be problematic for Tool 1.
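
To make the two designs concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not the code of Valet or of any other tool: the function names, the test names, and the exact keyword list are hypothetical, and only "bullet", "spacer", ".gif" and ".jpe?g" come from the description above.

```python
import re

# Hypothetical keyword list; the thread names only "bullet" and "spacer".
BOGUS_WORDS = {"bullet", "spacer"}
# Suspicious endings mentioned in the thread: ".gif" or ".jpe?g".
FILENAME_SUFFIX = re.compile(r"\.(gif|jpe?g)$", re.IGNORECASE)


def is_bogus_word(alt):
    """True if the whole ALT text is one of the keywords."""
    return alt.strip().lower() in BOGUS_WORDS


def contains_bogus_word(alt):
    """True if any word of the ALT text is one of the keywords."""
    return bool(set(re.findall(r"[a-z]+", alt.lower())) & BOGUS_WORDS)


def looks_like_filename(alt):
    """True if the ALT text ends like an image file name."""
    return bool(FILENAME_SUFFIX.search(alt.strip()))


def tool1_check_alt(alt):
    """Tool 1 style: one test; confidence is chosen per result."""
    if alt is None or is_bogus_word(alt):
        return ("fail", "high")
    if contains_bogus_word(alt) or looks_like_filename(alt):
        return ("fail", "low")
    # The thread does not say what confidence a pass carries;
    # None is only a placeholder.
    return ("pass", None)


# Tool 2 style: several yes/no tests, each with a fixed confidence.
TOOL2_TESTS = [
    ("alt-missing", "high", lambda alt: alt is None),
    ("alt-is-bogus-word", "high",
     lambda alt: alt is not None and is_bogus_word(alt)),
    ("alt-contains-bogus-word", "low",
     lambda alt: alt is not None
     and not is_bogus_word(alt) and contains_bogus_word(alt)),
    ("alt-looks-like-filename", "low",
     lambda alt: alt is not None and looks_like_filename(alt)),
]


def tool2_check_alt(alt):
    """Run every test; report each failure with its test's confidence."""
    return [(name, "fail", conf)
            for name, conf, flags in TOOL2_TESTS if flags(alt)]
```

In both cases the confidence ends up attached to an individual result, which is why a result-level property can serve either design.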

So, could one say that earl:confidence values are currently used to express different levels of the pass/fail values? For example, to express the following:

"This test failed for sure" -> validity=fail, confidence=high
"This test probably passes" -> validity=pass, confidence=medium
"This test may not be applicable" -> validity=NA, confidence=low

If so, do we want to continue doing that through the validity/confidence pair, or would we rather introduce more granularity for validity (for example earl:ProbablyPasses as a subclass of earl:fail)?
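
The two options can be compared with a small sketch in Python. This is only an illustration under assumptions: the enum and dictionary names are hypothetical, and of the single-value names only ProbablyPasses appears in this message; "Fails" and "MayNotApply" are made up here, and the sketch does not model any class hierarchy.

```python
from enum import Enum
from itertools import product


class Validity(Enum):
    PASS = "pass"
    FAIL = "fail"
    NA = "NA"


class Confidence(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


# Option A: keep the validity/confidence pair.
PAIRS = {
    "This test failed for sure": (Validity.FAIL, Confidence.HIGH),
    "This test probably passes": (Validity.PASS, Confidence.MEDIUM),
    "This test may not be applicable": (Validity.NA, Confidence.LOW),
}

# Option B: one finer-grained validity value per combination.
# Only "ProbablyPasses" comes from the message; the rest are hypothetical.
SINGLE_VALUES = {
    (Validity.FAIL, Confidence.HIGH): "Fails",
    (Validity.PASS, Confidence.MEDIUM): "ProbablyPasses",
    (Validity.NA, Confidence.LOW): "MayNotApply",
}


def to_single_value(validity, confidence):
    """Return the single-value name for a pair, or None if none is defined."""
    return SINGLE_VALUES.get((validity, confidence))


# Every combination the pair can express without new vocabulary.
ALL_PAIRS = list(product(Validity, Confidence))
```

With three validity values and three confidence levels the pair covers nine combinations, whereas the single-value approach needs a named term for each combination it wants to express.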


Regards,
  Shadi


-- 
Shadi Abou-Zahra,       Web Accessibility Specialist for Europe 
World Wide Web Consortium (W3C),             http://www.w3.org/ 
Web Accessibility Initiative (WAI),      http://www.w3.org/WAI/ 
IST WAI-TIES Project (WAI-TIES)     http://www.w3.org/WAI/TIES/ 
Evaluation and Repair Tools (ERT WG), http://www.w3.org/WAI/ER/ 
2004, Route des Lucioles BP93 - 06560 Sophia-Antipolis - France 
Voice: +33(0)4 92 38 50 64             Fax: +33(0)4 92 38 78 22 

Received on Wednesday, 1 June 2005 09:29:18 UTC