Re: Confidence Claims - more discussion from Shadi Abou-Zahra on 2005-06-01 (public-wai-ert@w3.org from June 2005)

From: Shadi Abou-Zahra <shadi@w3.org>
Date: Wed, 01 Jun 2005 22:50:11 +0200
To: Nick Kew <nick@webthing.com>
Cc: public-wai-ert@w3.org
Message-ID: <429E1F83.6050007@w3.org>
Hi,


Nick Kew wrote:
>>>If so, then do we want to continue doing that through the 
>>>validity/confidence pair or do we rather want to introduce 
>>>more granularity for validity (for example earl:ProbablyPasses 
>>>as a subclass of earl:fail)?
>>
>>
>>IMO it could be the good way:
>>
>>Pass --> "pass high"
>>ProbablyPass --> "pass medium"
>>CannotTell --> "pass low" or "fail low"
>>ProbablyFail --> "fail medium"
>>Fail --> "fail high"
> 
> 
> At the per-page level, that's what Valet gives you.  It's based on
> an aggregation of all tests performed on the page and indicating
> a possible fail.
> 
> Within a page, it's different.  The basic principle is, it's asserting
> "This page [definitely|probably|maybe] (passes|fails), and here are the
> details of things that [may] cause it to fail."  The details are the
> results of the individual tests, each of which is associated with the
> appropriate node in the markup.

Sorry, I still don't understand why this is a different use case. Let me summarize:

1. We have a set of atomic tests such as "check if page has <blockquote> element".

2. We have a set of aggregations such as "if tests 1 and 2 pass then CP X passes".

In both cases we will *sometimes* be able to give definitive Pass/Fail answers, but in many other cases we will need to express "passed or failed with a higher or lower degree of confidence". Well, isn't "passed with a low confidence" actually a fail if we are really strict? On the other hand, we want to encourage users, so instead of failing most tests we say "NearlyPasses" or "ProbablyPassed" or similar to express the varying degrees of pass/fail. Is the earl:confidence property being used in another way?
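
To make the comparison concrete, here is a minimal sketch of how a validity/confidence pair could be translated into the five-level scale quoted above. The table and function names are hypothetical and not taken from the EARL schema, and the "strict" reading is only my interpretation of treating anything short of a confident pass as a fail.

```python
# Hypothetical mapping between a (validity, confidence) pair and the
# five-level scale quoted above. None of these names come from EARL itself.
GRANULAR = {
    ("pass", "high"): "Pass",
    ("pass", "medium"): "ProbablyPass",
    ("pass", "low"): "CannotTell",
    ("fail", "low"): "CannotTell",
    ("fail", "medium"): "ProbablyFail",
    ("fail", "high"): "Fail",
}


def to_granular(validity, confidence):
    """Translate a validity/confidence pair into a single granular result."""
    return GRANULAR[(validity, confidence)]


def strict(validity, confidence):
    """Strict reading (an assumption): only a high-confidence pass is a pass."""
    return "pass" if (validity, confidence) == ("pass", "high") else "fail"
```

Under this sketch both forms carry the same information, except that "pass low" and "fail low" collapse into CannotTell.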


> But here we only need to record where a test has failed.  Take as an
> example, misuse of <blockquote> for indentation.  Since this is very
> widespread, Valet starts with a premise that any <blockquote> may be
> a misuse.  But it also looks for a cite="..." attribute, on the grounds
> that any blockquote with a cite is almost certainly being used
> correctly.  If the blockquote has no cite, it is flagged as a fail
> with medium confidence (the confidence level ascribed is inevitably
> subjective here).  Crucially, if it has a cite, the test is passed,
> and _nothing_ is recorded.  It is not really productive to load a
> report full of tests that were passed!

IMHO, it is bad practice not to record which tests passed. I think this is useful "proof" of what has been tested, how, and why the tool makes certain assertions. As discussed before on the list, such a comprehensive report could also be linked from the tested page so that browsers, search engines, or assistive technologies could make use of it. Anyhow, this is a separate topic.
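
For illustration only, the <blockquote> check described in the quoted text could record both outcomes roughly as follows. This is a sketch using Python's standard html.parser, not how Valet is actually implemented. The class name, the result fields, and the "high" confidence assigned to a pass are my assumptions. Only the "fail with medium confidence" for a missing cite comes from the description above.

```python
from html.parser import HTMLParser


class BlockquoteCheck(HTMLParser):
    """Hypothetical checker: records one result per <blockquote>, pass or fail."""

    def __init__(self):
        super().__init__()
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag != "blockquote":
            return
        line, _column = self.getpos()  # position of the start tag in the input
        if dict(attrs).get("cite"):
            # A non-empty cite suggests genuine quotation ("high" is an assumption).
            result = {"line": line, "validity": "pass", "confidence": "high"}
        else:
            # No cite: possible misuse for indentation, flagged with medium confidence.
            result = {"line": line, "validity": "fail", "confidence": "medium"}
        self.results.append(result)


def check(html):
    """Run the check and return every recorded result, including passes."""
    parser = BlockquoteCheck()
    parser.feed(html)
    parser.close()
    return parser.results
```

A report built this way lists every <blockquote> that was examined, so a reader can see what was tested and not only what failed.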

Regards,
  Shadi


-- 
Shadi Abou-Zahra,       Web Accessibility Specialist for Europe 
World Wide Web Consortium (W3C),             http://www.w3.org/ 
Web Accessibility Initiative (WAI),      http://www.w3.org/WAI/ 
IST WAI-TIES Project (WAI-TIES)     http://www.w3.org/WAI/TIES/ 
Evaluation and Repair Tools (ERT WG), http://www.w3.org/WAI/ER/ 
2004, Route des Lucioles BP93 - 06560 Sophia-Antipolis - France 
Voice: +33(0)4 92 38 50 64             Fax: +33(0)4 92 38 78 22
Received on Wednesday, 1 June 2005 20:50:16 UTC