RE: Confidence Claims - more discussion from Carlos Iglesias on 2005-06-01 (public-wai-ert@w3.org from June 2005)

From: Carlos Iglesias <carlos.iglesias@fundacionctic.org>
Date: Wed, 1 Jun 2005 13:58:07 +0200
To: <shadi@w3.org>, <public-wai-ert@w3.org>
Message-ID: <09700B613C4DD84FA9F2FEA521882819611752@ayalga.fundacionctic.org>

Hello all,


> On today's ERT WG call we had some more discussion about the 
> earl:confidence property and how we should be handling it in 
> EARL. It seems we could agree on the following:
> 
> 1. Even though the confidence claims are more related to test 
> case descriptions than to the results, expressing them as 
> part of the result is still an important aspect. Several 
> tools are already using this property and would like to 
> continue doing so.
>
> 2. earl:confidence is not simply a relay of the respective 
> property in the test case description (testcase:confidence as 
> a pseudo URI), but it is "processed" by the evaluation tool 
> before it is inserted into a report. An example is to 
> override the value in the test case description when a human 
> evaluator executes the test.

I think that confidence is a property of the test result, but it is
derived from the test itself (how the test was made).

Honestly, we don't use the confidence property because we find it
really confusing and it doesn't make much sense to us. We think that
this property is really subjective, and that it depends completely on
the tool or person.

* If the assertion is made by a tool, what is the confidence level? In
the case of an img without alt it's clearly high, but what happens when
we are talking about good or bad alt texts? I'm sure that we could
discuss some "real world examples" of alt text and each of us would give
a different confidence value.

* If the assertion is made by a person it will be a real mess. We all
know that different accessibility specialists have different opinions
and interpretations of several WCAG checkpoints, so this will lead to a
wide variety of confidence values.

The only case where confidence makes sense to us is when we are talking
about well-defined and proven heuristics. In this case the confidence is
really necessary: it depends entirely on the test case description, but
it affects the test result, so it's closely related to it.


> 3. earl:confidence should be based on a numeric value (such 
> as percentage or interval). The values "high", "medium", and 
> "low" should be mapped to appropriate numeric values but 
> should remain available for describing ordinal values.

As I commented before, IMO the best use case for confidence is when we
are talking about heuristics, and in this case a numeric value is
required because good accuracy is needed. If you have a numeric value, a
tool that interprets EARL could "translate" the value into something
more human-understandable (like high, medium and low). In that case it
may be necessary to have an "official suggestion" about how to map
between numeric values and h-m-l values.
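
As a rough sketch of what such a suggested mapping could look like
(the function name, the 0-to-1 scale and the thresholds 0.33 and 0.66
are only assumptions for illustration, nothing agreed by the group or
defined in EARL):

```python
def confidence_label(value, low_max=0.33, high_min=0.66):
    """Translate a numeric confidence in [0, 1] to high/medium/low.

    The thresholds are hypothetical defaults; an "official suggestion"
    would have to fix them so that all tools translate the same way.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if value >= high_min:
        return "high"
    if value > low_max:
        return "medium"
    return "low"


# Example: confidence values reported by a heuristic test
for v in (0.9, 0.5, 0.1):
    print(v, confidence_label(v))
```

Keeping the numeric value in the report and translating only for
display means no accuracy is lost for tools that need the number.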

 
> 4. There will always be differences between tool results, 
> also in earl:confidence. However, more clarity on how to 
> assign confidence values will reduce the gap that is 
> currently causing reduced interoperability of reports between tools.

As I commented previously, I think there is no way to clarify the use
of confidence because it's "tool dependent" or "human dependent", but I
don't think that its use reduces the interoperability of reports: if you
don't want it, just don't use it (and IMO this is what is going to
happen, because each tool developer will have their own "confidence
policy").

The only value that the confidence property has nowadays (excluding
heuristics) is the confidence that you have in the tool or person that
made the assertion, because it depends on them. A tool or person can
make an assertion with a high level of confidence, but (most times)
another tool or person could make the same assertion with a different
level of confidence. Which one is the best? Probably, for each person,
the best is the assertion of the tool or person that they trust.


Regards,

CI.
Received on Wednesday, 1 June 2005 11:58:50 UTC