Re: EARL Guideline Pass/Fail Confidence from Charles McCathieNevile on 2004-01-28 (w3c-wai-er-ig@w3.org from January 2004)

From: Charles McCathieNevile <charles@w3.org>
Date: Wed, 28 Jan 2004 04:08:16 -0500 (EST)
To: Chris Ridpath <chris.ridpath@utoronto.ca>
Cc: WAI ER IG List <w3c-wai-er-ig@w3.org>
Message-ID: <Pine.LNX.4.55.0401280358490.12367@homer.w3.org>
Hi Chris,

Shadi and I talked about this a bit yesterday.

The conclusion I think we came to was that the tool should generate results
for the individual tests that it can do, and for the rest (or for any
aggregate) it should produce a result of "cannotTell" or "notTested" as
appropriate.

The problem is that it is a bit dubious to say "well, we don't imagine you'd
do anything but what we would do, so we'll assume problems are unlikely" in
individual use cases. If somebody does do something unusual then their
results, while admitteedly divergent from the statistical average, become
entirely unreliable. And we have no way of knowing if that is the case, which
calls into question any individual result.

The model we discussed (it isn't new - it was discussed, for example, in
Bristol in 2002) was based on how to use the "heuristic" mode. If you have a
bunch of individual results, you should be able to say that you have derived
a new result, according to a rule. For example, there are several different
sets of tests around for conformance to WCAG 1.1 - some are proprietary, some
are open, some have common features and others may not.

This is essentially what it takes to generate a result like WCAG double-A
conformance anyway, especially if you want to merge results from different
sources.

There was also some discussion about how the model should work - the schema
that was published is invalid, but I think enough of it can be understood to
make it work. As I noted recently there is a bug in my intro to EARL stuff -
I hope to fix that later today. It has to do with the schema being more
complex than it needs to as far as I can tell.

Cheers

Chaals

On Tue, 27 Jan 2004, Chris Ridpath wrote:

>Yes, the confidence level was not quite right for what I was trying to
>describe.
>
>Your suggestion of creating a new class to define partial pass is much
>better. Thanks!
>
>Here's what I'm thinking of using for a 'conditional pass' (just a slight
>mod of your Partial):
>
><rdfs:Class
>rdf:about=http://www.utoronto.ca/rdf-def/mas-earl#Conditional-Pass>
>   <rdfs:subClassOf
>rdf:resource="http://www.w3.org/WAI/ER/EARL/nmg-strawman#Fail"/>
>   <rdfs:isDefinedBy
>rdf:resource="http://www.utoronto.ca/rdf-def/mas-earl#"/>
>   <rdfs:label xml:lang="en">Only passes known problems. Still has potential
>problems</rdfs:label>
>   <rdfs:comment xml:lang="en">An extension used by the ATRC accessibility
>checker to identify partially conforming content</rdfs:comment>
></rdfs:Class>
>
>We're using the term 'known problem' to describe an accessibility problem
>that  a software tool can detect. A 'potential problem' is something that a
>software tool can not detect - it requires human intervention. These terms
>may mean different things to different people and we're looking at other
>terminology.
>
>The accessibility barriers posed by the 'potential problems' are as great as
>'known problems' so the term 'potential' does not imply that the problem is
>any less severe.
>
>We're still mulling over this idea of 'conditional pass'. If a file passes
>all the checks for known problems then it likely will be OK. Potential
>problems, like missing LONGDESC, are not likely to happen.
>
>Chris
>
>
>----- Original Message -----
>From: "Charles McCathieNevile" <charles@w3.org>
>To: "Chris Ridpath" <chris.ridpath@utoronto.ca>
>Cc: "WAI ER IG List" <w3c-wai-er-ig@w3.org>
>Sent: Monday, January 26, 2004 7:33 AM
>Subject: Re: EARL Guideline Pass/Fail Confidence
>
>
>>
>> Hi,
>>
>> I don't think confidence is a particularly accurate measure. In some cases
>we
>> are saying "cannotTell" but adding a suspicion, in some cases we are
>almost
>> certain (some people seem always to be certain :-).
>>
>> I have no objection to people using a confidence scale, but I suspect that
>we
>> should look at the use cases and whether we  can say something else more
>> useful.
>>
>> (See also my recent email about the bug in my intro re using rdf:resource
>> when it should be rdf:type or something even more complex...)
>>
>> Cheers
>>
>> Chaals
>>
>> On Thu, 15 Jan 2004, Chris Ridpath wrote:
>>
>
>

Charles McCathieNevile  http://www.w3.org/People/Charles  tel: +61 409 134 136
SWAD-E http://www.w3.org/2001/sw/Europe         fax(france): +33 4 92 38 78 22
 Post:   21 Mitchell street, FOOTSCRAY Vic 3011, Australia    or
 W3C, 2004 Route des Lucioles, 06902 Sophia Antipolis Cedex, France
Received on Wednesday, 28 January 2004 04:08:16 UTC