- From: Ian Hickson <ian@hixie.ch>
- Date: Mon, 01 Jul 2002 11:45:01 +0100
- To: Charles McCathieNevile <charles@w3.org>
- Cc: w3c-wai-er-ig <w3c-wai-er-ig@w3.org>
Charles McCathieNevile wrote:
> Hmm. This was similar to something the UAAG group wanted.
>
> I would propose that we have qualitative rather than quantitative ratings.
Why? I personally would much rather both "confidence" and "severity" were
changed to use percentages (or some other scale, e.g. 0.0 .. 1.0). Having an
enumerated set of values artificially limits applications. For example, I need
four severities (100%, 90%, 50%, 0%) and two confidence levels (100%, 0%),
whereas the current spec only allows for one severity (100%) and three
confidence levels (~100%, ~66%, ~33%).
Using a numeric scale doesn't reduce interoperability, as applications can
simply "pigeon hole" values into their internal enumerated types. (The problem
of round tripping through systems that use enumerated sets is already present,
since it is most likely that applications will not have exactly matching sets of
severities and confidences.)
For example, I intend to map any Pass values with severity 80%-99% into the
"pass with unrelated errors (Yb)" category when summarising results.
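A minimal sketch of that pigeon-holing, assuming percentage severities. Only the 80%-99% "Yb" rule comes from the message; the other category labels are invented placeholders for illustration.

```python
def summarise(result: str, severity: float) -> str:
    """Pigeon-hole a (result, severity-percentage) pair into an
    application's own enumerated summary categories."""
    # The Yb band is the mapping described above; the rest is assumed.
    if result == "Pass" and 80 <= severity <= 99:
        return "pass with unrelated errors (Yb)"
    if result == "Pass":
        return "pass"
    return "fail"
```

Each consuming application would choose its own bands, which is why round-tripping through enumerated sets already loses precision.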
(Note: Internally, I store severities and confidences as integers from 0 to 255.
That is the natural computer equivalent of percentages. I would be quite happy
if EARL used the integer scale 0..255. I would also be fine with EARL using a
floating point scale between two arbitrary values, such as 0.0 and 1.0.)
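Converting between an internal 0..255 integer scale and a hypothetical 0.0..1.0 floating-point scale is a one-line mapping each way, so the choice of scale costs implementers little. A sketch, assuming those two scales:

```python
def to_float(byte_value: int) -> float:
    """Map an internal 0..255 integer onto the 0.0..1.0 scale."""
    return byte_value / 255

def to_byte(value: float) -> int:
    """Map a 0.0..1.0 value back to the nearest 0..255 integer."""
    return round(value * 255)
```

The round trip is lossless for the integers, which is the property an implementation storing bytes internally would care about.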
> For some use cases your Yb - pass with unrelated errors would count as a pass,
> and for some cases it would score as a fail. So we would need to know what
> they are.
Assuming "they" refers to the "unrelated errors" then yes, you would; that's
what the "comments" field is for, presumably.
> The question also arises as to how many kinds of result we should include in
> earl and at what point we should leave people to subclass them for their own
> more detailed uses.
I think you only need:
Pass
Fail
Not Applicable
Not Tested
All the other values I can think of are simply variants of those four result
types with various values for "Severity" and "Confidence".
You don't need "Can't Tell" as that should just be "Pass" or "Fail" with
"Confidence: 0%". (Whether you pick "pass" or "fail" depends on which is the
"default" -- e.g. in a test where supporting the feature correctly and not even
trying to support the feature are indistinguishable, you would use "Pass", while
in a test where trying to support the feature but doing so _incorrectly_ is
indistinguishable from not supporting the feature at all you would use "Fail".)
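The convention above can be sketched as a helper that takes the test's default outcome; the dictionary shape is an assumption, not an EARL serialisation:

```python
def cant_tell(default: str) -> dict:
    """Represent an inconclusive result as the test's default outcome
    ("Pass" or "Fail", per the test's design) at 0% confidence."""
    assert default in ("Pass", "Fail")
    return {"result": default, "confidence": 0}
```

A "Can't Tell" value then never needs to exist as a distinct result type.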
Note that "Not Tested" is present only for completeness, as I expect most
applications would simply not include the result in that case.
"Not Applicable" is important for tests that neither pass nor fail, such as a
test for making sure all images have alternate text, when applied to a document
with no images, or a test to make sure that 'red' is rendered differently from
'green', on a monochromatic device.
--
Ian Hickson )\._.,--....,'``. fL
"meow" /, _.. \ _\ ;`._ ,.
http://index.hixie.ch/ `._.-(,_..'--(,_..'`-.;.'
Received on Monday, 1 July 2002 06:45:04 UTC