- From: Nils Ulltveit-Moe <nils@u-moe.no>
- Date: Wed, 13 Apr 2005 21:09:16 +0200
- To: shadi@w3.org, public-wai-ert@w3.org
- Cc: 'Charles McCathieNevile' <charles@sidar.org>, public-wai-ert@w3.org
Hi Shadi,

I agree that a confidence interval is probably not interesting reading for most end users. However, it may be interesting for statisticians who want to know how far the result of a test can be trusted.

It is always possible to choose between three values (Pass, Fail or notApplicable), but how confident you are that a Yes is indeed a Yes may vary. Most humans are not able to specify exactly how confident they are that an accessibility claim is indeed a real problem for accessibility. They may be able to give a rough indication, though. For weak accessibility claims, one person would say that the test is Pass with 50% confidence, and another would say Fail with 50% confidence. Both may be right from their point of view, and there is no correct answer. Adding the confidence in this case tells statisticians that this value should not be weighted as heavily as a result that the assertor strongly believes is an accessibility issue. Heuristic algorithms would work in a similar way.

What I am trying to say is that it is not always possible to abstract away the uncertainties that are inherent in the problem or test case being investigated, and in such cases it is better to have a model that includes the uncertainty than to pretend that the uncertainty is not there.

From what I have discussed here, I am actually getting more convinced that a confidence interval or similar is useful, and not only for automatic tests. If a confidence interval were also used for manual tests, one would get feedback on how real people perceive different accessibility problems, which in turn could be used by W3C to improve the WCAG checklists.

I am not so afraid of EARL becoming a more complex protocol. After all, it is intended to be machine readable and not directly consumed by humans. If different degrees of complexity in the protocol are to be allowed, that should be indicated by whether parameters are mandatory or not.
RDF is graceful to work with in that sense, because EARL-producing tools must implement the mandatory part of EARL and may implement optional features where applicable. It is easy for EARL-consuming tools to ignore the parameters they do not understand, because all RDF-aware tools are able to traverse the RDF graph and pick out the parts they understand.

The protocol should be designed to be sufficiently complex, but not bloated. That may also help in finding extended use for the protocol in areas other than accessibility testing.

My conclusion is that the confidence interval should perhaps not be mandatory, but I think it should be optional in EARL. And it should be modelled as a probability, i.e. a real number between 0 and 1.

Best regards,
Nils Ulltveit-Moe

On Wed, 13.04.2005 at 17:36 +0200, Shadi Abou-Zahra wrote:
> Hi,
>
> Frankly, I see havoc and confusion upon thy users. :)
>
> We are talking about cascades of test cases, assertors, and possibly subjects too. Complex but useful. However, several results? Which one should a tool that is processing tools pick?
>
> It seems to me that it may be a better approach to rework the model for deriving/communicating the confidence level and keep one unambiguous result per assertion.
>
> Cheers,
> Shadi
>
>
> -----Original Message-----
> From: public-wai-ert-request@w3.org On Behalf Of Charles McCathieNevile
> Sent: Wednesday, April 13, 2005 16:34
> To: public-wai-ert@w3.org
> Subject: result type="foo", confidence, ...
>
>
>
> Hi folks,
>
> in the current EARL spec there are results which look like the following:
>
> <earl:result rdf:parseType="Resource">
>   <earl:validity rdf:resource="&earl;fail"/>
>   <earl:confidence rdf:resource="&earl;high"/>
>   <earl:message>malformed element in line 23</earl:message>
> </earl:result>
>
> This makes it possible to put two results on the same Assertion - for
> example to assert that they have a different probability, or the assertor
> has a different level of confidence in them.
>
> <earl:result rdf:parseType="Resource">
>   <earl:validity rdf:resource="&earl;notApplicable"/>
>   <earl:confidence rdf:resource="&earl;low"/>
>   <earl:message>malformed element in line 23</earl:message>
> </earl:result>
>
> I am not sure if we want to maintain this possibility, but it provides a
> feasible explanation of what I was copying when I wrote up my examples for
> "EARL by example" [1], and it is how Hera currently produces EARL.
>
> Any thoughts?
>
> cheers
>
> Chaals
>
> [1] http://www.w3.org/2001/sw/Europe/talks/200311-earl/all
>

--
Nils Ulltveit-Moe <nils@u-moe.no>
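
The weighting idea in the message above (a result reported with low confidence should count for less than one the assertor strongly believes in) can be illustrated with a minimal Python sketch. It assumes confidence is an optional probability between 0 and 1, as the message proposes. The function name `weighted_tally`, the `(outcome, confidence)` tuple layout, and the choice to treat a missing confidence as 1.0 are illustrative assumptions, not part of EARL or of any existing tool.

```python
from collections import defaultdict


def weighted_tally(assertions, default_confidence=1.0):
    """Sum the confidence per outcome so that weak claims count for less.

    `assertions` is an iterable of (outcome, confidence) pairs, where
    confidence is a probability in [0, 1] or None when the producing tool
    did not supply the optional value. Treating None as 1.0 is an
    assumption made for this sketch only.
    """
    totals = defaultdict(float)
    for outcome, confidence in assertions:
        if confidence is None:
            confidence = default_confidence
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"confidence must be in [0, 1], got {confidence}")
        totals[outcome] += confidence
    return dict(totals)


# Hypothetical reports from three assertors about the same test case.
reports = [
    ("fail", 0.5),           # weak claim
    ("pass", 0.5),           # equally weak counter-claim
    ("fail", 0.75),          # stronger claim
    ("notApplicable", None), # tool that ignores the optional parameter
]
tally = weighted_tally(reports)
print(tally)
```

Because the confidence is optional here, a consumer that does not understand it can simply read the outcomes and ignore the second field, which mirrors the mandatory/optional split argued for in the message.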
Received on Wednesday, 13 April 2005 19:05:25 UTC