Re: AW: AW: evaluating web applications (was Re: Canadian Treasury Board accessibility assessment methodology)

Hi Kerstin,

As expressed in the paper, the statistics function has only recently been added. So at the moment, this is an informal assessment which we will need to back up once we have more data. 

But this is what we hope to get out of the stats function:

1. Tester reliability over time: How much are individual evaluators 'off the mark' compared to the final quality-assured result? This could show an improvement over time, an interesting metric for assessing the level of qualification, especially of new and less experienced evaluators.

2. Inter-evaluator reliability: How close are the results of different evaluators assessing the same site / page sample?
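
A minimal sketch of how these two measures could be computed, assuming ratings on the 5-point scale are coded 1 to 5. The function names, the choice of mean absolute deviation for point 1 and of quadratic-weighted Cohen's kappa for point 2, and the sample ratings are all illustrative assumptions, not a description of the actual statistics function:

```python
from collections import Counter

SCALE = (1, 2, 3, 4, 5)  # assumed numeric coding of the 5-point rating scale


def mean_abs_deviation(tester, final):
    """Average distance of one tester's ratings from the final, quality-assured ratings."""
    if len(tester) != len(final) or not tester:
        raise ValueError("rating lists must be non-empty and of equal length")
    return sum(abs(t - f) for t, f in zip(tester, final)) / len(tester)


def weighted_kappa(a, b, scale=SCALE):
    """Quadratic-weighted Cohen's kappa for two evaluators rating the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(a)
    span = (max(scale) - min(scale)) ** 2
    observed = Counter(zip(a, b))        # joint counts of (rating by a, rating by b)
    marg_a, marg_b = Counter(a), Counter(b)
    disagree_obs = 0.0                   # observed weighted disagreement
    disagree_exp = 0.0                   # disagreement expected by chance
    for i in scale:
        for j in scale:
            weight = (i - j) ** 2 / span
            disagree_obs += weight * observed[(i, j)] / n
            disagree_exp += weight * marg_a[i] * marg_b[j] / (n * n)
    if disagree_exp == 0:
        # Both evaluators used the same single category throughout; kappa is
        # undefined here, and returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return 1.0 - disagree_obs / disagree_exp


# Hypothetical ratings for six checkpoints (made-up data, for illustration only)
final_result = [5, 4, 3, 5, 2, 4]
evaluator_a = [5, 3, 3, 5, 1, 4]
evaluator_b = [4, 4, 3, 5, 2, 3]

print("Tester A vs. final:", mean_abs_deviation(evaluator_a, final_result))
print("A vs. B kappa:", weighted_kappa(evaluator_a, evaluator_b))
```

Tracking the first value per evaluator across successive tests would show whether the deviation shrinks over time; the second value is the kind of reliability coefficient that could be reported for a tandem test.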

There is likely to be little test-retest reliability data, since the sites tested are usually a moving target, improved based on test results. Only rarely is the same site re-tested in a tandem test; this usually happens only after a re-launch.

A fundamental problem with all these statistics is that there is no objective benchmark to compare individual rating results against, just the arbitrated and quality-assured final evaluation result. Given the scope for interpretation in accessibility evaluation, we think this lack of objectivity is inevitable and, in the end, down to the complexity of the field under investigation and the degree of human error in all evaluation.


--
Detlev Fischer
testkreis c/o feld.wald.wiese
Borselstraße 3-7 (im Hof), 22765 Hamburg

Mobil +49 (0)1577 170 73 84
Tel +49 (0)40 439 10 68-3
Fax +49 (0)40 439 10 68-5

http://www.testkreis.de
Consulting, tests and training for accessible websites



----- Original Message -----
From: k.probiesch@googlemail.com
To: detlev.fischer@testkreis.de, peter.korn@oracle.com, shadi@w3.org
Date: 23.05.2012 10:09:44
Subject: AW: AW: evaluating web applications (was Re: Canadian Treasury Board accessibility assessment methodology)


> Hi Detlev,
> 
> the paper you mention for the Website Accessibility Metrics Online Symposium states: "Our experience shows that the 5 point graded rating scale is quite reliable." I think it would help the discussion to know what "quite reliable" means exactly (that is, the value of the reliability coefficient).
> 
> Best
> 
> Kerstin
> 
>> -----Original Message-----
>> From: detlev.fischer@testkreis.de [mailto:detlev.fischer@testkreis.de]
>> Sent: Wednesday, 23 May 2012 09:57
>> To: k.probiesch@googlemail.com; peter.korn@oracle.com; shadi@w3.org
>> Cc: public-wai-evaltf@w3.org
>> Subject: Re: AW: evaluating web applications (was Re: Canadian Treasury
>> Board accessibility assessment methodology)
>> 
>> Hi all,
>> 
>> Perhaps not surprisingly for those who have followed these discussions
>> since summer last year, I disagree with Kerstin's statement "the more
>> granular the evaluation, the less reliable it is".
>> 
>> The binary approach produces artefacts because it often forces
>> evaluators to be either too strict (failing a SC due to minor issues) or
>> too lenient (attesting conformance in spite of such issues).
>> 
>> We've tried to show the higher fidelity of a graded evaluation approach
>> in our recent paper for the Website Accessibility Metrics Online
>> Symposium 5 December 2011:
>> 
>> http://www.w3.org/WAI/RD/2011/metrics/paper7/
>> 
>> 
>> > Hi Peter, Shadi,
>> >
>> > if we were to work out "something that is different" from pass/fail,
>> > which obviously is not compliant with the conformance requirements, it
>> > would no longer be an evaluation methodology for WCAG 2.0. Of course,
>> > part of reality is imperfect software. "Imperfect" developers and
>> > "imperfect" online editors are part of reality, too. The question for
>> > me is: if we take these aspects into account, why then promote, for
>> > example, ATAG? Another problem for me is: the more granular evaluations
>> > are, the less reliable they will be.
>> >
>> > Regards
>> >
>> > Kerstin
>> >
>> >
>> >
>> > From: Peter Korn [mailto:peter.korn@oracle.com]
>> > Sent: Tuesday, 22 May 2012 23:24
>> > To: Shadi Abou-Zahra
>> > Cc: Eval TF
>> > Subject: Re: evaluating web applications (was Re: Canadian Treasury Board
>> > accessibility assessment methodology)
>> >
>> > Shadi,
>> >
>> > I don't believe one can make an effective, useful, meaningful conformance
>> > claim about many classes of web applications today. That class includes
>> > things like web mail, and many kinds of portal applications (particularly
>> > where they only employ a single URI).
>> >
>> > I do believe it will be possible to evaluate web applications for
>> > accessibility - similar to evaluating non-web applications for
>> > accessibility - but I expect we will need to do something that is
>> > different from the binary "perfection"/"imperfection" of the current
>> > conformance claim rubric. The Canadian Treasury Board example takes a
>> > step along that path in shifting from one binary
>> > "perfection"/"imperfection" statement to a two tiered, percentage
>> > collection of 38 binary "perfection"/"imperfection" statements. But we
>> > need to go further than that.
>> >
>> > I think the components of such a successful evaluation will need to:
>> > • Recognize (as EvalTF is already doing) that only a sampling/subset of
>> >   everything that a user can encounter can be effectively evaluated in a
>> >   finite and reasonable amount of time
>> > • Provide greater granularity in the evaluation reporting - one that is
>> >   designed to accommodate the reality of imperfect software while
>> >   nonetheless providing useful information to those consuming the
>> >   evaluation report such that they can make informed decisions based on it
>> > • Incorporate the concepts (as EvalTF is starting to do) of uses (or use
>> >   cases) of the application so that the evaluation is meaningful in the
>> >   context of how the web application will be used
>> >
>> > I am eager to get further into these discussions in EvalTF, some of which
>> > may be logical things to discuss as we review feedback from the public
>> > draft (including some of the Oracle feedback... :-).  And as I mentioned,
>> > we've already started exploring some of this.
>> >
>> >
>> > Peter
>> >
>> >
>> > On 5/22/2012 2:09 PM, Shadi Abou-Zahra wrote:
>> > Hi Peter,
>> >
>> > Does that mean that web applications cannot be evaluated?
>> >
>> > Best,
>> >   Shadi
>> >
>> >
>> > On 22.5.2012 20:40, Peter Korn wrote:
>> >
>> > Shadi,
>> >
>> > As the Notes & Examples under their definition of "Web page" at the
>> > bottom of the URL you circulated (below) make clear, they are looking to
>> > assess the full complexity of web applications on a Pass/Fail basis. As
>> > we've explored in recent EvalTF meetings, that is a very challenging
>> > thing to do, given how dynamic web applications can be (cf. their
>> > examples of a "Web mail program" and a "customizable portal site"). It
>> > is challenging in normal software testing to determine whether you have
>> > reached every possible code path and every possible configuration of the
>> > structure behind a single URI, let alone answer Pass/Fail for each and
>> > every WCAG A/AA SC for those.
>> >
>> >
>> > Regards,
>> >
>> > Peter
>> >
>> > On 5/22/2012 6:10 AM, Shadi Abou-Zahra wrote:
>> >
>> >  Dear Group,
>> >
>> >  Ref: <http://www.tbs-sct.gc.ca/ws-nw/wa-aw/wa-aw-assess-methd-eng.asp>
>> >
>> >  David MacDonald pointed out the accessibility assessment methodology of
>> >  the Canadian Treasury Board, in particular the scoring they use.
>> >
>> >  Best,
>> >  Shadi
>> >
>> > --
>> > Oracle <http://www.oracle.com>
>> > Peter Korn | Accessibility Principal
>> > Phone: +1 650 506 9522
>> > Oracle Corporate Architecture Group
>> > 500 Oracle Parkway | Redwood City, CA 94065
>> > ________________________________________
>> > Note: @sun.com e-mail addresses will shortly no longer function; be sure
>> > to use: peter.korn@oracle.com to reach me
>> > ________________________________________
>> > Green Oracle <http://www.oracle.com/commitment>  Oracle is committed to
>> > developing practices and products that help protect the environment
>> >
>> >

Received on Wednesday, 23 May 2012 08:31:48 UTC