RE: AW: Requirements draft from Vivienne CONWAY on 2011-09-13 (public-wai-evaltf@w3.org from September 2011)

From: Vivienne CONWAY <v.conway@ecu.edu.au>
Date: Tue, 13 Sep 2011 12:19:12 +0800
To: RichardWarren <richard.warren@userite.com>, Kerstin Probiesch <k.probiesch@googlemail.com>, "'Detlev Fischer'" <fischer@dias.de>, "public-wai-evaltf@w3.org" <public-wai-evaltf@w3.org>
Message-ID: <8AFA77741B11DB47B24131F1E38227A98CAAEDA2C4@XCHG-MS1.ads.ecu.edu.au>
Hi all
I am in agreement with leaving R03 and R04 in as they stand.


Regards

Vivienne L. Conway
________________________________________
From: public-wai-evaltf-request@w3.org [public-wai-evaltf-request@w3.org] On Behalf Of RichardWarren [richard.warren@userite.com]
Sent: Monday, 12 September 2011 10:12 PM
To: Kerstin Probiesch; 'Detlev Fischer'; public-wai-evaltf@w3.org
Subject: Re: AW: Requirements draft

Hi Both, and all,
I am very concerned that if we start off with "wishy-washy" requirements we
will not be able to deliver a standardised methodology. I believe our task
is to create clear and replicable methods. If, by doing so, we set the bar
too high then (and only then) we can explore "tolerance" and "scoping" etc.
Remember it is the Guidelines we are evaluating, not the success criteria
(SC). No doubt our methods will employ SCs and we will have to work out how
to cope with (aggregate etc.) qualitative results.

I agree with Kerstin that *R03 and *R04 are vital and should not be dropped.

Richard

-----Original Message-----
From: Kerstin Probiesch
Sent: Monday, September 12, 2011 2:33 PM
To: 'Detlev Fischer' ; public-wai-evaltf@w3.org
Subject: AW: Requirements draft

Hi again Detlev, all,

I don't get the point. Part of a methodology is always managing variance.
For exactly that reason we need to follow the three main quality criteria for
tests. Reliability has different levels (I'm not sure if this is the correct
word for what I mean), e.g. high, low, or none, and of course we always deal with
the question: is it "reliable enough"? And that is just for this one criterion. I think
this was meant by "within a given tolerance", which *is* managing variance.

Kerstin

> -----Original Message-----
> From: public-wai-evaltf-request@w3.org [mailto:public-wai-evaltf-request@w3.org]
> On Behalf Of Detlev Fischer
> Sent: Monday, 12 September 2011 14:16
> To: public-wai-evaltf@w3.org
> Subject: Re: Requirements draft
>
> Hi Kerstin, hi everyone else,
>
> my point was simply that, empirically, I think replicability of something
> as complex as evaluating a website against WCAG 2.0 will be the
> exception even in the best of circumstances. That does not mean I am
> against trying to define a common basis (a methodology for testing).
> However, I believe our methodology cannot and should not be as
> deterministic as a standard like HTML or CSS. Every process involving a
> large amount of human experience and contextual judgement will produce
> variance in results. I think the methodology should *manage* rather than
> will away this variance, e.g., by coming up with credible ways of
> aggregating / arbitrating / validating human testing results. So I
> believe looking critically at requirements for unique interpretation and
> replicability is the exact opposite of being content with "Tips for
> testing" - it actually raises the bar by checking theory against the
> reality of testing complex, real-world sites.
>
> Detlev
>
> On 12.09.2011 13:47, Kerstin Probiesch wrote:
> > Hi Detlev, all,
> >
> > I have already commented on most of the suggested Requirements. Just a few
> > words in response to Detlev's comments, and only on two Requirements.
> > Please see Detlev's other comments in his mail, and my other
> > comments as well (in a few days). Sorry for going about it this way, but I want to
> > comment on two very important points in one paragraph.
> >
> > If we dropped R04 we would fail at least one international
> > criterion for the quality of tests in general: Reliability. Dropping R03
> > is critical for the second criterion for the quality of tests:
> > Objectivity. Without Reliability there is no Validity, which is the third
> > important criterion. If just one criterion fails, the W3C can't claim the
> > evaluation methodology as standardized. The result of our work will be
> > a *non-standardized* evaluation methodology as a Recommendation coming
> > from W3C as the main international *standards* organization. I fear the
> > result of our work will then have the character of some "Tips for
> > testing".
> >
> > Kerstin
> >
> >>> R03: Unique interpretation
> >>> Comment (RW) : I think this means that it should be unambiguous, that
> >>> means it is not open to different interpretations. I am pretty sure
> >>> that the W3C has a standard clause it uses to cover this point when
> >>> building standards etc. Hopefully Shadi can find it <Grin>. This also implies
> >>> use of standard terminology which we should be looking at as soon as
> >>> possible so that terms like “atomic testing” do not creep into our
> >>> procedures without clear / agreed definitions.
> >>
> >> DF: I have spent some time arguing that the testing of many SC is not a
> >> black & white thing (1.3.1 headings, 1.1.1 alt text, etc), especially
> >> if we aggregate results for all "atomic" (sorry) instances on a page level
> >> and use the page as unit to be evaluated. I have not seen much reaction
> >> to that by others so far.
> >> I would drop R03 as unrealistic.
> >
> >>> R04: Replicability: different Web accessibility evaluators who perform
> >>> the same tests on the same site should get the same results within a
> >>> given tolerance.
> >>> Comment (RW) : The first part is good, but I am not happy with
> >>> introducing “tolerance” at this stage. I think we should be clear
> >>> that we are after consistent, replicable tests. I think we should add a
> >>> separate requirement later for such things as “partial compliance”
> >>> and “tolerance”. See R14 below.
> >>>
> >>> *R04: Replicability: different Web accessibility evaluators who perform
> >>> the same tests on the same site should get the same results.
> >>
> >> DF: I think this will never happen UNLESS people use the same
> >> closely defined step-by-step process AND have a common / shared
> >> understanding as to what constitutes a failure or success across a
> >> range of different implementations. Even then, exact replicability will be
> >> the exception. If the method we aim for is to be generic and there is no element of
> >> arbitration between testers and no validation by a (virtual) community,
> >> there is no chance of replicability, in my opinion.
> >> I would drop R04 as unrealistic.
> >
> >
>
>
> --
> ---------------------------------------------------------------
> Detlev Fischer PhD
> DIAS GmbH - Daten, Informationssysteme und Analysen im Sozialen
> Management: Thomas Lilienthal, Michael Zapp
>
> Phone: +49-40-43 18 75-25
> Mobile: +49-157 7-170 73 84
> Fax: +49-40-43 18 75-19
> E-Mail: fischer@dias.de
>
> Address: Schulterblatt 36, D-20357 Hamburg
> Hamburg District Court (Amtsgericht Hamburg), HRB 58 167
> Managing directors: Thomas Lilienthal, Michael Zapp
> ---------------------------------------------------------------

Received on Tuesday, 13 September 2011 04:20:51 UTC