AW: Requirements draft from Kerstin Probiesch on 2011-09-12 (public-wai-evaltf@w3.org from September 2011)

From: Kerstin Probiesch <k.probiesch@googlemail.com>
Date: Mon, 12 Sep 2011 13:47:53 +0200
To: "'Detlev Fischer'" <fischer@dias.de>, <public-wai-evaltf@w3.org>
Message-ID: <4e6df0cd.4329df0a.398f.2703@mx.google.com>

Hi Detlev, all,

I have already commented on most of the suggested Requirements. Here are just a few words in response to Detlev's comments, and only on two Requirements. Please see Detlev's other comments in his mail, and my other comments as well (in a few days). Sorry for going this way, but I want to comment on two very important points in one paragraph.

If we dropped R04, we would fail at least one of the international criteria for the quality of tests in general: Reliability. Dropping R03 is critical for the second criterion for the quality of tests: Objectivity. Without Reliability there is no Validity, which is the third important criterion. If just one criterion fails, the W3C can't claim that the evaluation methodology is standardized. The result of our work would be a *non-standardized* evaluation methodology as a Recommendation coming from the W3C as the main international *standards* organization. I fear the result of our work would then have the character of some "Tips for testing".

Kerstin 

> > R03: Unique interpretation
> > Comment (RW) : I think this means that it should be unambiguous, that
> > means it is not open to different interpretations. I am pretty sure
> > that the W3C has a standard clause it uses to cover this point when
> > building standards etc. Hopefully Shadi can find it <Grin>. This also implies
> > use of standard terminology which we should be looking at as soon as
> > possible so that terms like “atomic testing” do not creep into our
> > procedures without clear/agreed definitions.
> 
> DF: I have spent some time arguing that the testing of many SC is not a
> black & white thing (1.3.1 headings, 1.1.1 alt text, etc), especially
> if we aggregate results for all "atomic" (sorry) instances on a page level
> and use the page as unit to be evaluated. I have not seen much reaction
> to that by others so far.
> I would drop R03 as unrealistic.

> > R04: Replicability: different Web accessibility evaluators who perform
> > the same tests on the same site should get the same results within a
> > given tolerance.
> > Comment (RW) : The first part is good, but I am not happy with
> > introducing “tolerance” at this stage. I think we should be clear
> > that we are after consistent, replicable tests. I think we should add
> > a separate requirement later for such things as “partial compliance”
> > and “tolerance”. See R14 below.
> >
> > *R04: Replicability: different Web accessibility evaluators who perform
> > the same tests on the same site should get the same results.
> 
> DF: I think I know this will never happen UNLESS people use the same
> closely defined step-by-step process AND have a common / shared
> understanding as to what constitutes a failure or success across a
> range of different implementations. Even then, exact replicability will be
> the exception. If the method we aim for is to be generic and there is
> no element of arbitration between testers and no validation by a
> (virtual) community, there is no chance of replicability, in my opinion.
> I would drop R04 as unrealistic.

Received on Monday, 12 September 2011 11:45:57 UTC