Re: possible use of test assertions in defining/expressing requirements?

Hello Denis and EVAL TF folks,

What Denis describes is very similar to the approach taken by the German 
BITV-Test (http://www.bitvtest.eu/bitv_test/intro/overview.html), in 
that there is likewise a number of single tests, grouped under SCs, that 
add up to 100 points.

(Eric, maybe you want to include links to BITV-Test and the mapping link 
below as references to extant test procedures / methodologies?)

We likewise use 90 points as the line above which a site is considered 
to have good accessibility. There is, however, a means to downgrade the 
entire result to "badly accessible" if critical accessibility flaws have 
been encountered, even if the points would add up to more than 90 (which 
doesn't happen in practice, however - sites with critical flaws usually 
fail quite a few other SCs as well).
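The scoring rule just described could be sketched roughly like this (a minimal illustration, not the actual BITV-Test implementation; the function name, verdict strings, and the exact boundary handling at 90 points are my assumptions):

```python
# Illustrative sketch only: checkpoint points add up to a total out of 100,
# a site above the 90-point line rates as good accessibility, but any
# critical flaw downgrades the overall verdict regardless of the score.

def rate_site(checkpoint_points, has_critical_flaw):
    """Combine per-checkpoint points (summing to at most 100) into a verdict."""
    total = sum(checkpoint_points)
    if has_critical_flaw:
        # A critical flaw overrides the numeric score.
        return total, "badly accessible"
    if total >= 90:
        return total, "good accessibility"
    return total, "not sufficiently accessible"

print(rate_site([40, 35, 20], False))  # (95, 'good accessibility')
print(rate_site([40, 35, 20], True))   # (95, 'badly accessible')
```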

For the latest version of the test I have attempted a complete mapping 
of WCAG 2.0 techniques and failures to our checkpoints:
http://www.bitvtest.eu/mapping-complete

The one area where we have taken a different approach is screen reader / 
AT testing, for both conceptual and practical reasons.

Conceptual: AT output and accessibility support vary a lot across ATs, 
which limits the relevance of any specific test. Of course one could 
define a reference installation of UA/AT, but results will often not 
carry over to the majority of the installed base, especially regarding 
dynamic content and the less well supported WAI-ARIA properties.

Practical: Our test only requires accessibility knowledge and HTML/CSS 
skills, and uses free tools. No JAWS licence or JAWS skills needed. 
Extending a test to require a working knowledge of AT raises the bar 
considerably and/or limits the pool of testers qualified enough to do 
both the expert test and the screen reader test. Alternatively, you have 
two separate tests, one carried out by the expert and one by a screen 
reader user, and a somewhat tricky mapping is needed to reconcile and 
combine the results. Doable, but quite complex and expensive.

And cost is an important aspect: just looking at our own test, a full 
tandem test costs 1,200 Euro or more (and that barely covers the actual 
effort at acceptable rates). Many organisations interested in testing do 
not have big budgets, so making the test more complex by mandating AT 
tests has the drawback of increasing cost. The effect would be that even 
more organisations that would like a test forgo it simply because they 
cannot afford it. This already happens at the current cost level.

Detlev

Am 08.09.2011 15:48, schrieb Denis Boudreau:

> Hello EvalTF folks,
>
> Trying to catch up with the threads. I wanted to first jump in on the
> idea of "percentage of compliance" that was brought up a little while back.
>
> We've used this idea of "percentage of compliance" since 2003. At first,
> we didn't want to go there because compliance is a very binary concept:
> you either comply or you don't. But it turned out that our clients needed
> to know/understand where they were positioning themselves with regard to
> some sort of "accessibility goal".
>
> We quickly realized we needed to come up with a way for them to understand
> how much they had accomplished and how much still needed to be done in
> order to reach "full compliance". Hence, a score from 1 to 100 seemed like
> the right thing to do, something everyone would appreciate and understand.
>
> Like most of you I'm sure, we came up with a series of tests that were
> mapped to the different success criteria, grouped under different
> guidelines. As years went by and WCAG 2.0 became more and more likely to
> make it into a full blown recommendation, we ended up grouping all these
> tests under each SC, which in turn, were grouped under Guidelines, and then
> grouped under principles. We came up with a form of weighting that allowed
> us to determine (all too subjectively) which tests or SC were "more
> important" than others and brought the results over to 100 points.
>
> We then decided that a website could be deemed "accessible enough" if each
> and every page audited got at least 90%. We made sure functional testing
> with screen readers was an integral part of this process and that this
> test was worth 10 points out of the 100. That way, it was guaranteed that
> in order to meet our qualification level, a website would prove to be a
> positive experience using various screen readers. This is still what we're
> doing to this day, with each relevant standard we audit on.
>
> The problem with this method is that we're supporting the idea that
> compliance can be scaled, when in fact it cannot. In reality, a really
> accessible website that would score 99% on our evaluation would be highly
> accessible, no doubt about that and most users would not experience any
> problem using it. However, if it's missing that 1%, it is not compliant
> with WCAG 2.0 as a whole even though it is compliant with probably all
> SC but one.
>
> We never found a way to address this problem, and no matter what
> methodology this group ends up building or using, I think it would be
> great to make sure it doesn't make the same "mistake" we did, as it sends
> the wrong message out there.
>
> Best,
>
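For what it's worth, the weighted scheme Denis describes in the quoted mail could be sketched roughly like this (purely illustrative; the test IDs, weights, and the normalization step are my assumptions, not his actual tool):

```python
# Illustrative sketch only: each test is mapped to an SC and given a
# (subjective) relative weight; weights are normalized so the totals add
# up to 100 points, and a page is deemed "accessible enough" at 90% or more.

def score_page(test_results, weights):
    """test_results: test_id -> pass fraction in [0, 1];
    weights: test_id -> relative weight. Returns a score out of 100."""
    total_weight = sum(weights.values())
    return sum(100 * (weights[t] / total_weight) * test_results[t]
               for t in weights)

# Hypothetical page: one SC half-met, screen reader test worth 10 points.
results = {"sc_1_1_1": 1.0, "sc_1_4_3": 0.5, "screen_reader": 1.0}
weights = {"sc_1_1_1": 60, "sc_1_4_3": 30, "screen_reader": 10}
print(round(score_page(results, weights)))  # 85 -> below the 90% bar
```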


-- 
---------------------------------------------------------------
Detlev Fischer PhD
DIAS GmbH - Daten, Informationssysteme und Analysen im Sozialen
Geschäftsführung: Thomas Lilienthal, Michael Zapp

Telefon: +49-40-43 18 75-25
Mobile: +49-157 7-170 73 84
Fax: +49-40-43 18 75-19
E-Mail: fischer@dias.de

Anschrift: Schulterblatt 36, D-20357 Hamburg
Amtsgericht Hamburg HRB 58 167
Geschäftsführer: Thomas Lilienthal, Michael Zapp
---------------------------------------------------------------

Received on Thursday, 8 September 2011 15:25:19 UTC