Re: EvalTF Size of Sample from Detlev Fischer on 2012-01-05 (public-wai-evaltf@w3.org from January 2012)

From: Detlev Fischer <fischer@dias.de>
Date: Thu, 05 Jan 2012 14:02:09 +0100
To: public-wai-evaltf@w3.org
Message-ID: <4F059F51.90206@dias.de>
Just my 2 cents regarding the sampling approach:

Testing is expensive and gets more expensive with larger samples. So 
there is a trade-off between thoroughness and the viability of testing. 
For many, especially smaller and non-commercial site owners, a test with 
a cost calculated on the basis of a reasonable hourly rate quickly gets 
too expensive - and then it will not happen.

Let me simplify for the sake of the argument. It comes down to the 
expectation towards the methodology:

A. Is the ultimate aim to reveal all significant practical
    a11y issues for the chosen scope, the real barriers, without
    the imperative to detect even the slightest violation somewhere?
    Unless the site is very large and diverse, this can often be
    achieved based on a relatively small page sample composed during
    an initial survey of the site, taking the core pages/templates and
    then actively looking out for 'issues' and, of course,
    crucial processes. For a simple site, 4-6 pages may do the job.
    (For more complex ones, you will need more, of course.)

B. Is the aim to check for full compliance on the chosen level of WCAG
    rigorously, using a rather large sample (core, task-based *and*
    random), trying to make sure no violation goes undetected?

It seems clear that option B will cost a lot, with the consequence that 
only large organisations and those that absolutely must comply will 
commission such a test.

And even those sites that try hard to meet WCAG will be frustrated when 
they discover that despite their best efforts, they will often still not 
manage to achieve strict and full compliance with option B even on level 
A (think of the multitude of ways editorial content can fail SC 1.3.1, 
if criteria are applied strictly).

Why? Given the nature of diverse and multi-source content in many modern 
sites ("Web 2.0") it is obvious that the overall test result would 
usually be "fail" in terms of strict conformance, unless there is an 
agreement that minor violations need not rule out a statement of WCAG 
conformance on the chosen level (this is what we have been calling 
'tolerance metrics' without being quite clear how it would actually work).
And to be sure, getting a WCAG conformance statement and seal for the 
selected scope will often be the main incentive for site owners making a 
real effort to conform and then commissioning a test.

When testing sites that make *no* effort to conform, even a small sample 
will quickly reveal that WCAG is not met, and by a large margin. Here, 
having a larger sample may further drag down results a bit, but the 
verdict is already very clear.

I think that any practically applicable testing approach must strike a 
balance between relevance of results and the effort invested. If we 
raise the bar and try to be very strict and thorough, using just pass 
and fail, so that every failed instance fails the page and every failed 
page fails conformance for the chosen SC and scope, we are in for a very 
frustrating experience.
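
To illustrate the strict rule described above, here is a minimal Python 
sketch. The data layout, the function names and the sample figures are 
all assumptions made up for illustration; none of them come from WCAG or 
from BITV-Test. It only shows that one failing instance is enough to flip 
the verdict for the whole sample, while the share of failing instances 
may be very small.

```python
from typing import Dict, List

# Hypothetical data shape: page -> success criterion -> instance results
# (True = pass, False = fail). Invented for illustration only.
Results = Dict[str, Dict[str, List[bool]]]


def strict_verdict(results: Results, sc: str) -> bool:
    """One failing instance fails the page; one failing page fails the SC."""
    return all(all(page.get(sc, [])) for page in results.values())


def failure_share(results: Results, sc: str) -> float:
    """Share of checked instances that fail the SC across the whole sample."""
    instances = [ok for page in results.values() for ok in page.get(sc, [])]
    if not instances:
        return 0.0
    return instances.count(False) / len(instances)


# Hypothetical sample: 50 checked instances, exactly one of which fails.
sample: Results = {
    "home": {"1.3.1": [True] * 20},
    "news": {"1.3.1": [True] * 19 + [False]},
    "contact": {"1.3.1": [True] * 10},
}

print(strict_verdict(sample, "1.3.1"))            # False: the sample fails
print(round(failure_share(sample, "1.3.1"), 3))   # 0.02
```

In this made-up sample, a single failing instance out of 50 yields the 
same strict verdict that a site failing everywhere would get.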

This is why BITV-Test applies a more fine-grained approach to rating in 
order to differentiate between the pretty good, the not so good, and the 
pretty awful - compare our paper at the recent accessibility metrics 
symposium:
http://www.w3.org/WAI/RD/2011/metrics/paper7/

From a black-or-white conformance standpoint, all this talk about the 
'degree of conformance' may seem irrelevant. For the customer of a test, 
however, having a measure of the degree of accessibility based on an 
established benchmark like WCAG is important (and, of course, having a 
list of issues for the designers to work through, a list which both 
approaches can equally supply).

Regards,
Detlev


On 04.01.2012 17:08, Velleman, Eric wrote:
> Hi Kathy,
>
> This could be related to the barrier probability: would you find more if you look at more pages?
> We have to consider how comparable the results will be if we do not use what seems to be the minimum 25 pages.
> We will cover barrier probability in a later part of the Methodology.
> Kindest regards,
>
> Eric
>
> =========================
> Eric Velleman
> Technical Director
> Stichting Accessibility
> Universiteit Twente
>
> Oudenoord 325,
> 3513EP Utrecht (The Netherlands);
> Tel: +31 (0)30 - 2398270
> www.accessibility.nl / www.wabcluster.org / www.econformance.eu /
> www.game-accessibility.com/ www.eaccessplus.eu
>
> Read our disclaimer: www.accessibility.nl/algemeen/disclaimer
> Accessibility is a Member of the W3C
> =========================
>
> ________________________________________
> From: Kathy Wahlbin [kathy@interactiveaccessibility.com]
> Sent: Wednesday, 4 January 2012 15:44
> To: Velleman, Eric; public-wai-evaltf@w3.org
> Subject: RE: EvalTF Size of Sample
>
> Hi Eric -
>
> I agree with you that the three sample lists should be included.  I think
> that the number of pages will vary based on the website purpose, size and
> functions.  It will be very difficult to say that you should evaluate at
> least 25 pages on every site.
>
> That said, here are the questions that I have:
>
> 1.  Should we consider defining the number of pages in a sample based
> on different scenarios or types of websites?
>
> 2.  To define size, do we need to talk about the number of pages or could
> we define it like you have done with the three sample lists and let the
> evaluator determine from those lists how many pages should be reviewed?
>
> Kathy
>
> Phone:  978.443.0798
> Cell:  978.760.0682
> Fax:  978.560.1251
> KathyW@ia11y.com
>
>
>
> NOTICE: This communication may contain privileged or other confidential
> information. If you are not the intended recipient, please reply to the
> sender indicating that fact and delete the copy you received. Thank you.
>
> -----Original Message-----
> From: Velleman, Eric [mailto:evelleman@bartimeus.nl]
> Sent: Tuesday, January 03, 2012 6:15 PM
> To: public-wai-evaltf@w3.org
> Subject: EvalTF Size of Sample
>
> Dear all,
>
> We had a short discussion in our last telco about the size of a
> representative sample of a website.
> The discussion varied between 6 and 5000... How many pages should we have in
> a sample to be replicable and at the same time achievable in terms of time
> and money?
>
> In the updated version of the Methodology that will be online in the coming
> days I added:
> - the three sample lists that should be included into an evaluation (core,
> task-based and random). If we just take one of every resource named there,
> we have at least 25 pages.
>
> Kindest regards,
>
> Eric
>
Received on Thursday, 5 January 2012 13:09:14 UTC