Re: EvalTF Size of Sample

Hi all,

B also has my vote. It could be a good compromise that brings two different approaches together, if we work it out properly.

The costs will depend on what the test looks like in detail and whether consulting aspects are included.

Anyway, I agree that probable costs should not influence us, as Alistair said. The most important thing is the quality of the methodology.

Best

Kerstin





On 05.01.2012 at 23:54, "Velleman, Eric" <evelleman@bartimeus.nl> wrote:

> Hi Detlev and Alistair,
> 
> B also has my vote. At the same time, I see the practical issues of time and potential cost. Nevertheless, our focus is on full evaluation of conformance with WCAG 2.0.
> 
> In the discussion we had this afternoon we looked at some interesting approaches to minimize the need for many pages in the sample. One interesting approach was that if a certain number of specific pages proves to be OK, the evaluator could focus on checking only particular elements on other pages (such as data tables).
> 
> We will then need to devise clear criteria for enlarging the sample if a site is larger or more complex or has more elements and technologies.
> 
> Kindest regards,
> 
> Eric
> 
> 
> 
> ________________________________________
> From: Alistair Garrison [alistair.j.garrison@gmail.com]
> Sent: Thursday, 5 January 2012 14:59
> To: Detlev Fischer
> CC: public-wai-evaltf@w3.org
> Subject: Re: EvalTF Size of Sample
> 
> Hi Detlev, all,
> 
> B gets my vote...
> 
> As WCAG 2.0 is meant to be testable, the need for our judgement as to what constitutes a pass or fail will hopefully be diminished.  If we don't test rigorously, and don't actively look for content which is relevant to each technique / checkpoint, we will not, to my mind, be doing a proper "replicable" job or paying due respect to WCAG's audience.  Allowing leeway with regard to compliance might be appropriate for the award of proprietary accessibility seals or badges, but surely this W3C WCAG 2.0 Evaluation Methodology should aim to determine, as far as absolutely possible, a website's conformance with the chosen level of WCAG 2.0 or with the website's conformance claim.
> 
> Lastly, I would like to say that cost should not be used to influence this methodology - things change, things get optimised and there will always be someone willing to do something for less - if the methodology is easy to use, a client could even do it themselves. To my mind, let's concentrate on the best way to determine a website's conformance with WCAG 2.0.  Again, B gets my vote...
> 
> All the best
> 
> Alistair
> 
> On 5 Jan 2012, at 14:02, Detlev Fischer wrote:
> 
>> Just my 2 cents regarding the sampling approach:
>> 
>> Testing is expensive and gets more expensive with larger samples. So there is a trade-off between thoroughness and the viability of testing. For many, especially smaller and non-commercial site owners, a test with a cost calculated on the basis of a reasonable hourly rate quickly gets too expensive - and then it will not happen.
>> 
>> Let me simplify for the sake of the argument. It comes down to the expectation towards the methodology:
>> 
>> A. Is the ultimate aim to reveal all significant practical
>>  a11y issues for the chosen scope, the real barriers, without
>>  the imperative to detect even the slightest violation somewhere?
>>  Unless the site is very large and diverse, this can often be
>>  achieved based on a relatively small page sample composed during
>>  an initial survey of the site, taking the core pages/templates and
>>  then actively looking out for 'issues' and, of course,
>>  crucial processes. For a simple site, 4-6 pages may do the job.
>>  (For more complex ones, you will need more, of course.)
>> 
>> B. Is the aim to check for full compliance on the chosen level of WCAG
>>  rigorously, using a rather large sample (core, task-based *and*
>>  random), trying to make sure no violation goes undetected?
>> 
>> It seems clear that option B will cost a lot, with the consequence that only large organisations and those that absolutely must comply will commission such a test.
>> 
>> And even those sites that try hard to meet WCAG will be frustrated when they discover that despite their best efforts, they will often still not manage to achieve strict and full compliance with option B even on level A (think of the multitude of ways editorial content can fail SC 1.3.1, if criteria are applied strictly).
>> 
>> Why? Given the nature of diverse and multi-source content in many modern sites ("Web 2.0") it is obvious that the overall test result would usually be "fail" in terms of strict conformance, unless there is an agreement that minor violations need not rule out a statement of WCAG conformance on the chosen level (this is what we have been calling 'tolerance metrics' without being quite clear how it would actually work).
>> And to be sure, getting a WCAG conformance statement and seal for the selected scope will often be the main incentive for site owners making a real effort to conform and then commissioning a test.
>> 
>> When testing sites that make *no* effort to conform, even a small sample will quickly reveal that WCAG is not met, and by a large margin. Here, having a larger sample may further drag down results a bit, but the verdict is already very clear.
>> 
>> I think that any practically applicable testing approach must strike a balance between relevance of results and the effort invested. If we raise the bar and try to be very strict and thorough, using just pass and fail so that every failed instance fails the page and every failed page fails conformance for the chosen SC and scope, we are in for a very frustrating experience.
>> 
>> This is why BITV-Test applies a more fine-grained approach to rating in order to differentiate between the pretty good, the not so good, and the pretty awful - compare our paper at the recent accessibility metrics symposium:
>> http://www.w3.org/WAI/RD/2011/metrics/paper7/
>> 
>> From a black-or-white conformance standpoint, all this talk about the 'degree of conformance' may seem irrelevant. For the customer of a test, however, having a measure of the degree of accessibility based on an established benchmark like WCAG is important (and, of course, having a list of issues for the designers to work through, a list which both approaches can equally supply).
>> 
>> Regards,
>> Detlev
>> 
>> 
>> On 04.01.2012 17:08, Velleman, Eric wrote:
>>> Hi Kathy,
>>> 
>>> This could be related to the barrier probability: would you find more if you look at more pages?
>>> We have to consider how comparable the results will be if we do not use what seems to be the minimum 25 pages.
>>> We will cover barrier probability in a later part of the Methodology.
>>> Kindest regards,
>>> 
>>> Eric
>>> 
>>> =========================
>>> Eric Velleman
>>> Technical Director
>>> Stichting Accessibility
>>> Universiteit Twente
>>> 
>>> Oudenoord 325,
>>> 3513EP Utrecht (The Netherlands);
>>> Tel: +31 (0)30 - 2398270
>>> www.accessibility.nl / www.wabcluster.org / www.econformance.eu /
>>> www.game-accessibility.com/ www.eaccessplus.eu
>>> 
>>> Read our disclaimer: www.accessibility.nl/algemeen/disclaimer
>>> Accessibility is a Member of the W3C
>>> =========================
>>> 
>>> ________________________________________
>>> From: Kathy Wahlbin [kathy@interactiveaccessibility.com]
>>> Sent: Wednesday, 4 January 2012 15:44
>>> To: Velleman, Eric; public-wai-evaltf@w3.org
>>> Subject: RE: EvalTF Size of Sample
>>> 
>>> Hi Eric -
>>> 
>>> I agree with you that the three sample lists should be included.  I think
>>> that the number of pages will vary based on the website purpose, size and
>>> functions.  It will be very difficult to say that you should evaluate at
>>> least 25 pages on every site.
>>> 
>>> That said, here are the questions that I have:
>>> 
>>> 1.  Should we consider defining the number of pages in a sample based
>>> on different scenarios or types of websites?
>>> 
>>> 2.  To define size, do we need to talk about the number of pages or could
>>> we define it like you have done with the three sample lists and let the
>>> evaluator determine from those lists how many pages should be reviewed?
>>> 
>>> Kathy
>>> 
>>> Phone:  978.443.0798
>>> Cell:  978.760.0682
>>> Fax:  978.560.1251
>>> KathyW@ia11y.com
>>> 
>>> 
>>> 
>>> 
>>> -----Original Message-----
>>> From: Velleman, Eric [mailto:evelleman@bartimeus.nl]
>>> Sent: Tuesday, January 03, 2012 6:15 PM
>>> To: public-wai-evaltf@w3.org
>>> Subject: EvalTF Size of Sample
>>> 
>>> Dear all,
>>> 
>>> We had a short discussion in our last teleconference about the size of a
>>> representative sample of a website.
>>> The suggestions ranged from 6 to 5000 pages. How many pages should we have in
>>> a sample to be replicable and at the same time achievable in terms of time
>>> and money?
>>> 
>>> In the updated version of the Methodology that will be online in the coming
>>> days I added:
>>> - the three sample lists that should be included in an evaluation (core,
>>> task-based and random). If we just take one of every resource named there,
>>> we have at least 25 pages.
>>> 
>>> Kindest regards,
>>> 
>>> Eric
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
> 
> 
> 
> 

Received on Thursday, 5 January 2012 23:55:42 UTC