Re: Discussion 5.5 from Alistair Garrison on 2012-01-20 (public-wai-evaltf@w3.org from January 2012)

From: Alistair Garrison <alistair.j.garrison@gmail.com>
Date: Fri, 20 Jan 2012 11:25:28 +0100
To: Vivienne CONWAY <v.conway@ecu.edu.au>, Eval TF <public-wai-evaltf@w3.org>
Message-Id: <96F7D59C-8247-4408-9600-531C5B1E38A6@gmail.com>
Hi Vivienne, 

Great!

In answer to your first question, we developed proprietary software for the purpose of evaluating WCAG 1.0 checkpoints. Back in the day (ten or more years ago) we were initially using Bobby, but its lack of extensibility forced us to build our own tools.  Since then, of course, things have moved on greatly...

In answer to your second question about analytics - the 'most visited pages' would probably only be available to the website owner.  I suppose you would have to ask them for this information.  You would, however, have to ask for the data in a way that would not cause you any legal issues (with regard to the evaluation results you obtain) if it were found to be fraudulent / manipulated.  For example, "Does the information provided, to the best of your knowledge, provide a true and up-to-date picture of your website...".  Disclaimer: I am not a lawyer and this is not legal advice ;-)

That said, the possible damage to the reputation of someone doing such a thing presumably deters them from doing it in the first place.  You would in any case hope so...

And, of course, there's always having trust in people to do the right thing...

All the best 

Alistair

On 20 Jan 2012, at 10:18, Vivienne CONWAY wrote:

> Hi Alistair
> 
> That all sounds good, and very much like the way I work with WCAG 2.0.
> 
> However, you mentioned in the previous email that you used some sort of automated tool to locate those pages with specific content you were looking for.  Knowing how you did this would be a huge benefit when assessing very large sites as it's not easy to find things like videos unless you know where they are located.  That's what I'd be most interested in hearing.
> 
> I'd also like to get on top of using analytics to find the most visited pages as I agree that this would be really beneficial.  If you can prove to a client that you've checked the pages that 80% of their users access, they would be most interested to get those fixed.  I know there are specific analytic tools, but I'm not sure if an evaluator can use them on a client's website - doesn't it have to be done internally?  In that case, we would be relying on the client to tell us which pages are most visited and hope they give us all of them.
> 
> 
> Regards
> 
> Vivienne L. Conway, B.IT(Hons), MACS CT
> PhD Candidate & Sessional Lecturer, Edith Cowan University, Perth, W.A.
> Director, Web Key IT Pty Ltd.
> v.conway@ecu.edu.au
> v.conway@webkeyit.com
> Mob: 0415 383 673
> 
> This email is confidential and intended only for the use of the individual or entity named above. If you are not the intended recipient, you are notified that any dissemination, distribution or copying of this email is strictly prohibited. If you have received this email in error, please notify me immediately by return email or telephone and destroy the original message.
> ________________________________________
> From: Alistair Garrison [alistair.j.garrison@gmail.com]
> Sent: Friday, 20 January 2012 6:10 PM
> To: Vivienne CONWAY; Eval TF
> Subject: Re: Discussion 5.5
> 
> Hi Vivienne,
> 
> When assessing WCAG 1.0 - but it is certainly relevant to WCAG 2.0 also...
> 
> We used to collect 20 - 25 representative pages from the website (trying to cover as many checkpoints as possible) - to include Home page, Contact page, top pages in sections and pages representative of different templates. We would also download a large number of the pages in a website.
> 
> We would start testing against the 20 - 25 pages, which for most checkpoints gave representative results - however, for content such as videos, tables, etc... (which may not have been found in the 20 - 25 pages sample) we would find pages which contained such content amongst all the pages downloaded - looking especially for failing content.  These additional pages then became part of the sample, and were also checked against all other checkpoints.
> 
> Using this approach meant you could be satisfied (as far as reasonably possible) that you had tried your best to find at least some examples of all relevant content - if they existed.
> 
> In a nutshell, we were not looking to assess all content in all pages, just 100% of the content in the core sample (20-25 pages) and additional 'specific content type' pages (15-20 pages).
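
The tool described above was proprietary and is not shown in this thread. What follows is only a minimal sketch of the general idea, assuming the downloaded pages are held as a mapping of URL to HTML text. The names `CONTENT_TAGS`, `content_types_in` and `extend_sample`, and the choice of which elements signal which content type, are hypothetical and chosen purely for illustration.

```python
from html.parser import HTMLParser

# Hypothetical mapping: content type -> HTML elements assumed to signal it.
CONTENT_TAGS = {
    "video": {"video", "object", "embed"},
    "table": {"table"},
    "form": {"form"},
}


class TagCollector(HTMLParser):
    """Records the (lowercased) name of every start tag seen in a page."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        self.tags.add(tag)


def content_types_in(html):
    """Return the set of content types whose signalling elements occur in html."""
    collector = TagCollector()
    collector.feed(html)
    return {name for name, tags in CONTENT_TAGS.items() if collector.tags & tags}


def extend_sample(core_sample, downloaded, per_type=2):
    """Return the core sample plus extra pages for content types it lacks.

    core_sample: list of URLs chosen by the evaluator.
    downloaded:  dict of URL -> HTML for all pages fetched from the site.
    per_type:    maximum number of extra pages added per missing content type.
    """
    covered = set()
    for url in core_sample:
        covered |= content_types_in(downloaded.get(url, ""))

    sample = list(core_sample)
    for name in sorted(CONTENT_TAGS):
        if name in covered:
            continue  # the core sample already contains this content type
        extra = [
            url
            for url in sorted(downloaded)
            if url not in sample and name in content_types_in(downloaded[url])
        ]
        sample.extend(extra[:per_type])
    return sample
```

Sorting both the content types and the URLs keeps the result deterministic, so two evaluators running the same scan over the same downloaded pages would arrive at the same extended sample. Detecting an element only shows that content of that type exists on a page; it says nothing about whether the content passes or fails.
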
> 
> 99% of all websites failed initially.  They were then given a certain amount of time to correct specified known issues (and improve areas of deficiency across the website), before a re-assessment - which I believe everyone was happy with (people pay evaluators to find issues - so they can be fixed).
> 
> In hindsight, the 20-25 pages should really have been the top 20-25 visited pages (from analytics) - as this requires no judgement, by an evaluator, and is indisputable...  Doing this would have gone a long way to answering Kerstin's "political parties website problem" (AW:Evaluator Errors) - that, and a level of independence / professionalism ;-)
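
Selecting the most visited pages from analytics can be sketched as follows. This is an illustrative assumption, not something used in the thread: it presumes the website owner supplies page-view counts as a mapping of URL to count, and the function name `top_pages` is hypothetical. Note that a share of page views is not the same thing as a share of users.

```python
def top_pages(page_views, n=25):
    """Return the n most viewed URLs and the share of all views they cover.

    page_views: dict of URL -> view count, as supplied by the site owner.
    Ties are broken alphabetically by URL so the result is repeatable.
    """
    ranked = sorted(page_views.items(), key=lambda item: (-item[1], item[0]))
    chosen = ranked[:n]
    total = sum(page_views.values())
    share = sum(views for _, views in chosen) / total if total else 0.0
    return [url for url, _ in chosen], share
```

For example, with hypothetical counts of 500, 300, 150 and 50 views for four pages, `top_pages(views, n=2)` selects the first two pages, which together account for 800 of 1000 views.
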
> 
> Hope this helps...
> 
> Alistair
> 
> On 20 Jan 2012, at 08:05, Vivienne CONWAY wrote:
> 
>> Hi Alistair
>> I'm not sure what type of automated approach you would use to select the pages which contain certain criteria.  Could you give me some examples?
>> 
>> 
>> Regards
>> 
>> Vivienne L. Conway, B.IT(Hons), MACS CT
>> ________________________________________
>> From: Alistair Garrison [alistair.j.garrison@gmail.com]
>> Sent: Friday, 20 January 2012 2:17 AM
>> To: Detlev Fischer; Eval TF
>> Subject: Re: Discussion 5.5
>> 
>> Hi Detlev,
>> 
>> At this stage I was talking only about an approach to finding relevant content - trying to define a replicable evaluation methodology is the big picture into which it sits (and I hope our goal).
>> 
>> One approach for finding relevant content (as has been mentioned) might be to simply select a sample of top pages (let's say 20 - to include home, etc...) and then use an automated approach for finding other pages which contain content relevant to each criterion being evaluated.  Just an example, but for a website this type of methodical approach would at least lead to a level of consistency - and a reduced margin of error when we say content of the type x does not exist, therefore certain criteria are not applicable.
>> 
>> Hope this clarifies things for you.
>> 
>> All the best
>> 
>> Alistair
>> 
>> On 19 Jan 2012, at 17:48, Detlev Fischer wrote:
>> 
>>> On 19.01.2012 16:06, Alistair Garrison wrote:
>>>> Hi Eric, Eval TF,
>>>> 
>>>> If we define a methodical approach for finding relevant content -
>>>> two people using this approach on the same set of pages should
>>>> discover the same errors.
>>> 
>>> Hi Alistair,
>>> 
>>> I think that kind of replicability can only be achieved if there is a detailed description in the test procedure of what to check and in what context. This would need to include the setting of a benchmark test suite (hardware, browser(s) and version(s) used) - even, for some checks, viewport size - and checks would only be replicable (ideally) if another tester uses the same suite.
>>> If we abstain from that, fine, but I can't see how one might discover the same errors without being specific. Example: Text zoom may work fine in a certain browser and viewport size, and lead to overlaps in another setting where all possible techniques for text resize may fail SC 1.4.4.
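
One way to make such a benchmark suite explicit is to record it alongside the results, so a second tester can see at a glance where their setup differs. The sketch below is only an illustration; the `Environment` record, its field names and the `differences` helper are hypothetical and not part of any agreed methodology.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Environment:
    """Hypothetical record of the setup used for a set of checks."""
    hardware: str
    browser: str
    browser_version: str
    viewport: tuple  # (width, height) in pixels


def differences(a, b):
    """Return the names of the fields in which two environments differ."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]
```

An empty list from `differences` means both testers used the same recorded suite; a non-empty list names the settings (for example the viewport) that could explain diverging results for checks such as text resize.
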
>>> 
>>> Or how would you go about achieving replicability? I am not sure I understand your approach.
>>> 
>>> Regards,
>>> Detlev
>>> 
>>>> If after using this methodical approach no more relevant content can be found, and there are no errors in the relevant content - what is under test must have passed, leading an evaluator to say if something conforms.
>>>> 
>>>> However, there is still uncertainty about any further undiscovered content - that doubt stems from how effective our methodical approach was in the first place, and how foolproof it was to implement.  Ensuring it is the best it can be is our responsibility.  I suppose an error margin might be expressed for our methodical approach - we could say using this approach should find 95% of all content - should be 99%...
>>>> 
>>>> However, an evaluator would still need to have some sort of disclaimer.
>>>> 
>>>> Thoughts to inject into the telecon.
>>>> 
>>>> Alistair
>>>> 
>>>> On 19 Jan 2012, at 15:24, Velleman, Eric wrote:
>>>> 
>>>>> This could mean that it is practically impossible to reach full conformance with WCAG 2.0... A good evaluator can always find an error somewhere, in my experience. We may have to accept that people make errors. Everything has an error margin. Even safety requirements have an error margin... Even the chip industry and LCD panels have error margins.
>>>>> Kindest regards,
>>>>> 
>>>>> Eric
>>>>> 
>>>>> 
>>>>> 
>>>>> ________________________________________
>>>>> From: Alistair Garrison [alistair.j.garrison@gmail.com]
>>>>> Sent: Thursday, 19 January 2012 14:19
>>>>> To: Velleman, Eric; Eval TF
>>>>> Subject: Re: Discussion 5.5
>>>>> 
>>>>> Dear Eric, Eval TF,
>>>>> 
>>>>> I vote not to allow error margins - for the reason I outlined in my previous email on this subject.
>>>>> 
>>>>> Instead, I would continue to support a simple disclaimer such as "The evaluator has tried their hardest to minimise the margin for error by actively looking for all content relevant to each technique being assessed which might have caused a fail."
>>>>> 
>>>>> Occam's razor - simplest is best...
>>>>> 
>>>>> Alistair
>>>>> 
>>>>> On 19 Jan 2012, at 13:58, Velleman, Eric wrote:
>>>>> 
>>>>>> Dear all,
>>>>>> 
>>>>>> For the Telco today:
>>>>>> We have seen a lot of discussion on 5.5 Error Margin. As indicated in the discussion, it also depends on other things like the size of the sample, the complexity of the website, the qualities of the evaluator, the use of tools (for collecting pages, making a first check), etc. But we need to agree on:
>>>>>> 
>>>>>> Do we allow errors or not?
>>>>>> 
>>>>>> If not, life is easy
>>>>>> If yes, we need to describe under what conditions
>>>>>> 
>>>>>> Kindest regards,
>>>>>> 
>>>>>> Eric
>>>>>> 
>>>>>> =========================
>>>>>> Eric Velleman
>>>>>> Technical Director
>>>>>> Stichting Accessibility
>>>>>> Universiteit Twente
>>>>>> 
>>>>>> Oudenoord 325,
>>>>>> 3513EP Utrecht (The Netherlands);
>>>>>> Tel: +31 (0)30 - 2398270
>>>>>> www.accessibility.nl / www.wabcluster.org / www.econformance.eu /
>>>>>> www.game-accessibility.com/ www.eaccessplus.eu
>>>>>> 
>>>>>> Read our disclaimer: www.accessibility.nl/algemeen/disclaimer
>>>>>> Accessibility is a Member of the W3C
>>>>>> =========================
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
>> This e-mail is confidential. If you are not the intended recipient you must not disclose or use the information contained within. If you have received it in error please return it to the sender via reply e-mail and delete any record of it from your system. The information contained within is not the opinion of Edith Cowan University in general and the University accepts no liability for the accuracy of the information provided.
>> 
>> CRICOS IPC 00279B
Received on Friday, 20 January 2012 10:26:05 UTC