
RE: some initial questions from the previous thread

From: <fischer@dias.de>
Date: Tue, 23 Aug 2011 17:04:07 +0200
Message-ID: <20110823170407.15294gbkdn6bob53@webmail.dias.de>
To: public-wai-evaltf@w3.org
Quoting Vivienne CONWAY <v.conway@ecu.edu.au>:

> Hi all
> Just thought I'd weigh in on this one as I'm currently puzzling over  
> the issue of how to score websites.  I'm just about to start a  
> research project where I'll have over 100 websites assessed monthly  
> over a period of 2+ years.

If you will be doing this on your own or without a team, this work  
programme translates to checking 4-5 sites per working day! And if  
the compliance level is AA, you will probably need to focus on some  
key requirements, especially those where a failure would make a site  
completely inaccessible to some user group. Just looking at the WCAG  
success criteria, these may be the ones which most often exclude  
people, ordered by importance based on testing experience (feel free  
to disagree):

* Lack of keyboard accessibility (SC 2.1.1, 2.1.2)
* Important images, such as controls, without alt text (SC 1.1.1)
* CAPTCHAs without an alternative (SC 1.1.1)
* Lack of captions in videos (SC 1.2.2, 1.2.4)
* Very low contrast of text (SC 1.4.3; a quick check is sketched below)
* Poor or no visibility of keyboard focus (SC 2.4.7)
* Important controls implemented as background images without text
   replacement (SC 1.1.1)
* Important fields (such as a search text input) without labels (SC 2.4.6)
* Lack of structure (e.g. no or inconsistent headings) (SC 1.3.1)
* Self-starting / unstoppable animations, carousels, etc. (SC 2.2.1, 2.2.2)

Well, having written this, it may seem a bit arbitrary - but I believe  
the list covers many, if not most, of the grave errors that we  
encounter in testing.
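
To illustrate the contrast item above (SC 1.4.3): that check at least  
is mechanical, because WCAG 2.0 defines relative luminance and  
contrast ratio exactly. A minimal sketch in Python, assuming colours  
are given as 8-bit sRGB tuples:

# WCAG 2.0 contrast check (SC 1.4.3) - a sketch, not a full tool.
def relative_luminance(rgb):
    """Relative luminance as defined by WCAG 2.0 (sRGB linearisation)."""
    def channel(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """Contrast ratio (L1 + 0.05) / (L2 + 0.05), lighter colour as L1."""
    l1, l2 = sorted((relative_luminance(fg), relative_luminance(bg)),
                    reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

def passes_aa(fg, bg, large_text=False):
    """AA thresholds: 4.5:1 for normal text, 3:1 for large text."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)

# Mid-grey (#777777) on white comes out at about 4.48:1 and so
# narrowly fails AA for normal-size text.
print(passes_aa((119, 119, 119), (255, 255, 255)))  # False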

If there were statistics on "show stoppers" (things that make sites  
inaccessible or impede access severely, and that can be tested  
relatively quickly and without going into too much detail), such an  
approach would have a better basis, of course...

I think as long as the method is transparent and documented, and its  
limitations are clearly stated, the results can still be valuable.

Just my 2 cents,
Detlev


> I need to come up with a scoring method
> (preferably a percentage) due to the need to compare a website  
> with others of its own classification (e.g. federal government,  
> corporate, etc), and to compare the different classifications.  I am  
> thinking of a method where the website gets a percentage score for  
> each of the POUR principles, and then an overall score.  What I'm  
> struggling with is which scoring method to use and how to put  
> different weights on different aspects and at different levels.   
> I'll be assessing to WCAG 2.0 AA (as that's the Australian  
> standard).  All input and suggestions are gratefully accepted and  
> may also be useful to our discussions here, as it's a real-life  
> situation for me.  It also relates to many of the questions raised in  
> this thread by Shadi.  Looking forward to some interesting discussion.
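
One way such per-principle percentages and a weighted overall score  
could be computed is sketched below; note that the handful of success  
criteria, the weights, and the pass/fail results are all invented  
placeholders for illustration, not recommendations:

# Sketch: percentage per POUR principle plus a weighted overall score.
# Each tuple is (sc, principle, weight, passed) from a hypothetical audit.
results = [
    ("1.1.1", "Perceivable",    3, False),
    ("1.4.3", "Perceivable",    2, True),
    ("2.1.1", "Operable",       3, False),
    ("2.4.7", "Operable",       1, True),
    ("3.1.1", "Understandable", 1, True),
    ("4.1.2", "Robust",         2, True),
]

def principle_scores(results):
    """Weighted pass percentage for each POUR principle."""
    totals, passed = {}, {}
    for sc, principle, weight, ok in results:
        totals[principle] = totals.get(principle, 0) + weight
        if ok:
            passed[principle] = passed.get(principle, 0) + weight
    return {p: 100.0 * passed.get(p, 0) / t for p, t in totals.items()}

def overall_score(results):
    """Single weighted percentage across all checks."""
    total = sum(w for _, _, w, _ in results)
    return 100.0 * sum(w for _, _, w, ok in results if ok) / total

print(principle_scores(results))
# {'Perceivable': 40.0, 'Operable': 25.0, 'Understandable': 100.0,
#  'Robust': 100.0}
print(overall_score(results))  # 50.0

Choosing the weights is of course the hard part - which is exactly  
the open question in this thread.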
>
>
> Regards
>
> Vivienne L. Conway
> ________________________________________
> From: public-wai-evaltf-request@w3.org  
> [public-wai-evaltf-request@w3.org] On Behalf Of Shadi Abou-Zahra  
> [shadi@w3.org]
> Sent: Monday, 22 August 2011 7:34 PM
> To: Eval TF
> Subject: some initial questions from the previous thread
>
> Dear Eval TF,
>
> From the recent thread on the construction of WCAG 2.0 Techniques, here
> are some questions to think about:
>
> * Is the "evaluation methodology" expected to be carried out by one
> person or by a group of more than one person?
>
> * What is the expected level of expertise (in accessibility, in web
> technologies etc) of persons carrying out an evaluation?
>
> * Is the involvement of people with disabilities a necessary part of
> carrying out an evaluation, or an improvement to its quality?
>
> * Are the individual test results binary (i.e. pass/fail) or a score
> (discrete value, ratio, etc.)?
>
> * How are these test results aggregated into an overall score (plain
> count, weighted count, heuristics, etc)?
>
> * Is it useful to have a "confidence score" for the tests (for example
> depending on the degree of subjectivity or "difficulty")?
>
> * Is it useful to have a "confidence score" for the aggregated result
> (depending on how the evaluation is carried out)?
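
For what such a "confidence score" might mean in practice, here is  
one conceivable sketch (purely an illustrative assumption, not  
something proposed in this thread): each test result carries a  
confidence in [0, 1] reflecting how objective the check is, passes  
are weighted by confidence, and a mean confidence is reported  
alongside the score.

# Sketch: pair an aggregate score with an overall confidence value.
def aggregate_with_confidence(tests):
    """tests: list of (passed: bool, confidence: float in [0, 1])."""
    if not tests:
        return 0.0, 0.0
    weight = sum(c for _, c in tests)
    # Confidence-weighted pass rate: objective checks dominate.
    score = 100.0 * sum(c for ok, c in tests if ok) / weight if weight else 0.0
    # Overall confidence: mean of the per-test confidences.
    return score, weight / len(tests)

# Two objective checks passed, one subjective check failed.
print(aggregate_with_confidence([(True, 0.9), (True, 0.9), (False, 0.4)]))
# -> (81.81..., 0.733...)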
>
>
> Feel free to chime in if you have particular thoughts on any of these.
>
> Best,
>    Shadi
>
> --
> Shadi Abou-Zahra - http://www.w3.org/People/shadi/
> Activity Lead, W3C/WAI International Program Office
> Evaluation and Repair Tools Working Group (ERT WG)
> Research and Development Working Group (RDWG)
>
Received on Tuesday, 23 August 2011 15:04:29 GMT