Re: Comments to Editor's draft 2012-02-09 (an Observer awakes...) from Kerstin Probiesch on 2012-02-23 (public-wai-evaltf@w3.org from February 2012)

From: Kerstin Probiesch <k.probiesch@googlemail.com>
Date: Thu, 23 Feb 2012 09:54:22 +0100
To: "'Michael S Elledge'" <elledge@msu.edu>, 'Loïc Martínez Normand' <loic@fi.upm.es>
Cc: <public-wai-evaltf@w3.org>
Message-ID: <4f45fea2.4c200e0a.1030.528a@mx.google.com>
Hi Loic, Michael, all,

Because I am short of time today, I will pick up just one of the above topics:

>> [1.4. Equivalent results] This definition lacks rigour. What is a high
>> correlation degree?
>> I am not good at statistics, but some objective threshold can surely be
>> defined...

> Perhaps a less technical term should be used than "high correlation."
> I think the intent here is to recognize that there may be more than one way
> to get an answer, but the answer has to be consistent and repeatable.

We discussed this in the very beginning of our work and I'm happy that Loic
brought it up again.

Indeed, the term and the description are a bit confusing.

"A correlation is a single number that describes the degree of relationship
between two variables."
(http://www.socialresearchmethods.net/kb/statcorr.php) Of course, what we
mean is a kind of correlation, but I still think that we should
use the term 'reliability': "reliability is the consistency of a set of
measurements or of a measuring instrument, often used to describe a test.
Reliability is inversely related to random error." This would make things
clearer.
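
As an illustration of what an objective reliability figure could look like, here is a minimal sketch that computes Cohen's kappa, a common chance-corrected agreement measure, for two evaluators who judged the same items. The choice of kappa, the function name, and the sample verdicts are assumptions made for this example; none of them come from the draft or from this thread, and no threshold is implied.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same non-empty item list")
    n = len(rater_a)

    # Share of items on which the two raters gave the same verdict.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's own verdict frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical verdict throughout; kappa is
        # undefined here, so 1.0 is returned as a convention (assumption).
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical verdicts of two evaluators on the same four success criteria.
evaluator_1 = ["pass", "pass", "fail", "fail"]
evaluator_2 = ["pass", "pass", "fail", "pass"]
kappa = cohens_kappa(evaluator_1, evaluator_2)
print(f"kappa = {kappa:.2f}")
```

A value near 1 indicates consistent results between evaluators and a value near 0 indicates agreement no better than chance. As with any reliability measure, a high value says nothing about whether the evaluation is valid.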

And: reliability doesn't imply validity. A measuring instrument can be
consistent but at the same time not valid. Therefore we also need the quality
criteria of objectivity and validity. As mentioned earlier, we need some
instructions and appropriate measures for 'guaranteeing' objectivity. The
third quality criterion, 'validity', is, I think, important for our discussion
about testing techniques, an approach which I think is not valid against WCAG2
because of the character of the techniques.

Best

--Kerstin 


From: Michael S Elledge [mailto:elledge@msu.edu]
Sent: Wednesday, 22 February 2012 22:56
To: Loïc Martínez Normand
Cc: public-wai-evaltf@w3.org
Subject: Re: Comments to Editor's draft 2012-02-09 (an Observer awakes...)

Hi Loic--

Thanks for all the thoughtful input. I had a couple of questions--please see
below. I'd encourage the rest of the group to look over Loic's comments on
sampling and the evaluation process--Loic's comments show how helpful it is
to have a fresh set of eyes look things over!

Mike Elledge

On 2/20/2012 6:16 PM, Loïc Martínez Normand wrote: 
Dear all, 

Let me first introduce myself. My name is Loïc Martínez and I teach at the
Technical University of Madrid (Spain). I've been researching in the field
of accessibility since 1995 and I am president of the Sidar Foundation (that
is represented in the EVAL-TF by Emmanuelle Gutierrez). I also actively
participate in standardization activities in the field of ICT accessibility
in Spain (AENOR), Europe (CEN, ETSI) and internationally (ISO and ISO/IEC).

I was invited by Shadi to actively participate in EVAL-TF but I was unable
to commit the required amount of hours, so I have been a (very) quiet
observer since the beginning of your work.

Last week I was finally able to spend some time on EVAL-TF issues when
travelling to the WAI-ACT open meeting and I have reviewed the latest editor
draft of the Website Accessibility Evaluation Methodology for WCAG 2.0.

I have decided to split my comments into three emails to facilitate
threading in the mailing list. In this first email I will post some general
and editorial comments. In two following emails I will post my views on
sampling (chapter 4) and the evaluation process (chapter 5). 

I sincerely hope that my comments will be useful in your future work.

General comments
• [Abstract] Different contexts should also include summative (i.e. at the
end of the process, such as conformity assessment) and formative (i.e.
during development, like usability testing) evaluations. For conformity
assessment, only two results are possible (pass, no pass). In formative
evaluation other values are possible, such as accessibility metrics. Thus,
WCAG-EM should cover both types of results.
I think we've come to the conclusion that our emphasis is on a methodology
that will lead to conformity assessment, even if we're evaluating a sub-part
of a website.

• [1.4. Equivalent results] This definition lacks rigour. What is a high
correlation degree? I am not good at statistics, but some objective
threshold can surely be defined...
Perhaps a less technical term should be used than "high correlation." I
think the intent here is to recognize that there may be more than one way to
get an answer, but the answer has to be consistent and repeatable.

• [2.1] If the methodology proposes to use review teams, then it should
provide guidance on how to perform evaluation by teams: how to split the
evaluation, how to combine the results of several evaluators, how to grade
evaluators' performance...
You make a good point; are your concerns answered by the "Using Combined
Expertise" article?

• [2.2] When persons with disability evaluate web sites, they are not able
to evaluate all success criteria. For instance, a blind person cannot
evaluate colour contrast. Thus, the methodology should provide guidance
about which portion of WCAG one person can evaluate depending on
disability...
Great suggestion. We should also mention that not all evaluation tools are
fully accessible, which will further affect the participation of persons
with disabilities. 

• [3. Paragraph 1] One critical aspect of the scope of the evaluation is the
concept of "accessibility supported". The decision of what is accessibility
supported should be part of the scope of the evaluation and affects the
sampling and the evaluation process.
Can you explain this some more? I'm not sure I understand what you mean by
"accessibility supported."

• [3. Paragraph 1] This methodology should be much more specific to be
useful. Concerning scope, the methodology should mandate particular forms of
defining the scope of the evaluation. It is the only way to facilitate
interchange of results.
I believe we are going to circle back to this and include examples...

• [3.1] Currently this paragraph is confusing. I think that it should
explain what a complete process is and that in many cases some steps of a
complete process can be outside the website owner's control (e.g. payment
subsystems). Because of that, in some cases the evaluators could choose not to
consider full processes.
I'm not sure I agree. I think processes should be evaluated in their
entirety, even if a portion of the process is outside the control of the
owner. Since any claims of conformance must take into account the entire
process, shouldn't the evaluation?

Editorial comments
• [1.4 Web page] Some consistency is needed. Is it "web page" or "webpage"?
Is it "web site" or "website"? Please unify.
Yes.

• [3. Paragraph 2. Word "Office"] Why uppercase? Are you thinking about a
particular office application or suite?
This should probably be "Microsoft Office."

• [4. Paragraph 1. Last sentence] Editorial comment. This last sentence is
confusing and needs rewriting.
The term "resource" is confusing and something we've talked about replacing,
which I think will help.

Best regards,
Loïc
Received on Thursday, 23 February 2012 08:54:26 UTC