RE: Comment on Requirements R03, R04, proposal to include Validity

Hello Detlev, all,

Please find my comments marked with EV:
Eric Velleman

R03: Unambiguous interpretation

The current draft text is:

"The Methodology itself is unambiguous to people who want to use it. It should make it clear to users what they can do if they choose a certain evaluation approach in the document."

DF: The second sentence sounds a bit lame. 'Evaluation approach' sounds like a synonym for the Methodology we are developing, so "a certain evaluation approach" sounds like one of many possible approaches. This makes the Methodology sound like a broad-church kind of thing: you could follow this approach, or that, or something else. That is probably not what is intended.
The methodology should be prescriptive even if it leaves the details (what tools, what sequence of checks) to evaluation tool implementations. 
Another bit of nitpicking: "In the document" also sounds somewhat loose - what document? Is this a synonym for the entire methodology or will the methodology contain several documents? Or (unlikely), is it the document to be evaluated?

EV: Agree. Propose to take the unclear part away; in my opinion the first part is probably clear enough. Leaving: "The Methodology itself is unambiguous to people who want to use it." This also covers your comment about the word "document".

DF: And to hark back to the term "unambiguous" (sorry, I can't resist insisting): I see a fundamental conflict between keeping the methodology, according to R02, "Tool and browser independent" and free of concrete interpretation advice, and the goal of achieving "unambiguous interpretation". I agree that resources that help achieve a valid interpretation can reside outside the methodology itself (e.g., a set of cases with appropriate judgements/ratings per SC). But whether an interpretation will be unambiguous will depend on the quality of this external set. In my view, it cannot be safeguarded within a methodology that just refers to outside resources not in its scope and outside its control. So I still maintain that R03 is ultimately more than our methodology can hope to deliver. Therefore, I think it should not appear as a requirement in this form.

EV: I would like to imagine that unambiguous interpretation is about the Methodology and not about applying the tests from WCAG 2.0. Those tests are indeed a dynamic external resource, so it could happen that evaluators come to totally different results when applying the tests from WCAG 2.0 to a website, meaning that a test is not unambiguous. This is not nice, but in my opinion it is not in the scope of the Methodology. As a Task Force, we could take several approaches to that:
1. We contact the WCAG WG and ask them to have a close look at a test that seems to be open to multiple interpretations or might even be missing.
2. The Methodology can add methods to cover this, like cross-checking etc. (a sketch of this follows below).
3. Other options.
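
To make "cross-checking" a bit more concrete, here is a minimal, purely illustrative sketch (not anything from the Methodology or WCAG 2.0; the success criterion numbers and verdict values are just invented examples): compare the per-success-criterion verdicts of two independent evaluators and flag every criterion on which they disagree as a candidate for an ambiguous, or missing, test that could be reported back to the WCAG WG.

```python
# Illustrative sketch only: cross-check the per-success-criterion verdicts of
# two independent evaluators. The SC identifiers and verdicts are invented.

evaluator_a = {"1.1.1": "pass", "1.4.3": "fail", "2.4.4": "pass"}
evaluator_b = {"1.1.1": "pass", "1.4.3": "pass", "2.4.4": "pass"}

def flag_disagreements(a: dict, b: dict) -> list:
    """Return the success criteria on which the two evaluators disagree."""
    return sorted(sc for sc in a.keys() & b.keys() if a[sc] != b[sc])

# Criteria listed here could be collected and raised with the WCAG WG.
print(flag_disagreements(evaluator_a, evaluator_b))  # ['1.4.3']
```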

R04: Replicable

The current draft text is:

"Different Web accessibility evaluators using the same methods on the website(s) should get the same results."

The comment on R03 above applies similarly to R04: Replicable. I agree it should be our *aim to strive for replicability* (like a vanishing point), but I believe it will never be achieved in the literal sense.

We had the term 'Reliable' at some point next to 'Replicable'. 'Reliable' is less deterministic and seems a lot better suited. A test can be reliable within tolerances (see R14). To claim replicability and allow for tolerances at the same time seems disingenuous to me.
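
To make the distinction concrete, a minimal, purely illustrative sketch follows (the 90% threshold and the verdicts are invented, not taken from R14): read strictly, replicability would demand identical results from every evaluator, whereas reliability could accept an agreement rate that stays within a stated tolerance.

```python
# Illustrative sketch only: strict replicability would require 100% agreement,
# while reliability could accept agreement within a stated tolerance.
# The tolerance value and all verdicts below are invented for illustration.

TOLERANCE = 0.90  # hypothetical minimum agreement rate

def agreement_rate(a: dict, b: dict) -> float:
    """Share of commonly assessed success criteria with identical verdicts."""
    common = a.keys() & b.keys()
    return sum(a[sc] == b[sc] for sc in common) / len(common) if common else 0.0

results_a = {"1.1.1": "pass", "1.3.1": "fail", "1.4.3": "fail", "2.4.4": "pass"}
results_b = {"1.1.1": "pass", "1.3.1": "fail", "1.4.3": "pass", "2.4.4": "pass"}

rate = agreement_rate(results_a, results_b)  # 0.75
print(f"agreement: {rate:.0%}, within tolerance: {rate >= TOLERANCE}")
```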

Proposal: Change to

R03: Valid
R04: Reliable

Validity is perhaps the most important requirement, and it is not yet stated. There is "R17: Support validity", but this just expresses that tests need to be documented so they can be cross-checked.

EV: Agree with adding Valid and Reliable. We could use the standard requirements for scientific research: 
1. systematic (also in the reporting), 
2. valid (measures what it claims to measure and measures what real users would need), 
3. reliable (using a sound scale/checklist, so other evaluators achieve the same results; this includes replicability),
4. feasible (this is more of a practical argument).
We could place them together at the start of the list. Now they are in different places.

DF: I think R16 and R17 are actually quite similar and could be folded into one, e.g. "R16: Support independent verification and quality assurance".

EV: In fact they are part of the list of four requirements above.

*Validity* is necessary as an independent requirement somewhere near the top of the list. Without achieving it (to a degree, see tolerances), the methodology would be pointless.

Validity ultimately needs to be grounded in testing with real users of assistive technologies. That does *not* mean that actual AT tests have to be carried out in each application of the methodology, but that the set of resources guiding interpretation needs to be consistent with AT user experience (which also implies that it needs to be updated as technologies become better supported by UA / AT), so there is a dynamic element.

This is what I tried to convey in the last teleconference, but, as I was talking too much as usual, there wasn't much of this point left in the scribe's notes (NOT the fault of the scribe, just too much talk).

EV: Important input and discussion! Thanks. In the Methodology we will have to cover this AT part. This might be a bigger problem than covering differences in interpretation of guidelines, success criteria, and tests.

Hope you have a fruitful discussion today!
Detlev

EV: Thanks


Am 29.09.2011 09:09, schrieb Shadi Abou-Zahra:
> Dear All,
>
> Please find the latest draft of the requirements from Eric:
> - <http://www.w3.org/WAI/ER/conformance/ED-methodology-20110928>
>
> Best,
> Shadi
>
>
> On 28.9.2011 21:39, Velleman, Eric wrote:
>> Dear Eval TF,
>>
>> The next teleconference is scheduled for Thursday 29 September 2011 at:
>> * 14:00 to 15:00 UTC
>> * 16:00 to 17:00 Central European Time
>> * 10:00 to 11:00 North American Eastern Time (ET)
>> * 07:00 to 08:00 North American Pacific Time (PT)
>> * 22:00 to 23:00 Western Australia Time
>>
>> Please check the World Clock Meeting Planner to find out the precise
>> date for your own time zone:
>> -<http://www.timeanddate.com/worldclock/meeting.html>
>>
>> The teleconference information is: (Passcode 3825 - "EVAL")
>> * +1.617.761.6200
>> * SIP / VoIP -http://www.w3.org/2006/tools/wiki/Zakim-SIP
>>
>> We also use IRC to support the meeting: (http://irc.w3.org)
>> * IRC server: irc.w3.org
>> * port: 6665
>> * channel: #eval
>>
>>
>> AGENDA:
>>
>> #1. Welcome
>>
>> #2. Requirements
>> There is a new version available at:
>> <http://www.w3.org/WAI/ER/conformance/ED-methodology-20110924.html>
>> We will discuss this version during this meeting. A lot of changes
>> following our discussion last week and on the list. And some changes
>> proposed by Shadi to make things more clear.
>>
>> #3. Resources related to our Methodology
>> It would be interesting to gather resources about Web Evaluation
>> Methodologies. We made a good start thanks to Tim and . What to do
>> with this information? Shadi proposes to add information to the
>> Benchmarking Web Accessibility Metrics Wiki
>> at:<http://www.w3.org/WAI/RD/wiki/Benchmarking_Web_Accessibility_Metrics>
>>
>> #4. Any other business
>>
>> Regards,
>>
>> Eric
>>
>

Received on Thursday, 29 September 2011 10:52:01 UTC