Re: Requirements draft from Denis Boudreau on 2011-09-12 (public-wai-evaltf@w3.org from September 2011)

From: Denis Boudreau <dboudreau@accessibiliteweb.com>
Date: Mon, 12 Sep 2011 15:49:10 -0400
To: Eval TF <public-wai-evaltf@w3.org>
Message-id: <D7FE864A-A8AB-46D4-A95F-53C2F9E505FC@accessibiliteweb.com>

Good morning everyone,

Here's my take on the whole thing.


On 2011-09-12, at 4:42 AM, Kerstin Probiesch wrote:

>> * Requirements:
>>> R01: Technical conformance to existing Web Accessibility Initiative (WAI) Recommendations and Techniques documents.
> 
>> Comment (RW) :  I do not think we need the word technical. We should stick with WCAG as agreed when we discussed *A01.  The recommendations and techniques are not relevant here as our priority is the Guidelines. It is possible for someone to comply with a particular guideline without using any of the recommended techniques. What we are after is methodology.  I therefore suggest a suitable alternative as follows:
>> *R01 Define methods for evaluating compliance with the accessibility guidelines (WCAG)
> 
> Comment (KP): As I understood R01 it stresses the formal level. If the formulation would be "R01: Technical conformance to existing Web Accessibility Initiative (WAI) Recommendations and Techniques" I would agree. Because we have in the WCAG sub-documents like "understanding", "glossary" and so on. For that "documents" for me is ok. Because of other WAI documents like e.g. ATAG I would agree with
> As long as the formal level of the documents itself and not the techniques which are in the documents is meant.

Comment (DB): I believe we need to stay on a macro level, as we're talking general methodology here. We'll have plenty of time to delve right in eventually. Right now, our main focus should be compliance with WCAG as a whole, not with each and every technique that may or may not exist at the time of this writing. Just to build on Richard's proposal, I would therefore suggest:

*R01 Defining methods for evaluating WCAG 2.0 compliance

WCAG will most likely already have been defined in this document, so there's no need to repeat it each and every time.



>>> R02: Tool and browser independent
> 
>> Comment (RW) : The principle is good but sometimes it may be necessary to use a particular tool such as a text-only browser. So I would prefer :
>> *R02: Where possible the evaluation process should be tool and browser independent.
> 
> Comment (KP): I partly agree with "possible". When we use "possible" we should then describe/define what "possible" exactly means.

Comment (DB): Right. That works for me too. But I'd rather keep them short. So I'd vote for:

*R02: Tool and browser independent (where possible)



>>> R03: Unique interpretation
> 
>> Comment (RW) : I think this means that it should be unambiguous, that means it  is not open to different interpretations. I am pretty sure that the W3C has a standard clause it uses to cover this point when building standards etc. Hopefully Shadi can find it <Grin> . This also implies use of standard terminology which we should be looking at as soon as possible so that terms like “atomic testing” do not creep into our procedures without clear /agreed definitions.
> 
> Comment (KP): Using standard terminology is an important point for me as well. And I suggest that we should also consider the standard terminology used in testing theory. The advantage would be that we are using established terms, which will help to avoid misunderstandings.

Comment (DB): Using standard terminology is of utmost importance to me as well. However, I personally do not believe in a single interpretation for any success criterion. And I certainly do not believe in the possibility of bringing everyone under the same interpretation. What we should focus on is achieved results (compliance), not how people actually got there (technique used). There are multiple ways to interpret the guidelines and our methodology should reflect this. Instead of striving for "unique interpretation" I would much rather go for "agreed interpretation", even if this means actually building a document where we would record what those divergent interpretations mean. So, I suggest going with:

*R03: Agreed interpretations



>>> R04: Replicability: different Web accessibility evaluators who perform the same tests on the same site should get the same results within a given tolerance.
> 
>> Comment (RW) : The first part is good, but I am not happy with introducing “tolerance” at this stage. I think we should be clear that we are after consistent, replicable tests. I think we should add a separate requirement later for such things as “partial compliance” and “tolerance”. See R14 below.
>> 
>> *R04: Replicability: different Web accessibility evaluators who perform the same tests on the same site should get the same results.
> 
> Comment (KP): I strongly agree with Richard, except for "Replicability", and would suggest:
> 
> R04: Reliability: different Web accessibility evaluators who perform the same tests on the same site should get the same results.

Comment (DB): As long as we take into consideration that there can be different ways/tools to run those tests, then yes, reliability and replicability are important. Getting different results usually means evaluators do not interpret the rules the same way. This does not always mean that one is wrong and the other is right. So again, to keep those short, I would simply go with:

*R04: Reliable and replicable

The explanation that follows could then reflect the idea that different evaluators performing the same tests on the same site should get the same results.



>>> R05: Translatable
> 
>> Comment (RW) : As in translatable into different languages – Yes - agree
> 
> Comment (KP): I agree and I see especially translatable in the context of using standard terminology which would be helpful for translating.  

Comment (DB): +1.

*R05: Translatable



>>> R06: The methodology points to the existing tests in the techniques documents and does not reproduce them.
> 
> Comment (KP): I agree.
> 
>> Comment (RW) : yes – but I would like it a bit clearer that it is WCAG techniques.  I would also like the option to introduce a new technique if it becomes available. So I suggest 
>> *R06 Where possible the methodology should point to existing tests and techniques in the WCAG documentation.

Comments (DB): I agree with the general idea here as well, but it needs to be shorter. We can always reflect the intention in the description that follows.

*R06 Pointing to existing tests and techniques (where possible).



>>> R07: Support for both manual and automated evaluation.
> 
>> Comment (RW) :  Not all Guidelines can be tested automatically and it is not viable to test some others manually. This needs to be clearer that the most appropriate methods will be used, whether manual or automatic. Where both options are available they must deliver the same result. 
>> 
>> *R07:  Use the most appropriate manual or automatic evaluation. Where either could be used then both must deliver the same result.
> 
> Comment (KP): I see "support" as just support and the important point "deliver the same result" in the context of R04 "Replicability" or as I suggest "Reliability".

Comments (DB): I agree with the general idea here as well, but again, it needs to be shorter. We can always reflect the importance of using the most appropriate approach in the document itself.

*R07: Reliable evaluation support (manual or automated).



>>> R08: Users include (see target audience)
> 
>> Comment (RW) : Whilst user testing is essential  for confirming accessibility it is not needed/essential for checking compliance with WCAG. If we feel that user testing is needed then we must specify what users, what skill level, what tasks etc..so that evaluators all use the same type of user and get the same type of result. I would prefer not to include users here as a requirement.
> 
> Comment (KP): A tricky R. - especially in the context of the above-mentioned "It is possible for someone to comply with a particular guideline without using any of the recommended techniques." The question would be: How can a tester find out if an SC is met when the recommended techniques are not used? Wouldn't that mean that a tester needs deep knowledge in using for example Screenreaders as well as Magnifiers and ... We also discussed this in another mail thread. I prefer to include users here, but we have to describe which users, in line with Richard's consideration in the above paragraph.

Comments (DB): Testing with "real users" should be encouraged, but in no way should it be made mandatory. The only requirement should be to run tests with a skilled screen reader user, following a specific evaluation methodology. All the better if this evaluator happens to be a real user. So:

*R08: Users include (see target audience)



>>> R09: Support for different contexts (i.e. self-assessment, third-party evaluation of small or larger websites).
> 
>> Comment (RW) :  Agreed.
> Comment (KP): Agree

Comments (DB): +1.



>>> R10: Includes recommendations for sampling web pages and for expressing the scope of a conformance claim
> 
>> Comment (RW) : I agree. This is probably going to be the most difficult issue, but it is essential if our methodology is going to be useable in the real world as illustrated by discussions already taking place. Should it include tolerance metrics (R14)?
> 
> Comment (KP): I also think it’s the most difficult issue. Because of the ongoing discussion about different approaches I want to abstain for the moment.

Comments (DB): While I seem to be a little more optimistic than you two, it is an important issue. I hope that we can draw from everybody's experience and come up with something new and improved, compared to our respective approaches.

*R10: Web page sampling recommendations.

We can always reflect the importance of expressing the scope of a conformance claim in the document itself.



>>> R11: Describes critical path analyses,
>> Comment (RW) :  I assume this is the CPA of the evaluation process (ie define website, test this, test that, write report etc.). In which case agreed

> Comment (KP): I'm not sure what is meant by this R. Because of that no vote from me now.

Comments (DB): Agreed as well. This is something we never officially did ourselves at AccessibilitéWeb, but it does look like a great idea.

*R11: Describes critical path analyses.



>>> R12: Covers computer assisted content selection and manual content selection
> 
>> Comment (RW) : I do not know what this means – can Eric explain ?
> Comment (KP): I also don't have an exact idea of what this R. could mean.

Comments (DB): Isn't this directly related to page sampling determination and critical path analyses? I get the manual content selection part, but I can't understand how this could be computer generated in any way. Right now, I don't see why this couldn't just be a part of R11.



>>> R13: Includes integration and aggregation of the evaluation results and related conformance statements.
> 
>> Comment (RW) : I think this means “write a nice report” in which case I agree.
> Comment (KP): I agree.

Comments (DB): Lol, reports are crucial indeed and every report should be technically biased, with a good executive summary for the faint-hearted. But this R is definitely too complicated as is.

*R13: Evaluation reports and related conformance statements.



>>> R14: Includes tolerance metrics.
> 
>> Comment (RW) : Agreed – but maybe combine with R10
> Comment (KP): The tolerance metrics will depend on the testing procedure itself. Because of that for me I'm happy with that and suggest not to combine with any other R.

Comments (DB): I can see why it could be integrated with R10, but don't really mind if it's not. I think the wording is appropriate.

*R14: Includes tolerance metrics.




>>> R15: The Methodology includes recommendations for harmonized (machine-readable) reporting.
> 
>> Comment (RW) : I am not sure that methodologies recommend things. Do you mean
>> 
>> *R15: Reports must be machine readable.
> 
> Comment (KP): As I understood R15 this means e.g. structures in documents but also recommendations for the content structure. If so, I agree with R15.

Comments (DB): Shouldn't this be a part of R13 as well?

*R15: Recommendations for harmonized (machine-readable) reporting

Best regards,

/Denis
Received on Monday, 12 September 2011 19:49:46 UTC