- From: Kerstin Probiesch <k.probiesch@gmail.com>
- Date: Sat, 1 Mar 2014 11:29:29 +0100
- To: <public-wai-evaltf@w3.org>
Dear Eval TF members,

First, I want to apologize for the one-day delay. I have been very busy in recent weeks. I hope the Task Force will nevertheless be so kind as to consider my comments.

My comments are guided by the fact that WCAG-EM is more than "just" a paper for further debate and discussion about accessibility metrics. From the W3C perspective, WCAG-EM is informative and a Working Group Note. For European countries WCAG-EM is more than that. WCAG-EM is part of WAI-Act, a European Commission project, and WCAG-EM is mentioned in the document "Accessibility requirements for public procurement of ICT products and services in Europe" on the following pages: 90, 92, 94:

"Where it is not manageable to check every Web page that is provided to the user, then an appropriate methodology can be used to assess the overall conformance of the web content. A methodology for evaluating the conformance of websites to WCAG 2.0 is under development by W3C and is available at: http://www.w3.org/TR/WCAG-EM/"

Even though the text calls WCAG-EM "a methodology" and not "the methodology", it is likely that under European conditions WCAG-EM will be more than just informative, especially in the context of public procurement.

In general I like the sampling section. Thanks. Thanks also for all the work the Task Force has done so far. I believe that a methodology like WCAG-EM has the potential to bring more harmonization to the evaluation of websites.

Now my comments on the W3C Working Draft of 30 January 2014.

## Website Accessibility Conformance Evaluation Methodology (WCAG-EM) 1.0 - W3C Working Draft 30 January 2014

I appreciate that it is called 1.0 because it indicates that the work is not finished yet. At the same time, the "1.0" gives the methodology a formal character. Among other Working Group Notes, cases with "1.0" are rare. The "1.0" could therefore easily be misunderstood as indicating a Recommendation.

Suggestion: I would prefer something like "Alpha Version".
## Purposes for this Methodology

This version: "This methodology is designed for anyone who wants to follow a common procedure for evaluating the conformance of websites to WCAG 2.0."

Comment: "common procedure" is not defined in the document, so it is a bit unclear to me what the term "common" means in this context. Because WCAG-EM is a new methodology and because there has not been a test run so far, the term is misleading.

Suggestion: Delete "common" and write: "This methodology is designed for anyone who is evaluating websites."

## Review Teams (Optional)

This version: "However, using the combined expertise of review teams provides broader coverage of the required skills and helps identify accessibility barriers more effectively. While not required for using this methodology, the use of review teams is recommended when performing an evaluation of a website".

At first sight it sounds logical that a team is better than an individual tester. Let's have a second look: we don't have any studies that are suitable to either support or refute the thesis that review teams _always_ (which the sentence implies) work more effectively. Whether a review team finds more, and is therefore more effective, depends on various factors. I expect that a review team in which each member has long experience in evaluating accessibility may find more. And I believe that a single tester with a lot of experience may find more than a review team with less experience. A review team also does not act independently of external factors like budget, time and so on. Because we don't have any studies on this issue, it is a thesis, but the wording suggests to me that it is a proven fact. The referenced document explains this issue and gives additional information, but it also provides no proven data and is therefore, sorry, a bit self-referential.

Suggestion: Make clear that it is a thesis.
## Involving Users (Optional)

This version: "Involving people with disabilities including people with aging-related impairments helps identify additional accessibility barriers that are not easily discovered by expert evaluation alone."

When we read the sections "Review Teams" and "Involving Users" together, one can get the impression that "users" are something strictly different from "experts". If two evaluators are evaluating a website and one of them is disabled and the other is not, is that a review team or an individual evaluator who has involved a disabled user?

Suggestion: Give more explanation and definitions of expert evaluation, involving users and review teams.

## Step 1b

This version: "Note: It is often useful to evaluate beyond the conformance target".

First suggestion: Change "often" to "always". Especially when only Level A is achieved by the website owner, it does not take evaluators much time to check SCs like 1.4.3 (Contrast Minimum), 1.4.5 (Images of Text), 2.4.5 (Multiple Ways) and 2.4.7 (Focus Visible) and to comment on them in the report. Just an example: when an evaluator checks SCs like "Keyboard" and "No Keyboard Trap", the evaluator will see whether the focus is visible or not. Cases are conceivable where Level AA SCs are met, but one or two Level A SCs are not met in full. Correcting just those few Level A SCs could then lead to AA conformance even when only A was achieved. This, I believe, is the intent of Note 1 under "Conformance Level": "Note 1: Although conformance can only be achieved at the stated levels, authors are encouraged to report (in their claim) any progress toward meeting success criteria from all levels beyond the achieved level of conformance."

Second suggestion: Of course it would take much more time (and budget) if only A is achieved by the website owner and an evaluator also checks AAA SCs. My suggestion is therefore that the SCs of the next highest level are always useful to evaluate.
This holds especially when only A is achieved. (I hope this makes sense.)

## Step 1d

This version: "W3C/WAI provides a set of publicly documented (non-normative) Techniques for WCAG 2.0 that help evaluate conformance to WCAG 2.0 Success Criteria."

I think there is still a disparity between WCAG-EM and the reference (Understanding Techniques), because the Understanding document says: "The tests are only for a technique, they are not tests for conformance to WCAG success criteria."

After the sentence cited above, WCAG-EM says: "Some evaluators might use other methods". This sentence implies that "most" evaluators are using the WCAG Techniques. Regardless of whether this is a fact or not, it could be understood to mean that if "just some evaluators" might use other methods, it is better to use the Techniques, because "most" are using them. Despite all the efforts made on this section in particular, compared with older versions of WCAG-EM, I feel it is still a bit misleading. I want to give an example: when an evaluator is checking a PDF file, relying on the PDF Techniques alone is not sufficient and very limited. I am also missing a reference to "What would be the negative consequences of allowing only W3C's published techniques to be used for conformance to WCAG 2.0?" (http://www.w3.org/WAI/WCAG20/wcag2faq.html#techsnot).

Suggestion: Add a link to the document "What would be the negative consequences...." and add a negative example of relying only on the Techniques document while evaluating web content. Delete "Some evaluators might use other methods" and write "You are also free to use other methods".

## Step 1e

Typo: evalaute -> evaluate

## Step 3

This version: "In cases where it is feasible to evaluate all web pages, this sampling procedure can be skipped and the selected sample is considered to be the entire website in the remaining steps of the conformance evaluation procedure."

"feasible" is a critical term.
Whether something is feasible or not depends on several variables (time, money and the number of single pages are just some of the aspects that decide whether something is "feasible" or not). Therefore "feasibility" is also not one of the major quality criteria for tests in general.

Suggestion: I believe that this paragraph and especially the term "feasible" need further discussion. I am aware that something like "feasible" is needed - especially when it comes to very large websites. At the same time I have a problem with this term because it is a specific term in the area of test theory. Probably it would be enough to give guidance for operationalisation. I am not comfortable that I don't have a better suggestion at the moment. Please also add "(which is highly recommended)" after "evaluate all web pages".

## Step 5d

This version: "While aggregated scores provide a numerical indicator to help communicate progress over time, there is currently no single widely recognized metric that reflects the required reliability, accuracy, and practicality."

Because there is no scoring system that fulfills quality criteria like reliability (I am also missing objectivity and validity), I think that 5d should not even be an optional step. In general I think that scores are misleading, and I believe this is also true for the aggregated score suggested in WCAG-EM. A numerical value of "X points" automatically raises the question: what is the meaning of "X points"? If everything is met except captions for videos, the score is very high. One might think that a good score also means good accessibility, which would not be true for hard of hearing and deaf people. Especially in the European context mentioned above, a scoring system as part of WCAG-EM should also be underpinned by proven data.
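To make the captions example concrete, here is a minimal sketch. It assumes a hypothetical aggregated score defined as the percentage of checked Success Criteria that pass. This formula, the function names and the sample results are illustrative assumptions only, not the scoring approach defined in WCAG-EM.

```python
# Hypothetical evaluation results for ten Level A Success Criteria.
# The pass/fail values are invented for illustration.
results = {
    "1.1.1 Non-text Content": True,
    "1.2.2 Captions (Prerecorded)": False,  # the only failure
    "1.3.1 Info and Relationships": True,
    "1.4.1 Use of Color": True,
    "2.1.1 Keyboard": True,
    "2.1.2 No Keyboard Trap": True,
    "2.4.1 Bypass Blocks": True,
    "2.4.2 Page Titled": True,
    "3.1.1 Language of Page": True,
    "4.1.2 Name, Role, Value": True,
}


def aggregated_score(results: dict) -> float:
    """Percentage of checked criteria that pass (assumed formula)."""
    return 100 * sum(results.values()) / len(results)


def conforms(results: dict) -> bool:
    """Pass/fail view: conformance requires every checked criterion to pass."""
    return all(results.values())


print(f"Aggregated score: {aggregated_score(results):.0f} of 100")
print(f"Conforms: {conforms(results)}")
```

Under these assumptions the score is 90 of 100, while the pass/fail view reports that the site does not conform. The number alone does not reveal that the single failure is the missing captions.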
Suggestion: Drop the whole "Scoring Section" except the sentence: "While aggregated scores provide a numerical indicator to help communicate progress over time, there is currently no single widely recognized metric that reflects the required reliability, accuracy, and practicality."

## Missing Issues

WCAG-EM says nothing about the rating system: pass/fail or "anything" else? This, I think, is very critical because the results of different rating systems will not be comparable.

Suggestion: Provide pass/fail as the rating system for evaluation.

Kind regards and again thanks a lot for all the work and time.

Kerstin Probiesch

--------------------------------------------------------------------
Kerstin Probiesch - Freelance Consultant
Accessibility, Social Media, Project Management
Kantstraße 10/19 | 35039 Marburg
Tel.: 06421 167002
E-Mail: mail@barrierefreie-informationskultur.de
Web: http://www.barrierefreie-informationskultur.de
XING: http://www.xing.com/profile/Kerstin_Probiesch
Received on Saturday, 1 March 2014 10:29:53 UTC