- From: Detlev Fischer <fischer@dias.de>
- Date: Wed, 24 Aug 2011 16:18:12 +0200
- To: public-wai-evaltf@w3.org
On 24.08.2011 15:34, Boland Jr, Frederick E. wrote:
> Some other possible questions: Does an evaluation methodology
> necessarily involve a user carrying out a predefined task
> involving websites? What exactly are we evaluating against
> (how do any business rules, mission definition/completion
> requirements, etc. influence an evaluation - "context" of
> evaluation)? Do we need any formalisms or ontologies to
> adequately express any evaluation parameters/context information?
>
> Thanks and best wishes
> Tim Boland

I think that for many checks, defining tasks is unnecessary. If you check for headings, keyboard access, alt texts and most other things, you just investigate all fitting instances on the entire page (and, as a complication, in dynamically generated / displayed content as well); I have put a small sketch of such an instance-by-instance check further down in this mail. However, checks of some SC, for example in Guidelines 3.2 Predictable and 3.3 Input Assistance, do need the definition of tasks / processes, and these should be documented to be reproducible.

Regarding your question "what are we evaluating against": this could be a layer separate from the evaluation itself. Governments or businesses may require Level AA or just Level A, or combine either with additional requirements (e.g., usability as in the Dutch Drempelvrij scheme, http://www.drempelvrij.nl/ ), require corrective action within particular time horizons, etc.

> PS - apologies in advance if these questions have already been answered..
>
> -----Original Message-----
> From: public-wai-evaltf-request@w3.org [mailto:public-wai-evaltf-request@w3.org] On Behalf Of Shadi Abou-Zahra
> Sent: Monday, August 22, 2011 7:35 AM
> To: Eval TF
> Subject: some initial questions from the previous thread
>
> Dear Eval TF,
>
> From the recent thread on the construction of WCAG 2.0 Techniques, here
> are some questions to think about:
>
> * Is the "evaluation methodology" expected to be carried out by one
> person or by a group of more than one persons?

If it is carried out by more than one person, the aggregation of the separate results may not be made part of the methodology.

> * What is the expected level of expertise (in accessibility, in web
> technologies etc) of persons carrying out an evaluation?

Hard to say: certainly a working knowledge of HTML/CSS and web design, and a good knowledge of a11y issues and WCAG. Expert scripting knowledge should, I hope, not be required, although it gets harder these days with so much dynamic content being written to pages. Working knowledge of screen readers raises the bar a lot but may be increasingly necessary to test things like the success of WAI-ARIA implementations.

> * Is the involvement of people with disabilities a necessary part of
> carrying out an evaluation versus an improvement of the quality?

While always beneficial in practical terms, involving users of AT in conformance testing creates the problem that many things work differently across the many combinations of UA and AT versions, and also across custom settings of AT. So it will get very hard to manage, and difficult to draw conclusions that are valid for a broad range of implementations in the field. I feel that everything that can be tested with a manageable and free set of browsers and tools should be tested that way. But I realise that more and more things escape such tests and require practical tests with AT.

> * Are the individual test results binary (ie pass/fail) or a score
> (discrete value, ratio, etc)?

If an individual test result refers to an instance tested on a page, I believe that simply doing the sums of all instances per SC, or per individual test within the SC, will often lead to distorted results. Think of 20 teaser images with perfect alt text and one critical linked image (say, in the main navigation) without any. On the aggregated level of the page, I believe a range is necessary. Whether this is percent or discrete steps or whatever seems secondary.

> * How are these test results aggregated into an overall score (plain
> count, weighted count, heuristics, etc)?

I think weighting for criticality is necessary; a small worked example follows below.
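To make that point concrete, here is a toy calculation for the 20-teasers-plus-one-navigation-image example above. The weight of 10 for the navigation image is entirely made up for illustration; an actual methodology would have to define how criticality is assigned:

    # Toy illustration only: 21 image instances on one page.
    # 20 decorative teasers with good alt text, one linked image in the
    # main navigation without any. The weight 10 is an arbitrary,
    # made-up criticality factor for the navigation image.
    instances = [{"passed": True, "weight": 1}] * 20 + \
                [{"passed": False, "weight": 10}]

    naive = sum(i["passed"] for i in instances) / len(instances)
    weighted = (sum(i["weight"] for i in instances if i["passed"])
                / sum(i["weight"] for i in instances))

    print(f"naive pass rate: {naive:.0%}")     # 95% - looks nearly fine
    print(f"weighted score:  {weighted:.0%}")  # 67% - reflects the blocker

The plain count suggests the page is almost conforming, while the weighted score at least signals that something important is broken.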
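And coming back to my earlier point about investigating all fitting instances on a page: below is a minimal sketch of what I mean by an instance-by-instance check, here for images without an alt attribute. This is my own illustration, not a WCAG technique; the file name is made up, and it only looks at static markup, so dynamically written content would escape it:

    # Minimal sketch: list every <img> instance in a static HTML file and
    # flag those without an alt attribute (an empty alt="" counts as present).
    from html.parser import HTMLParser

    class ImgAltCheck(HTMLParser):
        def __init__(self):
            super().__init__()
            self.instances = []  # one entry per <img> encountered

        def handle_starttag(self, tag, attrs):
            if tag == "img":
                attrs = dict(attrs)
                self.instances.append({
                    "src": attrs.get("src", ""),
                    "has_alt": "alt" in attrs,
                })

    checker = ImgAltCheck()
    with open("page.html", encoding="utf-8") as f:  # made-up file name
        checker.feed(f.read())

    failures = [i for i in checker.instances if not i["has_alt"]]
    print(f"{len(checker.instances)} img instances, {len(failures)} without alt")

The point is simply that such checks walk over every instance on the page; they do not need any predefined user task.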
> * Is it useful to have a "confidence score" for the tests (for example
> depending on the degree of subjectivity or "difficulty")?

I agree with Richard Warren's sentiment that a confidence score would get overly complicated. Our current answer to this is the arbitration of two independent results.

> * Is it useful to have a "confidence score" for the aggregated result
> (depending on how the evaluation is carried out)?

At least it would make sense to flag limitations (e.g., a reduced page sample or, in our case, the test being conducted by just one tester).

> Feel free to chime in if you have particular thoughts on any of these.
>
> Best,
> Shadi
> --

---------------------------------------------------------------
Detlev Fischer PhD
DIAS GmbH - Daten, Informationssysteme und Analysen im Sozialen
Geschäftsführung: Thomas Lilienthal, Michael Zapp
Telefon: +49-40-43 18 75-25
Mobile: +49-157 7-170 73 84
Fax: +49-40-43 18 75-19
E-Mail: fischer@dias.de
Anschrift: Schulterblatt 36, D-20357 Hamburg
Amtsgericht Hamburg HRB 58 167
Geschäftsführer: Thomas Lilienthal, Michael Zapp
---------------------------------------------------------------
Received on Wednesday, 24 August 2011 14:18:36 UTC