- From: Shadi Abou-Zahra <shadi@w3.org>
- Date: Wed, 19 Jun 2013 17:40:31 +0200
- To: Peter Korn <peter.korn@oracle.com>
- CC: Eval TF <public-wai-evaltf@w3.org>, Detlev Fischer <detlev.fischer@testkreis.de>
Hi Peter, all,

In short, here is what I understood in conclusion of this exchange:
- Possibly there is a 4th use case (regression evaluation)
- Further tweaks to "confirmation evaluation" might be necessary
- Probably further lighter edits and tweaks are needed throughout too

More detailed comments inline below (all for discussion, of course):

On 14.6.2013 19:05, Peter Korn wrote:
> Hi Shadi,
>
> Thank you for moving this discussion forward. More comments in-line below.
>
>> ...
>> We discussed these three use cases. Here is an attempted write-up for these use cases, for discussion:
>>
>> - Initial Evaluation: Typically carried out when organizations first start out with accessibility and want to learn how well their website conforms to WCAG 2.0 in order to improve it. It is expected that the website will likely not conform, and the main purpose of the evaluation is to identify the types of barriers on the website, and possibly to highlight some of the possible repairs, so that these can be addressed in future development.
>
> For me, I would rather see this as a "Development Evaluation": something undertaken when the site owner (or web application owner) expects things aren't fully accessible yet and is interested in understanding the extent of the work that will need to be done. Often (or at least hopefully!) such evaluations will be undertaken part-way through the development process, before the site/application is generally available and while there is still significant time left in the development process to make significant changes (e.g. to choices of UI component sets, templates, etc.).
>
> Report output would likely be more technical in such a circumstance, I think, and detailed lists of bugs with information on how to reproduce them will be of significant importance.

Maybe we are talking about two somewhat different use cases here? I mean an initial evaluation very early on in a (typically redesign) process. There would certainly be a bug list, but the focus is more on educating the readers of the report about what *type* of issues there are rather than listing out the individual bugs. What you are suggesting seems similar to what I describe as "periodic evaluation", but you are redefining that too. More further below...

>> - Periodic Evaluation: Typically carried out periodically to monitor how well conformance to WCAG 2.0 was maintained, or how much progress towards conformance to WCAG 2.0 was achieved during a given period. The main purpose of such evaluations is comparability of the results between iterations. In some cases particular areas of the website may have changed, or the entire website may have been redesigned between one evaluation and the next, and evaluators will need to consider these changes during the sampling and reporting stages of the evaluation.
>
> For me, I see this more as "Regression Evaluation": something undertaken both to monitor how accessibility is improving (or regressing), and to measure the results of an improvement program.
>
> Report output may be more in summary form, giving a broad measure of the level of improvement/regression, and perhaps discussing that by area or type (e.g. "image tagging has broadly improved, with only ~5% of images missing ALT text vs. ~20% 6 months ago, within our tested sample of pages").
>
> This might also be used by a development organization (e.g. when a product goes through various development stages: alpha, beta, etc.), though I would expect in those cases they might simply run another "Development Evaluation", since they will still be focused - at least from the reporting point of view - on the detailed issues found. Middle/senior management may prefer a summary.

OK, I think this is a new use case that I have not directly considered. It is like the "periodic evaluation" with more summaries. I wonder how this impacts the evaluation process as a whole versus the reporting?
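As a rough illustration of the kind of comparable summary figure Peter's example mentions (the share of images missing ALT text within a tested sample), a small script along the following lines could produce such a number between evaluation rounds. This is only an illustrative sketch, not anything proposed in this thread: the sample URLs are placeholders, and checking for a missing alt attribute covers just one narrow aspect of SC 1.1.1.

    # Illustrative sketch only: estimate the share of <img> elements without
    # an alt attribute across a sample of pages, as one way to produce a
    # figure that is comparable between evaluation rounds.
    # The sample URLs below are placeholders, not real test pages.
    from html.parser import HTMLParser
    from urllib.request import urlopen

    class ImgAltCounter(HTMLParser):
        def __init__(self):
            super().__init__()
            self.total = 0    # all <img> elements seen
            self.missing = 0  # <img> elements with no alt attribute at all

        def handle_starttag(self, tag, attrs):
            if tag == "img":
                self.total += 1
                if "alt" not in dict(attrs):
                    self.missing += 1

    sample = ["https://example.org/", "https://example.org/contact"]  # placeholder sample
    total = missing = 0
    for url in sample:
        parser = ImgAltCounter()
        parser.feed(urlopen(url).read().decode("utf-8", errors="replace"))
        total += parser.total
        missing += parser.missing

    if total:
        print(f"{missing / total:.0%} of images in the sample lack an alt attribute")

A figure like this only supports the summary style of reporting; it says nothing about whether the remaining alt texts are actually appropriate, which still needs human judgement.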
>> - Confirmation Evaluation: Typically carried out to confirm a claim made by the vendor or supplier, where a website is assumed to meet particular conformance targets in relation to WCAG 2.0. The main purpose of such an evaluation is to validate a conformance claim with reasonable confidence, or to identify potential mismatches between the conformance claim and the website. Such evaluations are often re-run while the vendor or supplier addresses confirmed issues. The intervals are typically shorter than for Periodic Evaluations and are also more focused on the issues previously identified.
>
> The title "Confirmation Evaluation" suggests this evaluation is NOT made by the owner of the site/application, which I think is a mistake. I would hope the steps an owner might take to evaluate the accessibility of their site/application are the same as what a customer/user might do (or a consumer organization). Some may use it to confirm a vendor's claim, but others may use it to assure themselves that their development organization did what was expected, or a gov't agency may seek this from a contractor who did work for them (and then do their own mini spot-check).

OK, we can discuss the title. But I think the website owner may also want to confirm a claim made by a supplier/vendor.

> Also, I am REALLY UNCOMFORTABLE with the word "conformance claim" in your characterization, Shadi. Unless every page of the entire site (and every possible UI permutation in a web app) has been thoroughly examined, I don't see how an entity can properly make a "conformance claim" for an entire site/web app. I think instead we need a new word/phrase here, and should be talking about confidence levels around the extent to which all WCAG 2.0 SCs (at A/AA/AAA) have been met.

Can you be more specific about which parts make you uncomfortable? I specifically tried to clarify the scope in the very first sentence: [[ a claim made by the vendor or supplier, where a website is assumed to meet particular conformance targets in relation to WCAG 2.0 ]]

>> I think these now reflect both the timing and indicate a little more about the typical "depth" of an evaluation. We'll probably also need to explain that there are many variants of these typical cases depending on the website, context, etc. It is a spectrum, really.
>
> Fully agree with this!

OK good.

Thanks,
  Shadi

> Peter
>
>> Comments and feedback welcome.
>>
>> Best,
>>   Shadi
>>
>> On 6.6.2013 16:10, Detlev Fischer wrote:
>>> Hi,
>>>
>>> just some quick input in case you do cover my proposal to modify "Goal of the Evaluation" today.
>>> I get that the #3 In-Depth Analysis Report is close to what I would call a "design support test" (or "development support test"), since you usually conduct it when you *know* that the site will not conform - the aim is to identify all the issues that need to be addressed before a conformance evaluation has a chance to be successful.
>>>
>>> Since it usually comes first, I find it odd that it is mentioned last, and that no hint is given that this is usually an evaluation where the aim is *not* a conformance evaluation (because you already know that there will be a number of issues that fail SCs).
>>>
>>> The one thing lacking in goal #3 is the requirement to cover all SCs across the sample of pages (with or without detail) and, by doing so, to provide a benchmark for the degree of conformance already reached - even if it is necessarily a crude one.
>>>
>>> So there are 2 things that are missing in the three types of goals we have now:
>>>
>>> (1) a clear indication (in the name of the report type) that there is one evaluation that does *not* aim at measuring conformance but happens in preparation of a final test, with the aim to unearth problems;
>>> (2) the ability in this type of test to provide a metric of success across all SCs for the pages in the sample that can be compared to a later conformance evaluation of the same site.
>>>
>>> Sorry, I would have loved to participate today but my voice isn't up to it...
>>>
>>> Best,
>>> Detlev
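To make Detlev's point (2) above a little more concrete: a crude metric of success across all SCs for the sampled pages could be built by simply aggregating per-page, per-SC pass/fail findings. The sketch below is purely illustrative - the findings data is invented, and neither this thread nor WCAG-EM prescribes any particular calculation.

    # Illustrative sketch only: aggregate per-page, per-SC findings into a
    # crude pass rate per success criterion, so that a later evaluation of
    # the same site can be compared against it. All data here is invented.
    from collections import defaultdict

    # results[page][success_criterion] = True (passes) / False (fails)
    results = {
        "/home":    {"1.1.1": False, "1.4.3": True,  "2.4.4": True},
        "/contact": {"1.1.1": True,  "1.4.3": False, "2.4.4": True},
    }

    per_sc = defaultdict(lambda: [0, 0])  # SC -> [pages passing, pages checked]
    for page, findings in results.items():
        for sc, passed in findings.items():
            per_sc[sc][1] += 1
            if passed:
                per_sc[sc][0] += 1

    for sc, (passing, checked) in sorted(per_sc.items()):
        print(f"SC {sc}: {passing}/{checked} sampled pages pass ({passing / checked:.0%})")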
>>> On 5 Jun 2013, at 16:34, Velleman, Eric wrote:
>>>
>>>> Hi Detlev,
>>>>
>>>> I tend to look at the more detailed explanation of the three types of Reports in Step 5.a [1]:
>>>>
>>>> 1. Basic Report
>>>> 2. Detailed Report
>>>> 3. In-Depth Analysis Report
>>>>
>>>> For me the difference between #2 and #3 is in the level of detail that is required in the Report. #2 is more on the page level, and #3 is more on the website level:
>>>>
>>>> #3 is a way of reporting that does not require you to name every failure on every page. The evaluator is asked to give a certain number of examples of the occurrence of the failures on the website (not every page like in the detailed report). This makes #2 better for statistics and research.
>>>>
>>>> Does this make sense?
>>>>
>>>> Eric
>>>>
>>>> [1] http://www.w3.org/TR/WCAG-EM/#step5
>>>> ________________________________________
>>>> From: Detlev Fischer [detlev.fischer@testkreis.de]
>>>> Sent: Thursday, 30 May 2013 17:15
>>>> To: public-wai-evaltf@w3c.org
>>>> Subject: Step 1.b Goal of the Evaluation - Design support evaluation vs. conformance evaluation?
>>>>
>>>> Hi everyone,
>>>> as promised in the telco, here is a thought on the current section "Goal of the Evaluation".
>>>>
>>>> Currently we have:
>>>> 1. Basic Report
>>>> 2. Detailed Report
>>>> 3. In-Depth Analysis Report
>>>>
>>>> For me, 2 and 3 have always looked a bit similar, as there is no clear line between specifying issues on pages and giving advice on improvements (often, you cannot easily specify remedies in detail because as testers we are often not familiar with the details of the development environment).
>>>>
>>>> In the discussion it struck me that we seemed to have a (largely?) shared notion that our evaluation work usually falls into one of 2 categories:
>>>>
>>>> 1. Design support evaluation: Take an (often unfinished) new design and find as many issues as you can to help designers address & correct them (often in preparation for a future conformance evaluation / conformance claim).
>>>> 2. Conformance evaluation: Check the finished site to see if it actually meets the success criteria (this may take the form of laying the grounds for a conformance claim, or challenging a conformance claim if a site is evaluated independently, say, by some organisation wanting to put an offender on the spot).
>>>>
>>>> Most of our work falls into one of these two categories, and you won't be surprised that we sell design support tests (one tester) as preparation for final tests (in our case, two independent testers). (And I should mention that our testing scheme currently does not follow the 100% pass-or-fail conformance approach.)
>>>>
>>>> There is actually a third use case, which is checking old sites known to have issues *before* an organisation starts with a re-design - so they see the scope of problems the re-design will need to address (and also become aware that there may be areas which they *cannot* easily address, and determine how to deal with those areas).
>>>>
>>>> Sorry again to raise this point somewhat belatedly. Hope this will trigger a useful discussion.
>>>>
>>>> Best,
>>>> Detlev
>>>>
>>>> --
>>>> Detlev Fischer
>>>> testkreis c/o feld.wald.wiese
>>>> Thedestr. 2, 22767 Hamburg
>>>>
>>>> Mobil +49 (0)1577 170 73 84
>>>> Tel +49 (0)40 439 10 68-3
>>>> Fax +49 (0)40 439 10 68-5
>>>>
>>>> http://www.testkreis.de
>>>> Consulting, testing and training for accessible websites
>
> --
> Oracle <http://www.oracle.com>
> Peter Korn | Accessibility Principal
> Phone: +1 650 5069522
> 500 Oracle Parkway | Redwood City, CA 94064
> Green Oracle <http://www.oracle.com/commitment> Oracle is committed to developing practices and products that help protect the environment

--
Shadi Abou-Zahra - http://www.w3.org/People/shadi/
Activity Lead, W3C/WAI International Program Office
Evaluation and Repair Tools Working Group (ERT WG)
Research and Development Working Group (RDWG)
Received on Wednesday, 19 June 2013 15:41:02 UTC