Re: acceptance criteria for new success criteria

Hi all, 

When writing more atomic tests for WCAG 2.0 over the years, I have found the following ISO documents to be very useful, as they've provided me with some steer on creating unambiguous and verifiable tests:

- ISO/IEC Guide 7:1994: Guidelines for drafting of standards suitable for use for conformity assessment; and
- ISO/IEC Directives Part 2: Principles to structure and draft documents intended to become International Standards, Technical Specifications or Publicly Available Specifications.

With regard to “high inter-rater reliability”, I also like the approach taken in ISO with regard to consensus, e.g. “General agreement, characterized by the absence of sustained opposition to substantial issues”.

Instead of saying 8/10 experts should agree (or something similar), I'm wondering whether another approach might be to introduce a mechanism for determining where there are uncertainties in evaluation findings, and then to ask why these uncertainties exist and what (if anything) can be done to reduce them.
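
To make that a little more concrete, here is a rough sketch of what such a mechanism might look like. Everything in it is an assumption on my part for illustration only: the function names, the test ids, the sample ratings and the 0.8 cut-off (borrowed loosely from the "8 out of 10" figure) are hypothetical and not anything the group has agreed. It simply measures, per test, what share of evaluators gave the most common outcome, and lists the tests that fall below the cut-off so that we can ask why.

```python
from collections import Counter


def agreement_report(ratings, threshold=0.8):
    """Summarise evaluator agreement per test.

    ratings: dict mapping a test id to the list of outcomes recorded by
    each evaluator (e.g. "pass" / "fail"). Tests with no outcomes are skipped.
    threshold: hypothetical cut-off below which a test is flagged as uncertain.
    """
    report = {}
    for test_id, outcomes in ratings.items():
        if not outcomes:
            continue
        # Most common outcome and how many evaluators chose it.
        majority_outcome, majority_count = Counter(outcomes).most_common(1)[0]
        share = majority_count / len(outcomes)
        report[test_id] = {
            "majority": majority_outcome,
            "agreement": share,
            "uncertain": share < threshold,
        }
    return report


def uncertain_tests(ratings, threshold=0.8):
    """Return the ids of tests whose agreement falls below the threshold."""
    report = agreement_report(ratings, threshold)
    return sorted(t for t, row in report.items() if row["uncertain"])


# Made-up ratings from ten evaluators for two hypothetical tests.
sample = {
    "img-alt": ["pass"] * 9 + ["fail"],
    "link-purpose": ["pass"] * 6 + ["fail"] * 4,
}
print(uncertain_tests(sample))  # ['link-purpose']
```

The flagged list is only a starting point: the useful part is the follow-up discussion on why evaluators disagreed on those tests, and whether the test wording can be changed to reduce the disagreement.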

Hope this helps.

Very best regards

Alistair 

Alistair Garrison

On 1 Jun 2016, at 10:36, josh@interaccess.ie wrote:

> Hi all,
>  
> Rather than start a new thread, I will add to this. On yesterday's call, we discussed what the requirements for creating new SCs should be.
>  
> The current list (with many thanks to Lisa/David Mc D for their input) is:
>  
> - Ensure that requirements may be applied across technologies such as HTML, CSS, SVG etc.
> - Each success criterion will be mapped to the function requirements it aims to meet.
> - Ensure that the conformance requirements are clear and testable (where possible) [Note: we will need language around areas that are not explicitly testable - or where subjective expert evaluation is required]
> - Utilize the WCAG 2.0 A/AA/AAA structure.
> - Success criteria need to be as broad as possible without becoming a 'catch-all' for any given requirement.
> - Candidate success criteria will be peer reviewed and, if too great in scope, will be broken into more granular requirements.
> - They must be testable.
> - They must be applicable to specific technologies.
> - They must always be true.
> - They are statements of 'what is' - when the statement is true, then you have met the SC.
>  
> Please feel free to add/comment etc. We are also interested in the 'institutional memory' available for those who are long-timers (lifers?) in the group. So we shall send out a survey soon to gather input.
>  
> Thanks
>  
> Josh
>  
> ------ Original Message ------
> From: "Katie Haritos-Shea" <ryladog@gmail.com>
> To: "White, Jason J" <jjwhite@ets.org>
> Cc: "Patrick Lauke" <redux@splintered.co.uk>; "WCAG" <w3c-wai-gl@w3.org>
> Sent: 01/06/2016 10:25:34
> Subject: RE: acceptance criteria for new success criteria
>  
>> Andrew said...
>> "The term used during the development of WCAG 2.0 was “high inter-rater reliability”. I don’t recall our discussion of exactly what the requirements were, but my general recollection is that it entailed likely agreement by most reasonably informed evaluators (not the same as agreement by most “experts”, which, to my mind, is a lower standard that is easier to meet)."
>> 
>> I recall the “high inter-rater reliability” metric to be something like: if after testing content for a SC that 8 out of 10 experienced evaluators agreed....
>> 
>> 
>> Katie Haritos-Shea
>> 703-371-5545
>> 
>> On May 31, 2016 1:56 PM, "White, Jason J" <jjwhite@ets.org> wrote:
>> 
>> 
>> > -----Original Message-----
>> > From: Patrick H. Lauke [mailto:redux@splintered.co.uk]
>> > On 31/05/2016 16:36, Detlev Fischer wrote:
>> > [...]
>> > > It is nice to believe that consensus is obtainable but even among
>> > > testers working to the same set of checkpoints with detailed rating
>> > > instructions, we frequently experience disagreement - mostly because
>> > > the issue context or the mapping of issues to SCs makes it hard to
>> > > agree on a fair rating.
>> >
>> > I'd like to wholeheartedly +1 Detlev's comment here.
>> 
>> The main strategy with which WCAG 2.0 seeks to address this problem is to encourage interpreters to consider the purpose of each requirement, not just its text, and to evaluate with the purpose in mind wherever there is ambiguity.
>> 
>> In technical standards, test suites and interoperability testing are a common technique for solving the problem, but they're of limited applicability here by reason of the many requirements that cannot be automatically evaluated.
>> 
>> A common legal approach to the problem of interpretation is to develop a case-law, i.e., normative decisions on specific interpretive questions that arise from concrete situations. This would work here (given a suitable group of experts and a good decision-making process), but the W3C isn't set up for it - the most that can be done is to provide non-normative guidance and to release a revised specification. To some extent, the techniques constitute interpretive material, though they are not normative and to that extent not authoritative.
>> 
>> Another possibility for a future version would be to have normative techniques as well as the general principles, guidelines and success criteria. Where the techniques apply, they are normative; where they don't apply, the more general standard has to be applied directly. This solution also has precedent in the way in which some legislative schemes are set up. Under this approach, the technology-specific guidance would be normative wherever it is applicable, but the general guidelines are also available to accommodate situations not encompassed by the specific requirements.
>> 
>> The main problem is that the techniques evolve too quickly for regulators, organizational policy setters and other parties who need to use the Guidelines; and there is a risk of losing the general principles amid all of the details, as well as the risk of encouraging more "legalistic" methods of interpretation that don't look to the broader purpose of making the Web more accessible but instead focus on the minutiae of conformance and the language of normative statements. Also, if techniques were normative, the scrutiny that they would receive and the amount of work involved in bringing them to publication would likely increase substantially.
>> 
>> Before deciding among these and other solutions, we need to know how large the problem of interpretive disagreement is in WCAG 2 as it currently stands, then move the discussion into the planning process for preparing the next major version (not version 2.1, but the next significant revision).
>> 
>> 

Received on Wednesday, 1 June 2016 11:23:37 UTC