Re: Untestable success criteria

We've been working on a system to make accessibility guidelines testable,
and it may help in your work.

The current draft description of the process is at:
http://aprompt.snow.utoronto.ca/oac/

Our design starts with a list of all the accessibility checks that can
possibly be performed. The current list contains 130 checks, but that
number will grow.

All accessibility guidelines can then be described as a subset of these
checks.
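
To make the idea concrete, here is a minimal sketch in Python (the check
IDs, check texts, and the guideline mapping are illustrative placeholders,
not our actual list):

    # Master list of every accessibility check we know how to perform.
    MASTER_CHECKS = {
        1: "img elements have alternate text",
        2: "frame elements have title text",
        # ... currently 130 checks, and growing
    }

    # A guideline is just a named subset of check IDs from the master list.
    GUIDELINES = {
        "Example guideline": {1, 2},  # illustrative mapping only
    }

    def checks_for(guideline):
        # Resolve a guideline to the text of each of its member checks.
        return [MASTER_CHECKS[i] for i in GUIDELINES[guideline]]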

This is still early in the development cycle and I would appreciate any
feedback you may have.

Our list of accessibility checks is currently outside of WAI but could
possibly be made a WAI effort.

Wendy - Is there anything I can do to help in getting the ER group going
again?

Regards,
Chris


----- Original Message ----- 
From: "Andi Snow-Weaver" <andisnow@us.ibm.com>
To: <w3c-wai-gl@w3.org>
Sent: Wednesday, November 05, 2003 8:12 AM
Subject: Untestable success criteria


> Submitted by Cynthia Shelley, Wendy Chisholm, Kerstin Goldsmith, and Andi
> Snow-Weaver.
>
> The following summarizes our analysis of the October 27th working draft to
> identify success criteria that are not testable as currently written.
>
> Untestable Success Criteria:
>
>    Principle 1:
>       Guideline 1.5 - Level 2 - Could probably be rewritten to be
>       testable. Body text is a structural element, and may be too
>       technology specific.
>       Not sure what this means for a form.
>       Guideline 1.5 - Level 3
>          #1. Not testable unless we define the major visual display types.
>          But if we do that, then it will be incomplete as soon as a new
>          display type is invented.
>       Guideline 1.6 - "easily differentiable" is not testable. "legible"
>       might be better.
>       Guideline 1.6 - Level 2 - If we don't come up with a testing
>       algorithm, this is not testable.
>       Guideline 1.6 - Level 3
>          #1. "easily readable" is not testable. Propose "readable by
>          someone with 20/xx vision"
>       Guideline 1.7 - Level 2 - determining that background audio content
>       is more than 20 dB lower than foreground audio content is not
>       testable with a single audio stream unless one can separate it into
>       separate tracks for measuring. Could change to something like "for
>       audio content that has multiple tracks, the background sounds are at
>       least 20 dB lower than the foreground audio content". Explore
>       references at http://www.decibel-meter.com/ and
>       http://www.lhh.org/noise/decibel.htm
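
A rough sketch of how the two-track comparison above could be automated
(Python with numpy; the RMS measure, the sample-array inputs, and the
helper names are my assumptions, not something the draft specifies):

    import numpy as np

    def rms_db(samples):
        # Root-mean-square level of a mono track, in relative decibels.
        rms = np.sqrt(np.mean(np.square(samples.astype(np.float64))))
        return 20.0 * np.log10(rms + 1e-12)

    def background_ok(foreground, background, margin_db=20.0):
        # True when the background track is at least margin_db below
        # the foreground track.
        return rms_db(foreground) - rms_db(background) >= margin_db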
>
>    Principle 2:
>       Guideline 2.1 - Level 2 - Need to define "abstract". Should really
>       move to a technique because it is so technology dependent.
>       Guideline 2.2 - Level 3 - Need to define "competitive activity" and
>       "essential"
>       Guideline 2.3 - Level 1
>          #1a - only testable by the designer of the site
>       Guideline 2.3 - Level 2
>          #1. "other content" is too broad. "visibly" is not testable.
>          Assumes tool can be developed that can determine if flicker falls
>          within the 3 to 49 Hz range.
>       Guideline 2.4 - Level 2
>          #1. What's a perceived page? What if it's a voice XML
>          application? How does it apply to web applications?
>          #1a. Does this apply to a page or the site as a whole?
>          #1c. What is an alternate display order of a 50-page site?
>          #2. Contains ambiguous words: skip, large, repetitive
>       Guideline 2.4 - Level 3
>          #1. "has been reviewed" is is only testable by someone internal
to
>          the development organization.
>          #2. "logical, linear, reading order" is not testable.
>          #4. "logical tab order" is not testable.
>       Guideline 2.5 - Level 3
>          #1. "where possible" is not testable. This is not applicable in
>          all cases. It should be covered in some non-normative document
but
>          not here.
>          #2. "where possible" is not testable.
>          #4. "significant" and "important" are not testable.
>
>    Principle 3:
>       Guideline 3.2 - Level 2 - "first in standard unabridged dictionary"
>       is not testable. A good example is "ATM" - automatic teller machine,
>       asynchronous transfer mode, etc. Does "first time" mean first time
>       on the page, on a site, in the content management system, etc.?
>       Guideline 3.2 - Level 3
>          #1. "can or should be used" is not testable. Define "cascading
>          dictionaries". Can test if there is a list of links.
>          #2. "has been reviewed" is only testable by someone internal to
>          the development organization. "as appropriate" is not testable.
>          #2b. "obvious" and "apparent" are not testable.
>          #2c. "ambiguous" and "interpretable" are not testable. "unique"
>          has to have scope to be testable.
>       Guideline 3.3 - Level 2 - "has been reviewed" is only testable by
>       someone internal to the development organization.
>       Guideline 3.3 - Level 3 - "has been reviewed" is only testable by
>       someone internal to the development organization.
>       Guideline 3.4 - Level 2
>          #1. "key", "generally", and "predictable" are not testable.
>          #2. "inconsistent" and "unpredictable" are not testable.
>          #3. Might be able to make this testable if we come up with a
>          better definition of "extreme changes in context"
>       Guideline 3.4 - Level 3
>          #2. "has been reviewed" is only testable by someone internal to
>          the development organization.
>
> End of untestable success criteria.
>
> During our analysis we determined that we need to establish some
> assumptions about the human evaluators. For example: do they have to
> understand JavaScript? Are they internal to the development organization,
> with access to all code including server code? Are there other assumptions?
>
> In addition to identifying the untestable criteria, we came up with the
> following issues:
>
>    Guideline 1.1 - Level 2
>       #1. Is this really necessary? Do we have an example for this?
>       #2. This is very difficult to comply with. We shouldn't require
>       something that the video industry doesn't require.
>    Guideline 1.2 - Level 2 - is testable but is a pretty hard thing to
>    comply with.
>    Guideline 1.2 - Level 3 - is testable but shouldn't be a compliance
>    criterion.
>    Guideline 1.4 - Level 2 success criteria have nothing to do with the
>    guideline as worded.
>    Guideline 1.5 - Level 3
>       #2. Should remove. This is redundant with "separate structure and
>       presentation". It's too technology dependent. Assumes the user has a
>       browser that supports this function.
>    Guideline 1.7 - Level 2 - What about audio tracks over visual
>    presentations that might have a lot of background noise because they
>    were not recorded in a sound studio? Should we rewrite to make it
>    clear this is meant to apply only to professionally produced audio?
>    Guideline 2.2 - Level 2 - Is it really "moving content"? What about
>    speech on a voice XML site?
>       #1. Should be removed. It is redundant with #2.
>    Guideline 2.3 - Level 2
>       #2. Conflicts with Level 2 #1 and is redundant with Level 1 #1b.
>    Guideline 2.4 - Level 3
>       #3. This is an SVG technique. It should not be a success criterion
>       for a guideline.
>    Guideline 2.5 - Level 2 - This is testable but is missing some
>    information. We need to be clear that this is a user error, not a
>    system error, and we need to more clearly articulate what type of
>    errors this is referring to.
>    Guideline 2.5 - Level 3
>       #3. This is testable but difficult to do.
>
>

Received on Wednesday, 19 November 2003 16:43:06 UTC