Untestable success criteria

Submitted by Cynthia Shelley, Wendy Chisholm, Kerstin Goldsmith, and Andi
Snow-Weaver.

The following summarizes our analysis of the October 27th working draft to
identify success criteria that are not testable as currently written.

Untestable Success Criteria:

   Principle 1:
      Guideline 1.5 - Level 2 - Could probably be rewritten to be testable.
      Body text is a structural element, may be too technology specific.
      Not sure what this means for a form.
      Guideline 1.5 - Level 3
         #1. Not testable unless we define the major visual display types.
         But if we do that, then it will be incomplete as soon as a new
         display type is invented.
      Guideline 1.6 - "easily differentiable" is not testable. "legible"
      might be better.
      Guideline 1.6 - Level 2 - If we don't come up with a testing
      algorithm, this is not testable.
      Guideline 1.6 - Level 3
         #1. "easily readable" is not testable. Propose "readable by
         someone with 20/xx vision"
      Guideline 1.7 - Level 2 - determining that background audio content
      is more than 20 db lower than foreground audio content is not
      testable with a single audio stream unless can separate them into
      separate tracks for measuring. Could change to something like "for
      audio content that has multiple tracks, the background sounds are at
      least 20 db lower than the foreground audio content". Explore
      references at http://www.decibel-meter.com/ and
      http://www.lhh.org/noise/decibel.htm

   Principle 2:
      Guideline 2.1 - Level 2 - Need to define "abstract". Should really
      move to a technique because it is so technology dependent.
      Guideline 2.2 - Level 3 - Need to define "competitive activity" and
      "essential"
      Guideline 2.3 - Level 1
         #1a - only testable by designer of site
      Guideline 2.3 - Level 2
         #1. "other content" is too broad. "visibly" is not testable.
         Assumes tool can be developed that can determine if flicker falls
         within the 3 to 49 Hz range.
      Guideline 2.4 - Level 2
         #1. What's a perceived page? What if it's a voice XML application.
         How does it apply to web applications?
         #1a. Does this apply to a page or the site as a whole?
         #1c. What is an alternate display order of a 50-page site?
         #2. Contains ambiguous words: skip, large, repetitive
      Guideline 2.4 - Level 3
         #1. "has been reviewed" is is only testable by someone internal to
         the development organization.
         #2. "logical, linear, reading order" is not testable.
         #4. "logical tab order" is not testable.
      Guideline 2.5 - Level 3
         #1. "where possible" is not testable. This is not applicable in
         all cases. It should be covered in some non-normative document but
         not here.
         #2. "where possible" is not testable.
         #4. "significant" and "important" are not testable.

   Principle 3:
      Guideline 3.2 - Level 2 - "first in standard unabridged dictionary"
      is not testable. Good example is "ATM" - automatic teller machine,
      asynchronous transfer mode, etc. Does "first time" mean first time on
      the page, on a site, in the content management system, etc. ?
      Guideline 3.2 - Level 3
         #1. "can or should be used" is not testable. Define "cascading
         dictionaries". Can test if there is a list of links.
         #2. "has been reviewed" is only testable by someone internal to
         the development organization. "as appropriate" is not testable.
         #2b. "obvious" and "apparent" are not testable.
         #2c. "ambiguous" and "interpretable" are not testable. "unique"
         has to have scope to be testable.
      Guideline 3.3 - Level 2 - "has been reviewed" is only testable by
      someone internal to the development organization.
      Guideline 3.3 - Level 3 - "has been reviewed" is only testable by
      someone internal to the development organization.
      Guideline 3.4 - Level 2
         #1. "key", "generally", and "predictable" are not testable.
         #2. "inconsistent" and "unpredictable" are not testable.
         #3. Might be able to make this testable if we come up with a
         better definition of "extreme changes in context"
      Guideline 3.4 - Level 3
         #2. "has been reviewed" is only testable by someone internal to
         the development organization.

End of untestable success criteria.

During our analysis we determined that we need to establish some
assumptions about the human evaluators. i.e. Do they have to understand
JavaScript? Are they internal to the development organization and have
access to all code including server code? Are there other assumptions?

In addition to identifying the untestable criteria, we came up with the
following issues:

   Guideline 1.1 - Level 2
      #1. Is this really necessary? Do we have an example for this?
      #2. This is very difficult to comply with. We shouldn't require
      something that the video industry doesn't require.
   Guideline 1.2 - Level 2 - is testable but is a pretty hard thing to
   comply with.
   Guideline 1.2 - Level 3 - is testable but shouldn't be a compliance
   criteria.
   Guideline 1.4 - Level 2 success criteria have nothing to do with the
   guideline as worded.
   Guideline 1.5 - Level 3
      #2. Should remove. This is redundant with "separate structure and
      presentation". It's too technology dependent. Assumes the user has a
      browser that supports this function.
   Guideline 1.7 - Level 2 - What about audio tracks over visual
   presentations that might have a lot of background noise because they
   were not recorded in a sound studio? Should we re-write to make it clear
   this is meant to apply only to professionally produced audio?
   Guideline 2.2 - Level 2 - Is it really "moving content"? What about
   speech on a voice XML site?
      #1. Should be removed. It is redundant with #2.
   Guideline 2.3 - Level 2
      #2. Conflicts with Level 2 #1 and is redundant with Level 1 #1b.
   Guideline 2.4 - Level 3
      #3. This is an SVG technique. It should not be a success criteria for
      a guideline.
   Guideline 2.5 - Level 2 - This is testable but is missing some
   information. We need to be clear that this is a user error, not a system
   error and we need to more clearly articulate what type of errors this is
   referring to.
   Guideline 2.5 - Level 3
      #3. This is testable but difficult to do.

Received on Wednesday, 5 November 2003 08:12:28 UTC