
Re: Evaluation process for test samples: first proposal

From: Chris Ridpath <chris.ridpath@utoronto.ca>
Date: Tue, 21 Nov 2006 08:58:51 -0500
Message-ID: <013b01c70d75$2cae65b0$e29a968e@WILDDOG>
To: "cstrobbe" <Christophe.Strobbe@esat.kuleuven.be>, <public-wai-ert-tsdtf@w3.org>

The process you've created looks good to me. Thanks for creating it.

In step 4:
"The task force reviews the test sample and the evaluation and decides 
whether to accept or reject the test case."
How will the decision be made?

One option is to create a straw poll web page where people can accept/reject 
the test sample. They can also attach comments to the decision. This process 
is used by the WCAG working group.

Another option is to make the decisions on the weekly call and just record 
the decisions in the minutes.

I favour the straw poll web page because it gives people a chance to vote 
even if they can't attend the calls. It's also a quick way of 
accepting/rejecting many samples, and we will likely have a large number to 
go through.

Cheers,
Chris


----- Original Message ----- 
From: "cstrobbe" <Christophe.Strobbe@esat.kuleuven.be>
To: <public-wai-ert-tsdtf@w3.org>
Sent: Monday, November 20, 2006 11:05 AM
Subject: Evaluation process for test samples: first proposal


>
> Hi,
>
> I had an action item to propose an evaluation process
> for test cases. Below is a proposal based on input from Shadi.
>
> * 1: A test sample is uploaded to the CVS repository.
>     Status is set to "unconfirmed" (if we use the terminology from
>     the Conformance Test Process For WCAG 2.0 [1]) or something
>     similar (e.g. "unreviewed").
>     Test samples with this status are queued for review by
>     the task force.
>
> * 2: A task force member pre-reviews the test sample.
>     This review includes:
>     - confirming that all the necessary files are available;
>     - confirming that all the necessary files are valid [2];
>     - proofreading the title, description and other text in
>       the metadata;
>     - making sure the links and the date are correct;
>     - making sure that the location pointers are consistent
>       with each other [3];
>     - checking that file names and the ID in the metadata follow
>       our naming convention;
>     - checking that the 'rule' ID actually exists in rulesets.xml;
>     - checking that the referenced technique or failure is really
>       a technique or failure for the referenced 'rule' ID;
>     - anything else I missed?
>     If the test sample passes this "administrative" check,
>     its status is set to "new" (as in [1]) or "in review"
>     (if we choose other terms)
>     and queued for the next step in the process.
>     If the test sample does not pass this check, its status is set
>     to "pending bugfix" (or something similar) until it passes
>     all the above checks. To fix these bugs, it can either
>     be sent back to the submitter or, if the fix is obvious,
>     it can be fixed by a task force member.
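
The administrative checks in step 2 lend themselves to partial automation. A minimal sketch (the metadata field names "files", "title" and "rule_id" are hypothetical, not the task force's actual schema):

```python
def pre_review(sample, ruleset_rule_ids):
    """Run a subset of the step-2 'administrative' checks.

    Returns (status, problems): status is "new" when every check
    passes, "pending bugfix" otherwise.  The field names used here
    are illustrative only.
    """
    problems = []
    # Confirm that all the necessary files are available.
    for f in sample.get("files", []):
        if not f.get("present"):
            problems.append("missing file: %s" % f["name"])
    # Proofread-style check: the metadata must at least have a title.
    if not sample.get("title", "").strip():
        problems.append("empty title in metadata")
    # Check that the 'rule' ID actually exists in rulesets.xml.
    if sample.get("rule_id") not in ruleset_rule_ids:
        problems.append("rule ID %r not in rulesets.xml" % sample.get("rule_id"))
    status = "new" if not problems else "pending bugfix"
    return status, problems
```

Validity checking, link checking and the location-pointer consistency check would need real parsers, so they are left out of the sketch.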
>
> * 3: The test sample goes to a second review, possibly
>     (preferably?) by the same person who did the pre-review.
>     This review is a content review where the reviewer
>     evaluates how well the test sample addresses the technique.
>     During this review, the test procedure in the referenced
>     technique is also reviewed "to ensure that [it is]
>     unambiguous, easy to read by humans and easy to implement
>     in software" [4].
>     If the reviewer finds no issues with the test procedure,
>     he/she proposes to accept or reject the test sample.
>     If the reviewer finds an issue with the test procedure,
>     he/she proposes an alternative procedure and proposes
>     to accept or reject the test sample based on
>     this new procedure.
>     These comments and evaluations are recorded somewhere public.
>     For the status, we could use a value such as
>     "accepted pending TF decision".
>
> * 4: The task force reviews the test sample and the evaluation
>     and decides whether to accept or reject the test case.
>     If the test sample is accepted, the status becomes
>     "accepted by task force" or "pending WCAG WG decision" (or ...).
>     This means that the test sample is ready from the perspective
>     of the task force but needs review by the WCAG WG for a
>     final decision.
>     If the test sample is rejected, the status changes to
>     "pending bugfix" (or "unconfirmed"?). The reviewer must then
>     contact the submitter and provide a rationale for the
>     rejection. The submitter can refine and resubmit the
>     test sample; it then goes through the same process
>     again, starting at step 2.
>
> * 5: The WCAG WG reviews the test sample and accepts it or
>     sends it back to the task force, possibly with comments.
>     If the test sample is accepted, the status is changed to
>     "accepted" and it does not need to be reviewed again until
>     the WCAG WG publishes a new draft.
>     If the test sample is rejected, it is sent back to the task
>     force and the status changes to "pending bugfix"
>     (or "unconfirmed"?); it then goes through the same
>     process again, starting at step 2.
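
Steps 1-5 amount to a small state machine over the proposed status values. A hedged sketch (the status names come from the proposal; the event names and the dict encoding are my own, and "resubmit" simplifies the "starts again at step 2" loop):

```python
# Status lifecycle from the proposed evaluation process, steps 1-5.
# Event names are invented for illustration; statuses follow the proposal.
TRANSITIONS = {
    "unconfirmed": {
        "pre_review_pass": "new",              # step 2 passes
        "pre_review_fail": "pending bugfix",   # step 2 fails
    },
    "pending bugfix": {
        "resubmit": "unconfirmed",             # fixed sample re-enters at step 2
    },
    "new": {
        "content_review": "accepted pending TF decision",  # step 3
    },
    "accepted pending TF decision": {
        "tf_accept": "pending WCAG WG decision",  # step 4, accepted
        "tf_reject": "pending bugfix",            # step 4, rejected
    },
    "pending WCAG WG decision": {
        "wg_accept": "accepted",                  # step 5, final
        "wg_reject": "pending bugfix",            # step 5, sent back
    },
}

def advance(status, event):
    """Return the next status, or raise ValueError for a disallowed event."""
    try:
        return TRANSITIONS[status][event]
    except KeyError:
        raise ValueError("no transition %r from status %r" % (event, status))
```

Encoding the workflow this way would also make it easy to check that every status recorded in the repository is one the process actually allows.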
>
>
> The above description focuses on the entry and exit conditions
> in each step in the process, so I have left out a few details,
> for example, that we review test samples in batches and that
> the task force decides on acceptance during a teleconference.
> I have also left out how we may send our work to the WCAG WG,
> for example through the mailing list or a questionnaire.
> (Questionnaires can have time outs, which may be handy.)
>
>
> [1] http://www.w3.org/WAI/GL/WCAG20/tests/ctprocess.html
> [2] Valid in the context of the test sample, so for example
> the technique may require an invalid HTML document but the
> metadata etc must still be complete and valid.
> [3] All pointers within the same 'location' element point
> to the same location in the test sample.
> [4] WCAG 2.0 Test Samples Development Task Force (TSD TF)
> Work Statement: http://www.w3.org/WAI/ER/2006/tests/tests-tf
>
> Best regards,
>
> Christophe
>
> -- 
> Christophe Strobbe
> K.U.Leuven - Department of Electrical Engineering - Research Group on
> Document Architectures
> Kasteelpark Arenberg 10 - 3001 Leuven-Heverlee - BELGIUM
> tel: +32 16 32 85 51
> http://www.docarch.be/
>
> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
>
> 
Received on Tuesday, 21 November 2006 13:59:42 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 8 January 2008 14:18:34 GMT