Re: Summary of discussion on TSD TF review process

Hi,

Thanks for the initial summary, Christophe, and for the comprehensive 
comments, Carlos. Please find some more responses from me below:


Carlos Iglesias wrote:
>> * Step 3 says: "a task force participant receives the 
>> assignment to carry out an initial review". Who does the 
>> assignment? I here refer back to an earlier discussion (I 
>> don't have a reference) where we said that people take test 
>> samples in batches of e.g. 5 and review them.
> 
> I think this problem appears earlier, at step 2: "a task 
> force participant receives the assignment to carry out a 
> review of the structure of a test sample." Additionally (if 
> steps 3 and 4 are not merged), should the TF participant of 
> steps 2 and 3 be the same, just to keep things simple?

The first question is how assignments are made. For a previous Task 
Force (the "Evaluation Tools Task Force" [1]), we had a simple list [2]. 
TF participants volunteered to work on specific tools, and their names 
were listed beside these tools. They were responsible for carrying out 
reviews of these tools and submitting their feedback to the group. This 
worked pretty well, and I think we should try the same approach here: a 
simple volunteering system where people keep themselves busy by signing 
up to do reviews.

[1] <http://www.w3.org/WAI/ER/2005/tools-tf>
[2] <http://www.w3.org/WAI/ER/2005/tools>


The second question is whether the same TF participant should do the 
reviews of step 2 and step 3. Ideally yes, but it is not a must. For 
example, we may have a delay in reviewing the *content* of test samples 
and may therefore want to queue up test samples that are structurally 
ready. So, for now, structure review and content review should be 
considered separate processes to keep things simple.


>> * The above response leads to the question of how we record 
>> which test samples are assigned to whom. We could do that by 
>> means of an additional element in TCDL, but that may give a 
>> misleading impression to outsiders (one person in charge of 
>> the whole review process?), even if we make that element 
>> optional, hide it from the web view and remove it at the end 
>> of the process.
> 
> IMO this information shouldn't be part of the Test Samples 
> Metadata, because it's just "administrative information" (it's 
> not in the submitted Test Samples) without much interest once 
> the Test Sample gets its "final" status (although it could 
> be relevant information to record for a proper follow-up of 
> the process).

OK, let's keep a separate list connecting test samples with the TF 
participants responsible for reviewing them. I'll work on that list, 
probably an XML document in the shared CVS space so that we can all 
edit it.
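
For illustration, here is a minimal sketch of what such an assignment 
list could look like; the element and attribute names and the values 
below are hypothetical placeholders, not an agreed format:

    <!-- hypothetical sketch; names and values are placeholders only -->
    <assignments>
      <assignment sample="test-sample-001" reviewer="(TF participant)"
                  review="structure" status="in-progress"/>
      <assignment sample="test-sample-001" reviewer="(TF participant)"
                  review="content" status="pending"/>
    </assignments>

Whatever format we settle on, the point is simply to have one editable 
place that records who signed up to review which test sample.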


>> * In addition to who reviews a test sample in steps 2 and 3, 
>> we need to record other metadata such as review comments, 
>> proposals to accept or reject, and possibly metrics about the 
>> extent to which a test sample meets the criteria in the checklists.
>> We could just send things to the mailing list, but then the 
>> data may become hard to keep track of.
>> We could also use a Wiki (like the WCAG WG), for example with 
>> a table where rows represent test samples and columns 
>> represent TF participants (who's been assigned what), 
>> contributor of the test case, review comments, links to 
>> strawpoll results, etcetera.
>> If metrics are really important, a database seems more useful 
>> (but also less flexible than a Wiki).
> 
> I'm in favor of using something more structured than just the mailing 
> list (Bugzilla, Wiki...)

Do we want to keep a history of the review comments? For example, when 
a test sample is changed or otherwise resubmitted, it is re-reviewed. 
Do we modify the initial comment, or create a new comment and keep the 
old one? What other requirements do we have for the tool(s) we will use?


>> * In step 4 (Online Strawpoll), should "Checklist for 
>> Structure Reviews" be "Checklist for Content Reviews"?
> 
> Also think so.

Yes, this is a typo. Fixing...


>> * If we use WBS forms for strawpolls, is it reasonable to 
>> expect that every task force participant answers the 
>> strawpoll, and do strawpolls have time limits? In WCAG WG, 
>> strawpolls time out a few hours before the teleconference to 
>> give the chairs sufficient time to prepare for the 
>> teleconference. We could use the same approach in the task force. 
>> We could also define a "quorum" for the strawpolls and decide 
>> to reopen a strawpoll if the number of responders is too 
>> low. (This proposal sounded good to people in the teleconference.)
> 
> I'm in favor of this approach, but I think we should keep an eye 
> on the definition of "quorum" if we want to guarantee a good P2P 
> review (i.e. if CTIC has proposed Test Cases, they should be 
> reviewed by a minimum number of TF participants outside of CTIC, 
> and so on).

I strongly agree with a timeout, and agree with keeping an eye on the 
"quorum". Both the number and the affiliation of the responders should 
be considered. I propose this be primarily the responsibility of our 
dear TF facilitators, but we all share the duty of keeping an eye on it.
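
To make the idea more concrete, here is a rough sketch in Python of the 
kind of check I have in mind; the threshold values and field names are 
purely illustrative, not something we have agreed on:

    # Rough sketch only; thresholds and field names are illustrative.
    def quorum_met(responses, submitter_affiliation,
                   min_total=5, min_external=2):
        """Enough responders overall, and enough responders from
        outside the organization that submitted the test sample."""
        total = len(responses)
        external = sum(1 for r in responses
                       if r["affiliation"] != submitter_affiliation)
        return total >= min_total and external >= min_external

If the quorum is not met by the timeout, the facilitators would simply 
reopen the strawpoll, as proposed above.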


>> * Steps 3 and 4 are the same except for who does the review. 
>> Can these steps be merged?
> 
> IMO, if the person who carries out the initial review at step 2 
> can't get the Test Sample back to a previous state when a problem 
> is found (e.g. rejected?), then it makes no sense to have an 
> initial individual review separate from the group review.
> 
> Additionally, what is supposed to happen if we find any 
> problems while checking the content? Shouldn't there be more 
> output options at step 4 than just "pending"? (e.g. rejected again?)

Accepting or rejecting a test sample shouldn't be the call of an 
individual, especially since we are aware that there is a subjective 
factor to the review (see the comment below). This is why acceptance or 
rejection is only an output of step 5, after the whole group has had 
the chance to review the test sample and to participate in the voting.

Step 3 is supposed to be a preparatory step to facilitate the review by 
the group: it is much easier to review a test sample in step 4 when an 
initial review is available. The comments from step 3 can point out 
specific things to look at, for the convenience of the other TF 
participants. Note that step 4 (an online strawpoll with a timeout) can 
only start after step 3 has been completed.

I hope this clarifies why these are separate steps; let me know if 
further clarification is needed.


>> * We should test the review process with a real test sample. 
>> That would help us see, for example, if all the criteria in 
>> the checklists are clear and unambiguous (e.g. "files are 
>> valid in their use" and "no unintentional broken links").

Yes, we should get started and test the process. We already have three 
test samples uploaded by Vangelis and Daniela. We need three volunteers 
to do reviews...


> I find the Structure Checklist pretty clear. In contrast, it 
> may be hard to agree on unambiguous criteria for the 
> Content Checklist (e.g. minimal and complete, unambiguous 
> unit, etc.).

Yes, the content review may have some subjective aspects, or at least 
criteria that are hard to define. This is why we have the collective 
review in step 4, and a collective decision in step 5. We should also 
continue working on refining the checklist as we go along and learn from 
our review activities.


Regards,
   Shadi


-- 
Shadi Abou-Zahra     Web Accessibility Specialist for Europe |
Chair & Staff Contact for the Evaluation and Repair Tools WG |
World Wide Web Consortium (W3C)           http://www.w3.org/ |
Web Accessibility Initiative (WAI),   http://www.w3.org/WAI/ |
WAI-TIES Project,                http://www.w3.org/WAI/TIES/ |
Evaluation and Repair Tools WG,    http://www.w3.org/WAI/ER/ |
2004, Route des Lucioles - 06560,  Sophia-Antipolis - France |
Voice: +33(0)4 92 38 50 64          Fax: +33(0)4 92 38 78 22 |
