- From: Al Gilman <asgilman@iamdigex.net>
- Date: Sun, 09 Dec 2001 22:39:12 -0500
- To: w3c-wai-er-ig@w3.org
At 09:08 PM 2001-12-08, Wendy A Chisholm wrote:
>http://www.w3.org/WAI/ER/2001/12/05-minutes.html
>
>I tried to summarize the issues and resolutions. Please let me know if there are any corrections or additions.
>

Thank you for the work you put into the summary. I find it very helpful.

Highlights:

WCAG: This is an important application scenario, if we can pull it off. The WCAG group would benefit a lot from the right kind of tool support, and it will clarify EARL issues a lot to work in that context. Of course there are more things that would be good to do than we can actually do. There has to be some agreement with the WCAG group on how the process is going to run, to scope out what functions cry out for tool support.

I put my first round of response on this item (why I asked "can we have a session on the WCAG application") in a response to Gregg's question about thoughts on test cases on the WCAG (a.k.a. GL) list:
<http://lists.w3.org/Archives/Public/w3c-wai-gl/2001OctDec/0510.html>

The bottom line here is that the job of the experience-gathering is to explore the frontier between good and bad outcomes, and to fit one or more frontiers, defined in terms of readily evaluatable prognostics, to that outcomes frontier. We have a starting point -- a candidate collection of prognostics. Now we need to move into empirical mode, where we see just how necessary or sufficient [subsets of] these are, and what additional or alternate criteria suggest themselves from the actual pattern of outcomes.

So far in the group process, there has been a lot of concern with the precision of the criteria: would different skilled evaluators come away with the same conclusions on being asked to apply the criterion? In the empirical domain, the emphasis shifts somewhat from the precision of the criteria to their accuracy: do they actually capture what makes a difference in the outcomes as experienced by users without special skills? We know we can't get precision in predicting the latter. There is always the nut behind the wheel. An alternative providing equivalent facilitation may be right there in front of them, plain as the nose on your face, and they still won't find it. But this is the bottom line, after all: what makes it work or not work for real people.

But there are some questions that we don't have to solve to move ahead on the WCAG support. For example, all the discussion about "at what point a web page has changed enough to invalidate a claim." That should not be an issue for the WCAG experiments. Every change that the experimenters make in a page creates a different test subject, because they are looking for which changes in the page lead to different outcomes and which don't. Even for pages that enter the test set when someone says they are good or bad and the experimenters decide they are interesting, for the purposes of the experiment it makes the most sense to grab a copy and freeze the experimental article just as it is, and not allow the experiment to be disturbed by changes in what is served at the same location from the original server.
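To make that concrete, here is a minimal sketch of what freezing an experimental article might look like, assuming a simple local snapshot store; the function name, directory, and metadata fields are illustrative assumptions, not part of EARL or any agreed format:

    # Hedged sketch: freeze a copy of a page so the experimental article
    # cannot be disturbed by later changes at the original server.
    # snapshot_page, the snapshots/ directory, and the metadata fields are
    # illustrative assumptions, not an agreed vocabulary.
    import hashlib
    import json
    import pathlib
    import urllib.request
    from datetime import datetime, timezone

    def snapshot_page(url: str, store: str = "snapshots") -> dict:
        """Fetch url once, save the bytes, and return a record that
        identifies this frozen copy as a distinct test subject."""
        body = urllib.request.urlopen(url).read()
        digest = hashlib.sha256(body).hexdigest()
        root = pathlib.Path(store)
        root.mkdir(exist_ok=True)
        copy_path = root / (digest + ".html")
        copy_path.write_bytes(body)
        record = {
            "source_url": url,                      # where the page came from
            "sha256": digest,                       # identifies this exact copy
            "retrieved": datetime.now(timezone.utc).isoformat(),
            "local_copy": str(copy_path),
        }
        (root / (digest + ".json")).write_text(json.dumps(record, indent=2))
        return record

An evaluation record can then point at the frozen copy (by its hash) rather than at the live URL, so later edits to what is served at that location do not disturb the experiment.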
In the EO group, last I knew, we had good contacts with a variety of consumer service organizations that could go to work for us recruiting user-testers to evaluate things in bottom-line terms. The experimenters, those groups, and the testers could use help from some automation which can steer people who are willing to participate to where their time is most needed.

This is partly mechanical stuff to fill out the distribution of evaluations of a given page under diverse situations, and partly manual priority setting for what area of web techniques the experimenters want to focus on at a given time. And then some simple reporting functionality to give the experimenters an overview of the results to date.

Al

PS: I also see lots of benefit from instrumentation on the client side that will tell you what actually happened in the episode. This is not just static stuff like what equipment the user is using -- it is dynamic stuff such as event logs and click streams. Being able to play that back or single-step through it would aid an analyst immensely in deciding what wrinkle to try next in the rest of the experiment.
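As a rough illustration of the playback the PS describes, a recorded event log could be stepped through one entry at a time; the JSON-lines format, field names, and log file name below are assumptions made for the sketch, not any existing instrumentation format:

    # Hedged sketch: single-step through a recorded client-side event log
    # (click stream). The file format and field names are assumptions.
    import json

    def step_through(log_path: str):
        """Yield one recorded event at a time so an analyst can replay
        what actually happened in the episode."""
        with open(log_path) as log:
            for line in log:
                # e.g. {"t": 12.4, "type": "click", "target": "a#skip-nav"}
                yield json.loads(line)

    if __name__ == "__main__":
        for event in step_through("session-0042.jsonl"):  # hypothetical log file
            print(f'{event["t"]:>8.1f}s  {event["type"]:<10} {event.get("target", "")}')
            input("next event? ")  # single-step: press Enter to advance

Even something this crude gives the analyst a way to see where an episode went wrong and decide what wrinkle to try next.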
Received on Sunday, 9 December 2001 22:29:11 UTC