WCAG support scenario

At 09:08 PM 2001-12-08, Wendy A Chisholm wrote:
>http://www.w3.org/WAI/ER/2001/12/05-minutes.html
>
>I tried to summarize the issues and resolutions.  Please let me know if there are any corrections or additions.
>

Thank you for the work you put into the summary.  I find it very helpful.

Highlights:

WCAG:  This is an important application scenario, if we can pull it off.  The WCAG group would benefit a lot from the right kind of tool support, and working in that context will do a lot to clarify EARL issues.

Of course there are more things worth doing than we can actually do.  There has to be some agreement with the WCAG group on how the process is going to run, to scope out which functions cry out for tool support.

I put my first round of response on this item (why I asked "can we have a session on the WCAG application?") in a reply to Gregg's question about thoughts on test cases on the WCAG (a.k.a. GL) list.

<http://lists.w3.org/Archives/Public/w3c-wai-gl/2001OctDec/0510.html>

The bottom line here is that the job of the experience-gathering is to explore the frontier between good and bad outcomes, and to fit one or more frontiers defined in terms of readily evaluable prognostics to that outcomes frontier.

We have a starting point -- a candidate collection of prognostics.  Now we need to move into empirical mode, where we see just how necessary or sufficient [subsets of] these are, and what additional or alternate criteria suggest themselves from the actual pattern of outcomes.

So far in the group process, there has been a lot of concern with the precision of the criteria -- would different skilled evaluators come away with the same conclusions when asked to apply the criterion?  In the empirical domain, the emphasis shifts somewhat from the precision of the criteria to their accuracy.  Do they actually capture what makes a difference in the outcomes as experienced by users without special skills?
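
To make that concrete, here is a minimal sketch -- nothing the group has agreed on, and all the names and data are made up -- of how a prognostic's verdicts could be scored for accuracy against the bottom-line outcomes gathered from user-testers:

def score_prognostic(check, pages, outcomes):
    """Fraction of test pages where the prognostic's verdict matches the
    outcome real users experienced (True = users succeeded).  This measures
    accuracy against the bottom line, as distinct from precision (whether
    two skilled evaluators apply the check the same way)."""
    agreements = sum(1 for name, page in pages.items() if check(page) == outcomes[name])
    return agreements / len(pages)

# Hypothetical readily-evaluable prognostic: every image carries alt text.
def has_alt_text(page):
    return all(img.get("alt") for img in page["images"])

# Toy data: two frozen test pages and the outcomes reported by user-testers.
pages = {
    "page-a": {"images": [{"src": "logo.gif", "alt": "Acme logo"}]},
    "page-b": {"images": [{"src": "chart.gif"}]},
}
outcomes = {"page-a": True, "page-b": False}

print(score_prognostic(has_alt_text, pages, outcomes))

The precision question is a separate exercise: run two evaluators' readings of the same check over the same pages and measure how often they agree with each other, rather than with the outcomes.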

We know we can't get precision in predicting the latter.  There is always the nut behind the wheel.  An alternative providing equivalent facilitation may be there right in front of them, plain as the nose on your face, and they still won't find it.  But this is the bottom line, after all: what makes it work or not work for real people.

But there are some questions that we don't have to solve to move ahead on the WCAG support.  For example, all the discussion about "at what point a web page has changed enough to invalidate a claim."  That should not be an issue for the WCAG experiments.  Every change that the experimenters make in a page creates a different test subject, because they are looking for which changes in a page lead to different outcomes and which don't.  Even for pages that enter the test set when someone says they are good or bad and the experimenters decide they are interesting, for the purposes of the experiment it makes the most sense to grab a copy and freeze the experimental article just as it is, so the experiment is not disturbed by changes in what is served at the same location by the original server.
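
The freezing step itself is cheap.  As a minimal sketch -- assuming nothing more than dated local copies, with a made-up directory layout -- something like this would do:

import hashlib
import urllib.request
from datetime import date
from pathlib import Path

def freeze_page(url, archive_dir="frozen-test-pages"):
    """Fetch the page once and store the bytes plus a hash, so later edits
    on the live server cannot disturb the experiment."""
    with urllib.request.urlopen(url) as response:
        body = response.read()
    digest = hashlib.sha1(body).hexdigest()
    target = Path(archive_dir) / f"{date.today()}-{digest[:8]}.html"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(body)
    return target, digest

The hash gives the experimenters a stable name for "this exact article," independent of whatever the original URL serves next week.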

In the EO group, last I knew, we had good contacts with a variety of consumer service organizations that could go to work for us recruiting user-testers to evaluate things in bottom-line terms.  The experimenters, those groups, and the testers could all use help from some automation that steers people who are willing to participate to where their time is most needed.  This is partly mechanical stuff, to fill out the distribution of evaluations of a given page under diverse situations, and partly manual priority-setting for which area of web techniques the experimenters want to focus on at a given time.  And then some simple reporting functionality to give the experimenters an overview of the results to date.
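
The mechanical part of that steering could be very simple.  Here is a minimal sketch, with hypothetical situations and no agreed data format, of handing the next willing tester the least-covered (page, situation) combination, with room for the experimenters' manual priorities:

from itertools import product

def next_assignment(pages, situations, counts, priorities=None):
    """Return the (page, situation) pair with the fewest evaluations so far,
    breaking ties toward whatever the experimenters have prioritized."""
    priorities = priorities or {}
    def need(cell):
        return (counts.get(cell, 0), -priorities.get(cell, 0))
    return min(product(pages, situations), key=need)

pages = ["page-a", "page-b"]
situations = ["screen reader", "no images", "keyboard only"]
counts = {("page-a", "screen reader"): 3, ("page-a", "no images"): 1}
print(next_assignment(pages, situations, counts))

The reporting side is just the same table read the other way: show the experimenters the counts and outcomes per cell.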

Al

PS:  I also see lots of benefit from instrumentation on the client side that will tell you what actually happened in the episode.  This is not just static stuff like what equipment the user is using -- it is dynamic stuff such as event logs and click streams.  Being able to play that back, or single-step through it, would aid an analyst immensely in deciding what wrinkle to try next in the rest of the experiment.
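
As a minimal sketch of the playback half of that idea -- the log format here is made up, not an agreed shape for EARL or anything else -- an analyst could single-step a captured click stream like this:

import json

sample_log = """
[{"t": 0.0, "event": "load", "target": "page-b"},
 {"t": 4.2, "event": "click", "target": "img#chart"},
 {"t": 9.7, "event": "click", "target": "a#text-version"}]
"""

def single_step(log_json):
    """Print one recorded event at a time, pausing for the analyst between steps."""
    for entry in json.loads(log_json):
        print(f'{entry["t"]:6.1f}s  {entry["event"]:<6} {entry["target"]}')
        input("next> ")

# single_step(sample_log)  # uncomment to step through interactively

The capture side would live in the client, of course; this only shows what the analyst does with the log afterward.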

Received on Sunday, 9 December 2001 22:29:11 UTC