AW: Evaluator Errors from Kerstin Probiesch on 2012-01-20 (public-wai-evaltf@w3.org from January 2012)

From: Kerstin Probiesch <k.probiesch@googlemail.com>
Date: Fri, 20 Jan 2012 09:53:02 +0100
To: <public-wai-evaltf@w3.org>
Message-ID: <4f192b50.84310e0a.3f7d.0f0d@mx.google.com>
Hi Vivienne, all,

Is the paper available somewhere on the web?

I really think that we should discuss errors in detail, regardless of
whether we decide for 100% conformance or not. At the moment we are
discussing only "errors/findings on pages", but other errors can have an
impact on the result. If an evaluator makes a mistake or overlooks
something, it is a mistake of the evaluator. This should not happen, of
course, but it can happen; nobody is perfect. The evaluator might also mix
up accessibility and usability, which can produce an error as well.

But this is not the main point I'm concerned about. Other aspects can
produce errors. Some can happen in the preliminary phase, where an evaluator
selects the pages. Let's take 10 pages, just as an example: 10 pages on a
small website is something different from 10 pages on a huge website. On a
very small website which has only 10 pages in total, there will be no
sampling error. The more pages a website has, the more likely sampling
errors are.
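
To make the "10 pages" example concrete, here is a rough sketch, not part
of any agreed methodology. It assumes the pages are drawn as a simple random
sample and that the share of failing pages is estimated with a normal
approximation plus finite population correction. The function name, the
95% z-value and the worst-case proportion of 0.5 are illustrative
assumptions, and the approximation is crude for samples as small as 10.

```python
import math


def sampling_margin_of_error(sample_size, total_pages, proportion=0.5, z=1.96):
    """Approximate margin of error for a failure rate estimated from a
    simple random sample of pages drawn without replacement.

    Illustrative assumptions: normal approximation, random page selection.
    """
    if sample_size < 1 or total_pages < 1:
        raise ValueError("sample_size and total_pages must be at least 1")
    if sample_size >= total_pages:
        # Every page is checked, so nothing is left to sampling.
        return 0.0
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    # Finite population correction: the error shrinks as the sample
    # covers a larger share of the site.
    correction = math.sqrt((total_pages - sample_size) / (total_pages - 1))
    return z * standard_error * correction


# The same 10-page sample on sites of different sizes.
for total in (10, 20, 100, 10000):
    moe = sampling_margin_of_error(10, total)
    print(f"10 of {total} pages: +/- {moe:.1%}")
```

With these assumptions the margin is zero when all 10 pages of a 10-page
site are checked and grows as the site gets larger. It only covers random
sampling error; a biased page selection by the evaluator is not captured.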

Just one example, as a kind of thought experiment (I think I mentioned it
in one of my mails at the very beginning of our work): An evaluator has to
check the conformance of the websites of different political parties. Every
evaluator of course has his own political opinion. What could happen, and I
don't say that it will happen, is that an evaluator is sampling the pages of
the party he belongs to. Two errors are possible: perhaps the evaluator
will not dig so deeply, or perhaps the opposite happens, because the
evaluator knows that there is a correlation between the page sampling and
the result.

The fewer pages a sample has and the more pages the tested website has,
the more likely it is that errors like this can happen. Perhaps the
evaluator likes the layout, perhaps not. And so on. The evaluator need not
be conscious of all of these aspects. We should give guidance to prevent
them and of course reduce the possibility of errors like this in the
methodology itself. I think we need a list and descriptions of possible
evaluator errors and their possible impact on the result, regardless of
whether they are highly probable, fairly likely or unlikely, because we
don't know how likely they really are. And of course, to be clear: I don't
want to insinuate that evaluators actually do this.

This is also just one reason why I prefer not to rely only on pages.
Perhaps we need a sample of X pages, perhaps not. If we decide on page
sampling there must be, I think, further testing of elements as an integral
part of the methodology: X tables, X forms, and of course SC 1.3.3 is
important. Violations of 1.3.3 in particular can easily be overlooked in the
preliminary phase of the sampling. I think this testing of elements is
important not just for the result of the conformance claim itself but also
for the prevention of sampling errors, and it will bring more reliability.

And just to pick up an issue from the other thread: I belong to the '100%
conformance group', not out of naivety and not out of black-and-white
thinking, but in order to reduce errors and to keep the overall margin of
error, as a conglomerate of different small or not so small errors which can
arise during the whole evaluation process, as low as possible.

Best

Kerstin


> -----Original Message-----
> From: Vivienne CONWAY [mailto:v.conway@ecu.edu.au]
> Sent: Friday, 20 January 2012 08:12
> To: Velleman, Eric; Detlev Fischer; Kerstin Probiesch
> Cc: public-wai-evaltf@w3.org
> Subject: Evaluator Errors
> 
> Hi all
> 
> Giorgio Brajnik wrote a paper that dealt with the issue of evaluator
> errors. He found that 3 expert evaluators putting their results
> together could find all the errors, and that it would take 14 novices
> to come to the same conclusion.  So, if only one evaluator is looking
> at the site, there are bound to be some omissions.  Again, it all comes
> down to how much money the website owner is willing to part with to get
> their site evaluated.  It takes much more money for 3 people to
> evaluate a website than 1.
> 
> 
> Regards
> 
> Vivienne L. Conway, B.IT(Hons), MACS CT
> PhD Candidate & Sessional Lecturer, Edith Cowan University, Perth, W.A.
> Director, Web Key IT Pty Ltd.
> v.conway@ecu.edu.au
> v.conway@webkeyit.com
> Mob: 0415 383 673
> 
> ________________________________________
> From: Velleman, Eric [evelleman@bartimeus.nl]
> Sent: Friday, 20 January 2012 6:44 AM
> To: Detlev Fischer; Kerstin Probiesch
> Cc: public-wai-evaltf@w3.org
> Subject: RE: AW: Discussion 5.5
> 
> Hi Kerstin, Detlev,
> 
> This is also an interesting margin of error.
> The evaluator making mistakes.
> This is an interesting thing to look at when we talk about
> replicability. This would indicate that there is a margin of error from
> the evaluators that influences replicability depending on the size of
> the sample. Wiki indicates that it decreases with a larger sample.
> 
> Do we accept errors by evaluators?
> Kindest regards,
> 
> Eric
> 
> ________________________________________
> From: Detlev Fischer [fischer@dias.de]
> Sent: Thursday, 19 January 2012 22:29
> To: Kerstin Probiesch
> Cc: public-wai-evaltf@w3.org
> Subject: Re: AW: Discussion 5.5
> 
> Hi Kerstin,
> 
> Whoops, I may have been on the wrong track. I guess what you refer to
> describes uncertainty in *attestation*: evaluator's errors, omissions,
> or misjudgements, not error = the pin-downable flaws that we find in
> evaluating web sites. So maybe 'error' is best used exclusively as a
> term to describe variance in the evaluation process? But then, wasn't
> the term "margin of error" used in the context of marginal flaws that
> might be acknowledged without preventing the attestation of
> conformance? Not sure anymore, it's too late - must go back to the
> discussion...
> 
> Regards,
> Detlev
> 
> 
> 
> Quoting Kerstin Probiesch <k.probiesch@googlemail.com>:
> 
> > Hi Detlev,
> >
> > "error margin" or "margin of error" is a term used in Test Development.
> > Some hints here:
> > http://www.linguee.com/english-german?query=margin+of+error&source=english.
> > Some further explanations here:
> > http://en.wikipedia.org/wiki/Margin_of_error.
> >
> > Regs
> >
> > Kerstin
> >
> >> -----Original Message-----
> >> From: Detlev Fischer [mailto:fischer@dias.de]
> >> Sent: Thursday, 19 January 2012 21:16
> >> To: public-wai-evaltf@w3.org
> >> Subject: RE: Discussion 5.5
> >>
> >> Hi Tim,
> >>
> >> English isn't my first language, but doesn't 'error' indicate that
> >> someone basically knew how to do something and erred (not always the
> >> case with the problems we encounter)? Maybe 'flaw' is the more
> >> accurate term? 'Failure instance' sounds pretty stilted and may easily
> >> get mixed up with (WCAG) Failures.
> >>
> >> Mhm..(scratching head)...mhm.
> >>
> >> Quoting "Boland Jr, Frederick E." <frederick.boland@nist.gov>:
> >>
> >> > According to some references I recently accessed, criticality
> >> > implies that the evaluation cannot continue until the problem has
> >> > been resolved, whereas non-criticality implies that the evaluation
> >> > may proceed with the problem noted.
> >> >
> >> > A definition of "error" (from
> >> > http://dictionary.reference.com/browse/error?s=t
> >> > ) "a deviation from accuracy or correctness"
> >> > -which would seem to apply to "barrier" as well?
> >> >
> >> > A definition of "barrier" (from
> >> > http://dictionary.reference.com/browse/barrier?s=t
> >> > ) "anything built or serving to bar passage"
> >> > -which would seem to imply criticality as mentioned previously
> >> >
> >> >
> >> > -----
> >> >
> >> > In many cases, distinguishing between critical and non-critical is easy.
> >> > A keyboard trap or a lightbox dialogue that pops up without screen
> >> > reader users becoming aware of it is a critical violation. A graphical
> >> > navigation element without alt text is one as well. But a few missing
> >> > paragraphs or list tags in editorial content are probably non-critical.
> >> > However, there will be a grey area where the distinction is not so easy.
> >> > But that, in my view, should not lead to the conclusion that the
> >> > distinction cannot or must not be made.
> >> >
> >> > Not sure about terms, though. Is 'error' a good term for non-critical
> >> > violations and 'barrier' a good term for critical violations?
> >> >
> >> > Detlev
> >>
> >> --
> >> ---------------------------------------------------------------
> >> Detlev Fischer PhD
> >> DIAS GmbH - Daten, Informationssysteme und Analysen im Sozialen
> >> Management: Thomas Lilienthal, Michael Zapp
> >>
> >> Phone: +49-40-43 18 75-25
> >> Mobile: +49-157 7-170 73 84
> >> Fax: +49-40-43 18 75-19
> >> E-Mail: fischer@dias.de
> >>
> >> Address: Schulterblatt 36, D-20357 Hamburg
> >> Hamburg District Court, HRB 58 167
> >> Managing directors: Thomas Lilienthal, Michael Zapp
> >> ---------------------------------------------------------------
> >
> >
> 
> 
> 
> 
Received on Friday, 20 January 2012 08:53:05 UTC