
AW: AW: Goodness criteria

From: Kerstin Probiesch <k.probiesch@googlemail.com>
Date: Thu, 22 Mar 2012 16:05:44 +0100
To: "'Shadi Abou-Zahra'" <shadi@w3.org>
Cc: <public-wai-evaltf@w3.org>
Message-ID: <4f6b3fb8.2266b40a.5735.3d74@mx.google.com>
Hi Shadi,

I'll try my best.

The article mentioned is about goodness criteria in qualitative studies. (There is a long-running dispute over methods between researchers doing quantitative research and those doing qualitative research.) One can certainly discuss how much relevance those goodness criteria should have in _qualitative_ research.
 
But: evaluating websites is not qualitative. Evaluating websites belongs to the quantitative field, and in the quantitative field there is no question and no discussion at all about the relevance of reliability, objectivity, and validity.
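As an illustration (not part of the original message): in quantitative terms, inter-evaluator reliability is commonly expressed as Cohen's kappa, which corrects raw agreement between two raters for agreement expected by chance. The pass/fail ratings below are hypothetical, standing in for two evaluators' verdicts on the same ten success criteria:

```python
# Cohen's kappa: a standard quantitative measure of inter-evaluator
# reliability. The rating lists are hypothetical pass/fail verdicts
# from two evaluators on the same ten success criteria.
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters."""
    assert len(ratings_a) == len(ratings_b) and ratings_a
    n = len(ratings_a)
    # Observed agreement: fraction of items rated identically.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected chance agreement from each rater's marginal frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

evaluator_1 = ["pass", "pass", "fail", "pass", "fail",
               "pass", "fail", "fail", "pass", "pass"]
evaluator_2 = ["pass", "fail", "fail", "pass", "fail",
               "pass", "fail", "pass", "pass", "pass"]
print(round(cohens_kappa(evaluator_1, evaluator_2), 2))  # prints 0.58
```

A kappa near 1 would mean the two evaluators reach the same verdicts well beyond chance; a value around 0.58, as here, indicates only moderate agreement and is the kind of figure a reliability study of an evaluation methodology would report.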

Sorry, it seems my English today is worse than ever before.

Best

Kerstin








> -----Original Message-----
> From: Shadi Abou-Zahra [mailto:shadi@w3.org]
> Sent: Thursday, 22 March 2012 15:02
> To: Kerstin Probiesch
> Cc: public-wai-evaltf@w3.org
> Subject: Re: AW: Goodness criteria
> 
> Hi Kerstin,
> 
> I must admit that I have difficulty understanding your specific
> suggestion or request, despite having read it several times.
> 
> Would you mind rephrasing your comment more clearly?
> 
> Thanks,
>    Shadi
> 
> 
> On 22.3.2012 14:32, Kerstin Probiesch wrote:
> > Hi Detlev, all,
> >
> > evaluating websites or pages without a standardized methodology is,
> > for me, nearly worthless. Users and clients have not only an interest
> > in, but a right to, websites being tested with reliable tests, so
> > that accessibility means the same in every country and does not
> > depend on personal interpretation.
> >
> > Anyway. The link is very interesting and written from the
> > perspective of qualitative social research: "When it comes to
> > discussing goodness or quality criteria of the (qualitative) social
> > sciences…". The article was published at Forum: Qualitative Social
> > Research and is, I believe, part of the apologetic scientific
> > literature of qualitative researchers in the context of the dispute
> > over methods between scientists doing quantitative research and
> > those doing qualitative research.
> >
> > Nothing is "wrong" with that. Qualitative research is about people,
> > and typical methods are interviews (narrative, problem-centered, …),
> > for example in ethnographic field studies, where great researchers
> > like Malinowski did fundamental research, especially on participant
> > observation.
> >
> > Very interesting would be the results of qualitative interviews with
> > web developers about why some advocate accessibility and others do
> > not, or how they see their own knowledge about accessibility.
> >
> > But: evaluating websites is not qualitative social research.
> >
> > Kerstin
> >
> >> -----Original Message-----
> >> From: detlev.fischer@testkreis.de [mailto:detlev.fischer@testkreis.de]
> >> Sent: Thursday, 22 March 2012 12:40
> >> To: public-wai-evaltf@w3.org
> >> Subject: Goodness criteria
> >>
> >> Hi list,
> >>
> >> just a few words about Kerstin's request to bring goodness criteria
> >> into section 1.1 on scope.
> >>
> >> I'm not sure what this inclusion will add for those applying the
> >> methodology when conducting tests (or defining test procedures for
> >> others to follow).
> >>
> >> Here are my 2 cents on the three terms objectivity, validity,
> >> reliability:
> >>
> >> Objectivity
> >> Normally this refers to minimising individual (inter-evaluator)
> >> differences in observation or judgement.
> >> While we can objectively measure temperature, dimensions, etc. based
> >> on normative scales, there are several factors that make objectivity
> >> little more than an ideal that can be approached but never reached
> >> in website evaluation. Several aspects contribute to that:
> >>
> >> 1. Evaluators have different backgrounds and dispositions. One can
> >> try to minimise these differences through uniform curricula and
> >> training, and through dialogues aimed at a consensual adjustment of
> >> judgements in typical cases.
> >>
> >> 2. Web content out there is complex and often fails to fit the
> >> patterns described in documented techniques. There is nothing we
> >> can do about that :-)
> >>
> >> 3. The rating of Success Criteria is often not strictly independent
> >> of other SC. Instances can fail several SC at the same time, and
> >> context must be taken into account to judge instances. How that is
> >> done will often vary across evaluators.
> >>
> >> Validity
> >> The validity of an evaluation is ultimately the degree to which an
> >> evaluation result reflects the actual degree of accessibility across
> >> users with disabilities. So there is a strong temporal element here.
> >> The validity of assessments will depend, for example, on the current
> >> degree of accessibility support of techniques used to claim
> >> conformance. As the web changes and relevant accessibility
> >> techniques change with it, maintaining validity means maintaining
> >> the timeliness and relevance of the techniques and failures that
> >> operationalize the general success criteria (or, if a tester wants
> >> to avoid any reference to documented techniques, maintaining the
> >> knowledge of what is currently supported and what is not, or not
> >> yet).
> >> As WCAG-EM just references techniques maintained outside its scope,
> >> I wonder whether it is the right place to cover validity.
> >>
> >> Reliability
> >> Reliability seems to depend on several aspects:
> >>
> >> 1. the knowledge, diligence, and amount of time invested by the
> >> individual evaluator across all relevant steps
> >>
> >> 2. the degree of operationalization: the more prescriptive the test
> >> procedure, the higher the likelihood of replicability. As WCAG-EM
> >> will not (for good reasons) go into detail regarding tools or
> >> particular procedures based on tools, I doubt that WCAG-EM alone
> >> can safeguard replicability (which might be the job of more
> >> prescriptive procedures based on it)
> >>
> >> 3. the number of testers carrying out the same test (re-test,
> >> replicate) or the availability of additional quality assurance -
> >> again something probably to be defined beyond the scope of WCAG-EM
> >>
> >> As a last comment, I am not convinced that "goodness criteria are
> >> defined and internationally agreed in the scientific community"
> >> means that these are a given that can simply be referenced and
> >> taken for granted. This may be true for the hard sciences, but an
> >> evaluation is subject to many 'soft' social and contextual aspects.
> >> One should aim to keep these in check, but it is impossible to
> >> eliminate them entirely. Instead, they must be managed. Perhaps
> >> this article has some useful pointers:
> >>
> >> http://www.qualitative-research.net/index.php/fqs/article/view/919/2008
> >>
> >> Conclusion
> >> While I think mentioning the goodness criteria in the section on
> >> scope probably does no harm, I am not convinced that it will
> >> improve the way WCAG-EM is used. It could be useful, however, to
> >> give guidance on how to approach or improve the aims of
> >> objectivity, validity, and reliability in practical terms. Whether
> >> such guidance can be prescriptive for operational procedures based
> >> on WCAG-EM, I am not so sure about. Let's discuss...
> >>
> >> Best regards,
> >> Detlev
> >>
> >> --
> >> testkreis c/o feld.wald.wiese
> >> Borselstraße 3-7 (im Hof), 22765 Hamburg
> >>
> >> Mobil +49 (0)1577 170 73 84
> >> Tel +49 (0)40 439 10 68-3
> >> Fax +49 (0)40 439 10 68-5
> >>
> >> http://www.testkreis.de
> >> Consulting, testing, and training for accessible websites
> >>
> >>
> >>
> >
> >
> >
> >
> 
> --
> Shadi Abou-Zahra - http://www.w3.org/People/shadi/
> Activity Lead, W3C/WAI International Program Office
> Evaluation and Repair Tools Working Group (ERT WG)
> Research and Development Working Group (RDWG)
Received on Thursday, 22 March 2012 15:06:05 GMT
