Re: Goodness criteria

Hi Detlev, all,

evaluating websites or pages without a standardized methodology is, in my view, nearly worthless. Users and clients alike have not only an interest in, but a right to, websites being tested with reliable tests, so that accessibility means the same thing in every country and does not depend on personal interpretation.

Anyway, the link is very interesting and written from the perspective of qualitative social research: "When it comes to discussing goodness or quality criteria of the (qualitative) social sciences…". The article was published in Forum: Qualitative Social Research and is, I believe, part of the apologetic literature that qualitative researchers have produced in the context of the methods dispute between quantitative and qualitative researchers.

Nothing is "wrong" with that. Qualitative research is about people, and its typical methods are interviews (narrative, problem-centred, …), for example in ethnographic field studies, where great researchers like Malinowski did fundamental work, especially on participant observation.

The results of qualitative interviews with web developers would be very interesting: why do some advocate accessibility while others do not? How do they assess their own knowledge of accessibility?

But evaluating websites is not qualitative social research.

Kerstin

> -----Original Message-----
> From: detlev.fischer@testkreis.de [mailto:detlev.fischer@testkreis.de]
> Sent: Thursday, 22 March 2012 12:40
> To: public-wai-evaltf@w3.org
> Subject: Goodness criteria
> 
> Hi list,
> 
> just a few words about Kerstin's request to bring goodness criteria
> into section 1.1 on scope.
> 
> I'm not sure what this inclusion will add for those applying the
> methodology when conducting tests (or defining test procedures for
> others to follow).
> 
> Here are my 2 cents on the three terms objectivity, validity,
> reliability:
> 
> Objectivity
> Normally this refers to minimising individual (inter-evaluator)
> differences in observation or judgement.
> While we can objectively measure temperature, dimensions, etc. against
> normative scales, in website evaluation objectivity is little more
> than an ideal that can be approached but never reached. Several
> aspects contribute to that:
> 
> 1. Evaluators have different backgrounds and dispositions. One can try
> to minimise these differences through uniform curricula and training,
> and through dialogues aimed at a consensual adjustment of judgements
> in typical cases.
> 
> 2. Web content out there is complex and often fails to fit the patterns
> described in
> documented techniques. There is nothing we can do about that :-)
> 
> 3. The rating of Success Criteria is often not strictly independent of
> other SC. Instances can fail several SC at the same time, and context
> must be taken into account to judge instances. How that is done will
> often vary across evaluators.
> 
> Validity
> The validity of an evaluation is ultimately the degree to which the
> evaluation result reflects the actual degree of accessibility across
> users with disabilities. So there is a strong temporal element here.
> The validity of assessments will depend, for example, on the current
> degree of accessibility support of the techniques used to claim
> conformance. As the web changes and relevant accessibility techniques
> change with it, maintaining validity means maintaining the timeliness
> and relevance of the techniques and failures that operationalize the
> general success criteria (or, if a tester wants to avoid any reference
> to documented techniques, maintaining the knowledge of what is
> currently supported and what is not, or not yet).
> As WCAG-EM just references techniques maintained outside its scope, I
> wonder whether it is the right place to cover validity.
> 
> Reliability
> Reliability seems to depend on several aspects:
> 
> 1. the knowledge, diligence and amount of time invested by the
> individual evaluator across all relevant steps
> 
> 2. the degree of operationalization: the more prescriptive the test
> procedure, the higher the likelihood of replicability. As WCAG-EM will
> not (for good reasons) go into detail regarding tools or particular
> tool-based procedures, I doubt that WCAG-EM alone can safeguard
> replicability (which might be the job of more prescriptive procedures
> based on it)
> 
> 3. the number of testers carrying out the same test (re-test,
> replicate), or the availability of additional quality assurance -
> again something probably to be defined beyond the scope of WCAG-EM
> 
> As a last comment, I am not convinced that "goodness criteria are
> defined and internationally agreed in the scientific community" means
> that these are a given that can simply be referenced and taken for
> granted. This may be true for the hard sciences, but an evaluation is
> subject to many 'soft' social and contextual aspects. One should aim
> to keep these in check, but it is impossible to eliminate them
> entirely; instead, they must be managed. Perhaps this article has some
> useful pointers:
> 
> http://www.qualitative-research.net/index.php/fqs/article/view/919/2008
> 
> Conclusion
> While I think mentioning the goodness criteria in the section on scope
> probably does no harm, I am not convinced that it will improve the way
> WCAG-EM is used. It could be useful, however, to give guidance on how
> to approach or improve the aims of objectivity, validity and
> reliability in practical terms. Whether such guidance can be
> prescriptive for operational procedures based on WCAG-EM, I am not so
> sure. Let's discuss...
> 
> Best regards,
> Detlev
> 
> --
> testkreis c/o feld.wald.wiese
> Borselstraße 3-7 (im Hof), 22765 Hamburg
> 
> Mobil +49 (0)1577 170 73 84
> Tel +49 (0)40 439 10 68-3
> Fax +49 (0)40 439 10 68-5
> 
> http://www.testkreis.de
> Consulting, testing and training for accessible websites
> 
> 
> 

Received on Thursday, 22 March 2012 13:32:22 UTC