Re: Aim and impact of random sampling

Hi, Eric.

I have no documented data (I suppose I could go through all our reports 
and extract something, but it would be a hard job, since the random 
pages were not specifically marked as such). However, as far as I can 
remember, in all the evaluations I've performed in my five years at 
Technosite the same types of barriers were repeated across all pages of 
the sample, or at least across a significant number of them.

The sample size was normally 30 pages. Usually the first 20-25 were 
selected manually using one of the "structured" methods, and then 5-10 
random pages were added to complete the 30. I must admit that these 
"random" pages were not always truly random: they were mostly chosen by 
"random clicks", although sometimes we used WGET to download about 500 
pages and selected some genuinely random pages from that set.
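
Just to illustrate the WGET approach (only a sketch, not our actual 
tooling; the "crawl" directory and the sample of 10 pages are made-up 
values), the random selection could be as simple as:

    import random
    from pathlib import Path

    # Assumes WGET has already mirrored the site into ./crawl
    # (hypothetical path), leaving roughly 500 HTML files on disk.
    crawled_pages = sorted(Path("crawl").rglob("*.html"))

    # Pick up to 10 pages uniformly at random to complete the 30-page sample.
    random_sample = random.sample(crawled_pages, k=min(10, len(crawled_pages)))

    for page in random_sample:
        print(page)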

My experience is the same as what Detlev mentioned: the random pages 
(the last third of the sample) were not significantly different from 
the structured pages, since most of the problems are repeated. Even 
when there are page-specific barriers, they are usually already covered 
by 2 or 3 of the structured pages.

Regards,
Ramón.

Eric wrote:

> @Detlev: I see your point, but wouldn't this only work if there is a re-test?
> 
> @Ramon: Do you have data to support the conclusion that no significant change in the results will be obtained if the sample includes random pages? That would be a good input for our discussion. 
> 
> Kindest regards,
> 
> Eric
> 
> ________________________________________
> From: Detlev Fischer [detlev.fischer@testkreis.de]
> Sent: Thursday, 24 January 2013 21:26
> To: Ramón Corominas
> CC: public-wai-evaltf@w3.org
> Subject: Re: Aim and impact of random sampling
> 
> It is important to ensure that clients make their entire site accessible because they do not know exactly which pages will be tested. But setting up the rule (once proposed by Léonie, I believe) that in any re-test after remedial action some pages are replaced by other pages would do the same trick. No need for randomness here.
> 
> For all cases of testing where we will not find 100% conformance (the overwhelming majority of sites, in our experience), having extra random pages as a verification exercise wouldn't make much difference - these pages would usually just reveal further instances of SC failures that already occur elsewhere. The verification aim Eric alluded to in his mail would mainly apply to those rare sites that are picture-perfect paragons of full compliance.
> 
> On 24 Jan 2013, at 20:55, Ramón Corominas wrote:
> 
>> Although I did not use the words "optional/mandatory", I also commented in the survey that some Euracert partners will probably dislike the idea of having to include more pages (= more time and resources), since they consider that the initial structured sampling is enough in most cases (that is, no significant change in the results would be obtained).
>>
>> We at Technosite include the "random" part just because the website is evaluated over time, and we make it clear to the clients that the sample will not always be the same, so they will have to apply the accessibility criteria to the whole website. However, I agree that our "method" of selecting random pages is certainly not very scientific.
>>
>> In any case, I assume that the "filter the sample" step should be enough to eliminate the problem of time/resources. However,
>>
>> My vote: it should be an optional step.
>>
>> Regards,
>> Ramón.
>>
>> Aurélien wrote:
>>
>>> +1, that is the sense of the comment I made on the survey. I think this needs to be an option.
>>>
>>> Detlev wrote:
>>>
>>>> The assumption has been that an additional random sample will make sure that a tester's initial sampling of pages has not left out pages that may expose problems not present in the initial sample.
>>>>
>>>> That aim in itself is laudable, but for this to work, the sampling would need to be:
>>>>
>>>> 1. Independent of individual tester choices (i.e., automatic) -
>>>>   which would need a definition, inside the methodology, of a
>>>>   valid approach for truly random sampling. No one has even hinted at
>>>>   a reliable way to do that - I believe there is none.
>>>>   A mere calculation of sample size for a desired level of confidence
>>>>   would need to be based on the total number of a site's pages *and*
>>>>   page states - a number that will usually be unknown (see the
>>>>   sketch after this list).
>>>>
>>>> 2. Fairly representative not just of pages, but also of page states.
>>>>   But crawling a site to derive a collection of URLs for
>>>>   random sampling is not doable, since many states (and their URLs or
>>>>   DOM states) only come about as a result of human input.
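>>>>
>>>> (A rough illustration of that point, not a proposal: with the
>>>> standard sample-size formula for a given confidence level, the
>>>> required number of pages depends directly on the total number N of
>>>> pages and page states - exactly the figure that is usually unknown.
>>>> The sketch below uses made-up numbers.)
>>>>
>>>> import math
>>>>
>>>> def required_sample_size(N, z=1.96, p=0.5, e=0.10):
>>>>     # Standard sample size for confidence level z and margin of error e,
>>>>     # corrected for a finite population of N pages and page states.
>>>>     n0 = (z ** 2) * p * (1 - p) / (e ** 2)      # infinite-population size
>>>>     return math.ceil(n0 / (1 + (n0 - 1) / N))   # finite-population correction
>>>>
>>>> # Made-up example: 50,000 pages/states, 95% confidence, 10% margin of error
>>>> print(required_sample_size(50000))  # -> 96, well beyond a 30-page sample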
>>>>
>>>> I hope I am not coming across as a pest if I say again that in my opinion, we are shooting ourselves in the foot if we make random sampling a mandatory part of the WCAG-EM. Academics will be happy, practitioners working to a budget will just stay away from it.

Received on Thursday, 24 January 2013 22:53:59 UTC