Re: 'Anti Robot' Registrations from Charles McCathieNevile on 2004-01-25 (w3c-wai-ig@w3.org from January to March 2004)

From: Charles McCathieNevile <charles@sidar.org>
Date: Sun, 25 Jan 2004 01:27:32 +0100
To: Matthew Smith <matt@kbc.net.au>
Cc: WAI Interest Group <w3c-wai-ig@w3.org>
Message-Id: <43DF5CF8-4ECD-11D8-8D4B-000A958826AA@sidar.org>

Hi Matthew,

There are a number of strategies that could be combined. Essentially 
the problem is one of security, and with the possible exception of 
codes developed through quantum computing technology (which really is 
currently vapourware - there are a handful of research projects, but 
you can't buy or build a system at home), security systems can all be 
cracked.

Based on the club-lock theory (don't worry about the overall problem, 
just make yourself harder to crack than the next guy) one would predict 
that natural language processing for this particular problem would be 
solved pretty fast. Actually character recognition is improving to the 
point where the particular algorithms applied for these images can be 
defeated. The problem is not just that the images have no alt text, but 
that they are obscured to defeat OCR software. The software is better 
than some people's vision. And one of the two is getting better.
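
To make the point about the ordinal-word question concrete, here is a rough 
sketch in Python of how little a robot would need in order to answer it. 
The names ORDINALS and solve are made up for this illustration, and it 
assumes the prompt contains a plain English ordinal and that words are 
separated by whitespace. It is not taken from any real robot.

```python
import re

# Hypothetical lookup table: English ordinals mapped to 1-based positions.
ORDINALS = {
    "first": 1, "second": 2, "third": 3, "fourth": 4, "fifth": 5,
    "sixth": 6, "seventh": 7, "eighth": 8, "ninth": 9, "tenth": 10,
}

def solve(prompt, sentence):
    """Return the word the prompt asks for, or None if it cannot be worked out."""
    pattern = r"\b(" + "|".join(ORDINALS) + r")\b"
    match = re.search(pattern, prompt.lower())
    if match is None:
        return None  # no recognisable ordinal in the prompt
    position = ORDINALS[match.group(1)]
    words = sentence.split()
    if position > len(words):
        return None  # sentence is shorter than the requested position
    return words[position - 1]

print(solve("please select the fifth word from the following sentence",
            "The quick brown fox jumps over the lazy dog"))
```

Rewording the prompt would break a matcher this naive, but each new wording 
only costs the robot's author another pattern.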

One could posit a range of questions, to require more effort from the 
programmers. Of course it also requires more effort from the person who 
just wants to send email, register to use online communities, or what 
have you.

The PF group managed to publish a document on this topic recently: 
http://www.w3.org/TR/2003/WD-turingtest-20031105/

I think it is over-optimistic, but it's only a working draft. Comments 
are invited to the xtech list, which is publicly archived at 
http://lists.w3.org/Archives/Public/wai-xtech

Cheers

Chaals

On 25 Jan 2004, at 00:33, Matthew Smith wrote:

> I have seen on this list, on more than one occasion, discussions 
> about the technique used by Yahoo and others to prevent 'robot' 
> registrations by presenting a graphic of a word or number that has to 
> be keyed into a form.
>
> As there is no appropriate alt text to these images (which would 
> defeat the point, making them machine-readable), this obviously 
> constitutes an accessibility problem.
>
> This may have been done before, but what I propose for such an 
> application is this:
>
> A sentence is selected at random, say from a book (Project Gutenberg 
> text?) or a series of random words.
>
> An input field with the message "please select the fifth word from the 
> following sentence" is displayed, followed by the sentence/words.  It 
> would probably be useful to put an anchor by the input field so that 
> the user could skip back to it easily once they had read the 
> sentence/words.
>
> I can see that a 'robot' could be programmed to recognise ordinals, 
> but by varying the text, it would make it harder for 'robot' 
> programmers to make sense of what is being asked.
>
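
The scheme proposed above could be sketched roughly as follows, in Python. 
This is only an illustration under stated assumptions: the function names 
(make_challenge, check_answer, normalise) are invented here, the caller is 
assumed to supply a list of non-empty sentences (for instance lines taken 
from a public-domain text), and answers are compared ignoring case and 
surrounding punctuation.

```python
import random

# Ordinals used in the prompt; position 0 in a sentence is the "first" word.
ORDINALS = ["first", "second", "third", "fourth",
            "fifth", "sixth", "seventh", "eighth"]

def make_challenge(sentences, rng=random):
    """Pick a sentence and a word position; return (prompt, sentence, answer)."""
    sentence = rng.choice(sentences)
    words = sentence.split()
    # Only ask for positions that exist and that we have an ordinal for.
    index = rng.randrange(min(len(words), len(ORDINALS)))
    prompt = ("Please type the " + ORDINALS[index]
              + " word from the following sentence:")
    return prompt, sentence, words[index]

def normalise(word):
    """Strip surrounding punctuation and ignore case."""
    return word.strip(".,;:!?\"'()").lower()

def check_answer(expected, given):
    """Compare the user's answer with the expected word, leniently."""
    return normalise(expected) == normalise(given)

prompt, sentence, answer = make_challenge(
    ["It was the best of times, it was the worst of times."])
print(prompt)
print(sentence)
```

Because the prompt and the sentence are both ordinary text, nothing here 
depends on an image, which is the accessibility gain being sought.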
--
Charles McCathieNevile                          Fundación Sidar
charles@sidar.org                                http://www.sidar.org

Received on Saturday, 24 January 2004 19:29:20 UTC