Re: reCAPTCHA: audio alternatives and audio formats

From: Ben Maurer <bmaurer@andrew.cmu.edu>
Date: Tue, 17 Jul 2007 10:10:11 -0700 (PDT)
To: "Gregory J. Rosmaita" <oedipus@hicom.net>
cc: wai-xtech@w3.org, Al Gilman <Alfred.S.Gilman@IEEE.org>, Colin McMillen <mcmillen@cs.cmu.edu>
Message-ID: <Pine.LNX.4.64-044.0707171001010.17669@unix33.andrew.cmu.edu>

Hey,

On Tue, 17 Jul 2007, Gregory J. Rosmaita wrote:

> aloha, all!
>
> concerning the cascade of aural CAPTCHA equivalents, i have pondered
> the issue of baseline audio formats for quite some time, but can't
> find the post i was preparing on the topic, so i'll just simply ask:
> what is the cascade order for aural filetypes in the real world
> today?
>
>   .ogg
>   .mp3
>   .au

AFAIK, no browser has support for a cascade like this. Doesn't that make 
the topic somewhat moot?

> 1) is .au the baseline for web delivered audio content?

.wav would be a good baseline. We initially considered serving up .wav 
reCAPTCHAs, however:

1. Bandwidth was potentially prohibitive (25 KB mp3 vs 250 KB wav)
2. Disk space was also prohibitive
3. mp3 was universal enough for us.
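
To put the bandwidth point in numbers, here is a rough back-of-the-envelope 
sketch. The 25 KB and 250 KB sizes are the ones quoted above; the 
challenges-per-day figure is a made-up assumption for illustration only, 
not a real reCAPTCHA number.

```python
MP3_KB = 25    # per-challenge size quoted above
WAV_KB = 250   # per-challenge size quoted above

# Hypothetical load, chosen only to make the arithmetic concrete.
CHALLENGES_PER_DAY = 100_000


def daily_transfer_gb(size_kb, challenges_per_day):
    """Total daily transfer in GB, using decimal units (1 GB = 1,000,000 KB)."""
    return size_kb * challenges_per_day / 1_000_000


mp3_gb = daily_transfer_gb(MP3_KB, CHALLENGES_PER_DAY)
wav_gb = daily_transfer_gb(WAV_KB, CHALLENGES_PER_DAY)
ratio = WAV_KB / MP3_KB

print(f"mp3: {mp3_gb:.1f} GB/day, wav: {wav_gb:.1f} GB/day, ratio: {ratio:.0f}x")
```

Whatever the real load is, the ratio stays at 10x, and the same factor 
applies to disk space.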

> 2) is .ogg widely enough supported to retain first place in the cascade?

There's no support for Ogg in default Windows and Mac installs.

> 4) are .mp3 files capable of being played using the user agent or
> operating system's default sound renderer?

AFAIK, browsers play all audio files with the support of plugins. 
Ogg, mp3 and wav are equal in this sense.

> has any thought gone into serving up pure sounds (dog bark, duck quack,
> train whistle) to defeat the voice-recognition?  this would involve a
> quick lookup for the appropriate answer in lang="x" or, through
> content negotiation serve up a version of the CAPTCHA that corresponds
> to the requesting user agent's language preferences, which would make
> the verification look-up faster...

Some issues here:

- We would still need quite a bit of distortion & background noise
- There are multiple answers (dog barking -- bark, dog, dog bark, woof)
- If multi-word answers are used, segmentation of the answers becomes 
hard.
- Much harder to translate
- Need to do spelling correction
- Requires better understanding of the language at hand
- Need to acquire a large number of sounds.
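
To illustrate the "multiple answers" and "spelling correction" points, here 
is a minimal sketch of what the answer check might look like. It is not how 
reCAPTCHA works; the ACCEPTED table, the sound id, and the 0.8 similarity 
cutoff are all hypothetical, and the fuzzy matching uses Python's standard 
difflib module as a stand-in for real spelling correction.

```python
import difflib

# Hypothetical answer table: each sound maps to every answer we would accept.
ACCEPTED = {
    "dog_bark": {"bark", "dog", "dog bark", "woof"},
}


def normalize(text):
    """Lowercase and collapse whitespace so 'Dog  Bark' matches 'dog bark'."""
    return " ".join(text.lower().split())


def is_correct(sound_id, response, cutoff=0.8):
    """Accept an exact match or a close misspelling of any accepted answer.

    The cutoff is an assumed value; a looser one lets more typos through
    but also makes guessing easier.
    """
    answers = ACCEPTED[sound_id]
    guess = normalize(response)
    if guess in answers:
        return True
    close = difflib.get_close_matches(guess, sorted(answers), n=1, cutoff=cutoff)
    return bool(close)
```

Even this toy version needs a hand-built answer set per sound and per 
language, which is where the translation and language-understanding costs 
come from.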

-b
Received on Tuesday, 17 July 2007 17:10:32 GMT