Sound similarities from by way of Harvey Bingham on 1999-11-10 (w3c-wai-ua@w3.org from October to December 1999)

From: by way of Harvey Bingham <hbingham@acm.org>
Date: Wed, 10 Nov 1999 13:34:33 -0500
To: w3c-wai-ua@w3.org
Message-Id: <4.2.0.58.19991110133025.00a02510@pop.tiac.net>

Steve Anderson, of Dragon Systems, raises an issue we haven't particularly
addressed: sound similarities.

Jon,

    Thanks for the invitation to participate.  I am not the right person at
Dragon to do so, but am trying to figure out who would be.  There is great
interest here in making the web more navigable by speech.

     Having said that I'm not qualified to comment, let me plow ahead anyway:

    The needs of speech developers will be partially met by making all content
have a text equivalent.  As you point out, this enables screen readers to
access it via speech synthesis.  However, text written with speech synthesis
in mind might not be good for speech recognition.  For example, if two icons
have tags "Picture B" and "Picture D", a good speech synthesizer would have no
trouble.  But the "e" set of letters (b, d, e, etc.) is notoriously hard for a
recognition system to distinguish because the letters sound so similar.  Not a
great example, perhaps, but you get the idea.
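
A rough illustration of the "Picture B" / "Picture D" problem, as a minimal
Python sketch. It is not anything Dragon or a real recognizer does: the helper
name confusable_pairs and the E_SET letter list are assumptions made up for
this example, and a real system would compare phonetic transcriptions rather
than spellings. It only flags pairs of labels that differ by a single
letter-word drawn from the "e" set.

```python
from itertools import combinations

# Assumed "e" set: letters whose spoken names rhyme with "ee" in American
# English. This list is illustrative, not taken from any recognizer.
E_SET = set("bcdegptvz")


def confusable_pairs(labels):
    """Return label pairs that differ in exactly one word, where both
    differing words are single letters from the assumed "e" set."""
    pairs = []
    for a, b in combinations(labels, 2):
        words_a, words_b = a.lower().split(), b.lower().split()
        if len(words_a) != len(words_b):
            continue
        diffs = [(x, y) for x, y in zip(words_a, words_b) if x != y]
        if len(diffs) == 1 and all(
            len(w) == 1 and w in E_SET for w in diffs[0]
        ):
            pairs.append((a, b))
    return pairs


print(confusable_pairs(["Picture B", "Picture D", "Picture F"]))
# prints [('Picture B', 'Picture D')]
```

An authoring tool could run a check like this over a page's text equivalents
and suggest more distinct names for any flagged pair.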

     The other thought that comes to mind is that audio files could be
automatically transcribed by a speech recognition system.  In the section
"Continuous Equivalent Track" developers are encouraged to provide transcripts,
but no mention is made that these transcripts might be generated on the fly.
It might help if audio clips were labeled as to whether they contain speech,
and if so, in what language.  (I suppose other side info could be useful too).
This side information could speed up the speech recognizer's job.
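
One way to picture the labeling idea, as a hedged Python sketch. The
AudioClipInfo record and the recognizer_plan function are hypothetical names
invented here, and the returned strings are placeholders rather than any real
recognizer's interface. The sketch only shows how the two pieces of side
information (contains speech, language) could let a recognizer skip work.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioClipInfo:
    """Hypothetical side information attached to an audio clip."""
    contains_speech: bool
    language: Optional[str] = None  # e.g. "en"; None when not labeled


def recognizer_plan(info: AudioClipInfo) -> str:
    """Decide what an on-the-fly transcriber would need to do."""
    if not info.contains_speech:
        return "skip"  # music or sound effects: nothing to transcribe
    if info.language is None:
        return "detect-language"  # extra step the label would have avoided
    return "transcribe:" + info.language


print(recognizer_plan(AudioClipInfo(contains_speech=True, language="en")))
# prints transcribe:en
```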

     Steve

Received on Wednesday, 10 November 1999 13:35:23 UTC