- From: by way of Harvey Bingham <hbingham@acm.org>
- Date: Wed, 10 Nov 1999 13:34:33 -0500
- To: w3c-wai-ua@w3.org
Steve Anderson, of Dragon Systems, raises an issue we haven't particularly addressed: sound similarities.

Jon,

Thanks for the invitation to participate. I am not the right person at Dragon to do so, but am trying to figure out who would be. There is great interest here in making the web more navigable by speech.

Having said that I'm not qualified to comment, let me plow ahead anyway. The needs of speech developers will be partially met by making all content have a text equivalent. As you point out, this enables screen readers to access it via speech synthesis. However, text written with speech synthesis in mind might not be good for speech recognition. For example, if two icons have tags "Picture B" and "Picture D", a good speech synthesizer would have no trouble, but the "E" set of letters (b, d, e, etc.) is notoriously hard for a recognition system to distinguish because they sound so similar. Not a great example, perhaps, but you get the idea.

The other thought that comes to mind is that audio files could be automatically transcribed by a speech recognition system. In the section "Continuous Equivalent Track", developers are encouraged to provide transcripts, but no mention is made that these transcripts might be generated on the fly. It would help if audio clips were labeled as to whether they contain speech and, if so, in what language (I suppose other side information could be useful too). This side information could speed up the speech recognizer's job.

Steve
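The "Picture B"/"Picture D" problem above could be caught mechanically when authoring labels. Here is a minimal sketch (not from the original message; the function name and the exact confusability rule are illustrative assumptions) that flags a pair of labels differing only in an E-set letter:

```python
# Sketch only: flag label pairs a speech recognizer could confuse.
# The single-position rule is a simplification for illustration.
E_SET = set("bcdegptvz")  # letters whose spoken names rhyme with "ee"

def confusable(label_a, label_b):
    """True if the labels differ in exactly one character position
    and both differing characters are E-set letters."""
    a, b = label_a.lower(), label_b.lower()
    if len(a) != len(b):
        return False
    diffs = [(x, y) for x, y in zip(a, b) if x != y]
    return len(diffs) == 1 and all(c in E_SET for c in diffs[0])

print(confusable("Picture B", "Picture D"))  # True: b vs d, both E-set
print(confusable("Picture B", "Picture M"))  # False: m is not E-set
```

An authoring tool could run such a check over all link and icon labels on a page and suggest more distinctive wording.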
Received on Wednesday, 10 November 1999 13:35:23 UTC