- From: Eric S. Johansson <esj@harvee.org>
- Date: Wed, 08 Sep 2010 10:55:45 -0400
- To: Satish Sampath <satish@google.com>
- CC: Olli@pettay.fi, Dan Burnett <dburnett@voxeo.com>, public-xg-htmlspeech@w3.org
On 9/8/2010 10:26 AM, Satish Sampath wrote:
> Hi Olli,

I've been reading a bit of the background material, and I admit I still need to read a bit more. My first question is: how many of you live with speech recognition like I do? I should be living without a keyboard, but way too much of the world is hostile to those of us with upper extremity disorders. I'm also wondering who the target audience for these changes is. Is it TABs (the temporarily able-bodied), or can the disabled take care of themselves?

XML is one of the worst formats for data if you are disabled. It is something you would never try to speak using speech recognition, simply because of the hard-to-pronounce character sequences and the requirement that you speak one character at a time, usually in discrete utterance form, to make sure you get the right recognition. That is tantamount to speaking the keyboard. Speaking the keyboard is a well-known user mode imposed upon the disabled, with the end result that the disabled lose the ability to speak because it is so wearing on the throat.

This may be because I haven't read the background yet, but how do you accommodate editing? I'm sure I'll leave several speakos behind in this message, for no other reason than that it's impossible to correct them by speech and my hands are burned out enough that I don't want to bother navigating.

I should probably explain my background. I've been disabled for over 15 years, and a speech recognition user the entire time. I watched the industry progress from discrete utterance to large-vocabulary continuous recognition systems. I was part of the crew hoping to put together and organize the first generation of programming-by-voice systems: I organized the conferences and lined up talent to talk to us. I was part of the now-defunct open source speech recognition initiative, and I'm now trying desperately to figure out enough about natpython so that I can apply Select-and-Say features to pyscripter. Quite a challenge when your hands aren't functional.

Obviously I've learned quite a lot about what it's like to live with speech. I've learned that simple macro models are a dead end; they're only good for trivial stuff. Most command-and-control interfaces are also dead ends. For the most part, even with my hands as broken as they are, I don't use speech for command-and-control. It's too stressful to the throat. Similarly, "natural language" commands are pretty much useless: too hard to discover, dangerous with misrecognitions, and they set the wrong user expectations.

What you need to look at is creating "written speech" interfaces for more complicated operations. I forget the actual term, but a written speech interface is one where you dictate into some holding editor and then translate the dictated information into what you need. It gives you opportunities for correction you wouldn't otherwise have. It also lets you pause and think in the middle of an utterance. For example, you could be spelling out a change directory command. You forget where you are in the hierarchy, so when you pause, the system comes up with a prompt telling you what can be said, and then you speak it through to the end. While the command is in the miniature editor, you can change individual pathnames (because they are all spoken words), and even use additional commands like "remember this as <register name>" to store or retrieve the path. There's a rough sketch of this below.

Anyway, that's the direction I'm heading down with my speech user interface ideas. I'm hoping that I can nudge you down the path of making something usable by crips like me.
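To make that concrete, here is a rough sketch in Python of how such a holding editor might work. Everything in it is hypothetical (there is no recognizer behind it; utterances are fed in as plain strings, and the command words are ones I made up for illustration), but it shows the moves that matter: dictation accumulates in an editable buffer of ordinary spoken words, corrections and register store/recall happen in the buffer, and nothing executes until you explicitly say so.

    # Hypothetical sketch of a "written speech" holding editor.
    # No real recognizer here; utterances arrive as plain strings.
    registers = {}   # filled by "remember this as <name>", read by "recall <name>"
    buffer = []      # the holding editor: one spoken word per path element

    def handle_utterance(words):
        """Route one recognized utterance; return a shell command only on 'execute'."""
        if words[:3] == ["remember", "this", "as"]:
            registers[" ".join(words[3:])] = list(buffer)   # store the current path
        elif words and words[0] == "recall":
            buffer[:] = registers.get(" ".join(words[1:]), buffer)
        elif words == ["correct", "last"]:
            if buffer:
                buffer.pop()                                # fix one element by voice
        elif words == ["execute"]:
            return "cd /" + "/".join(buffer)                # commit only now
        else:
            buffer.extend(words)                            # keep dictating
        # On a pause, the system prompts with what can be said instead of guessing:
        print("buffer:", "/".join(buffer),
              "| say: remember this as ..., recall ..., correct last, execute")
        return None

    # Simulated dictation session:
    for utterance in ["home projects", "remember this as work tree",
                      "speech demo", "correct last", "demos", "execute"]:
        command = handle_utterance(utterance.split())
        if command:
            print("running:", command)

The point is that every element stays an ordinary spoken word until execution, so correction is cheap and a misrecognition never does anything irreversible.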
I highly recommend you throw away your keyboards and use NaturallySpeaking for everything. It's the current gold standard. It's quite enlightening.