- From: Eric S. Johansson <esj@harvee.org>
- Date: Mon, 23 May 2011 16:26:02 -0400
- To: Bjorn Bringert <bringert@google.com>
- CC: "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
On 5/20/2011 11:07 AM, Bjorn Bringert wrote:
> On Fri, May 20, 2011 at 3
> It sounds like you want general APIs for accessing data in web apps.
> That sounds like a good idea, but doesn't really have very much to do
> with speech as far as I can tell. To make this a bit more concrete,
> perhaps you could propose some APIs that you would like web browsers
> to implement?

Apologies for taking so long to respond. There is a short answer and a long answer to your question. Short answer today.

The interface is relatively simple. It's a classic setter/getter plus a bidirectional event mechanism. The external application gets some values, sets some values, and receives an event notification if a watched value changes, or sends an event notification if it changes something. The user application provides three ways to "view" the data: the entire set of data, the data displayed, and the data selected.

In any accessibility interface, there are three components: the user application, the accessibility mechanism, and the interface bridge. The interface bridge is the conduit and conversion to/from presentation for user application data and the accessibility mechanism. The reason for this three-component split is political/economic. It minimizes the effort on the part of the application vendor and the accessibility mechanism vendor, and puts most of the responsibility for the bridge between the two on the end user. In practice I expect vendors, or an organization dedicated to accessibility, would supply a reference implementation that the end user could customize.

When the user types data into the user application, if the interface bridge is listening, it receives events telling it that the data has changed. If the accessibility mechanism is speech recognition, dictating some text injects the text, or some transformation of it, into the user application buffer.

Data changes are not the only events the interface bridge receives. When the user application first receives focus, it notifies the interface bridge of the event. The interface bridge would then use this information about what has focus to set up the accessibility interface with the right context, for example activating a grammar for the speech recognition engine.

That's basically it in a nutshell. There are lots of other things such as cursors, selection of focus, selecting regions, etc., but I wanted to get you the basic concept. I'll follow with a longer answer containing more detail in a few days.

I do want to address one point, which is how it ties into speech. As soon as you glue speech recognition into an application, you potentially eliminate its usefulness for accessibility. You would need to incorporate the equivalent of what I've described here into every application independently. But if you have an API in the application that the accessibility bridge can make use of, and let the accessibility bridge have the responsibility for speaking to the speech recognition engine, you potentially lower the cost of enabling an application and make it more customizable for the "statistical outlier" user.

Case in point: enabling every single application in Google Apps is going to be a big hairy deal. But if instead you made each one's public information available via some API (get("current_cursor_position", event_when_changed); see the sketch appended below), then you may be able to reuse the interface bridge to provide simple speech recognition capability and be able to build on that for greater levels of accessibility.

--- eric
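To make the setter/getter-plus-events idea a bit more concrete, here is a minimal TypeScript-flavoured sketch of the surface a user application might expose to the interface bridge. Everything in it is illustrative only: the names (UserAppAccess, watch, onFocus, buffer_insertion, the Recognizer interface) are assumptions made for the example, not part of any existing specification or proposal.

```typescript
// Rough sketch only; all names are hypothetical.

// The three "views" of the user application's data.
type View = "all" | "displayed" | "selected";

// What the user application exposes to the interface bridge.
interface UserAppAccess {
  // Classic getter/setter over named values (text, cursor position, ...).
  get(name: string): unknown;
  set(name: string, value: unknown): void;

  // Retrieve data under one of the three views.
  getData(view: View): string;

  // Application -> bridge: fire when a watched value changes.
  watch(name: string, onChange: (newValue: unknown) => void): void;

  // Fired when the application gains focus, so the bridge can load the
  // right context (e.g. activate a grammar in the recognition engine).
  onFocus(handler: (context: { appId: string }) => void): void;
}

// Hypothetical view of the accessibility mechanism (here, a recognizer).
interface Recognizer {
  activateGrammar(grammarName: string): void;
  onDictation(handler: (text: string) => void): void;
}

// The interface bridge wires the two together.
function connectBridge(app: UserAppAccess, recognizer: Recognizer): void {
  // Application -> accessibility mechanism: track what the user is editing.
  app.watch("current_cursor_position", (pos) => {
    console.log("cursor moved to", pos);
  });

  // Focus change: set up the right recognition context.
  app.onFocus(({ appId }) => {
    recognizer.activateGrammar(appId + "-commands");
  });

  // Accessibility mechanism -> application: dictated text (possibly
  // transformed by the bridge) is injected into the application buffer.
  recognizer.onDictation((text) => {
    app.set("buffer_insertion", text);
  });
}
```

The point of the sketch is that the application exposes only data and events; all knowledge of the speech engine lives in the bridge, which is what keeps the per-application cost low and lets the end user customize the bridge without touching the application.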
Received on Monday, 23 May 2011 20:26:58 UTC