- From: David Workman <workmad3@gmail.com>
- Date: Thu, 3 Dec 2009 13:16:25 +0000
I agree. The application should be able to choose a source for speech
commands, or give the user a choice of options for a speech source. It
also provides a much better separation of APIs, allowing the development
of a speech API that doesn't depend on or interfere in any way with the
development of a microphone/audio input device API.

2009/12/3 Diogo Resende <dresende at thinkdigital.pt>

> I agree 100%. Still, I think the access to the mic and the speech
> recognition could be separated.
>
> --
> Diogo Resende <dresende at thinkdigital.pt>
> ThinkDigital
>
> On Thu, 2009-12-03 at 12:06 +0000, Bjorn Bringert wrote:
> > On Wed, Dec 2, 2009 at 10:20 PM, Jonas Sicking <jonas at sicking.cc> wrote:
> > > On Wed, Dec 2, 2009 at 11:17 AM, Bjorn Bringert <bringert at google.com> wrote:
> > >> I agree that being able to capture and upload audio to a server would
> > >> be useful for a lot of applications, and it could be used to do speech
> > >> recognition. However, for a web app developer who just wants to
> > >> develop an application that uses speech input and/or output, it
> > >> doesn't seem very convenient, since it requires server-side
> > >> infrastructure that is very costly to develop and run. A
> > >> speech-specific API in the browser gives browser implementors the
> > >> option to use on-device speech services provided by the OS, or
> > >> server-side speech synthesis/recognition.
> > >
> > > Again, it would help a lot if you could provide use cases and
> > > requirements. This helps both with designing an API, as well as with
> > > evaluating whether the use cases are common enough that a dedicated
> > > API is the best solution.
> > >
> > > / Jonas
> >
> > I'm mostly thinking about speech web apps for mobile devices. I think
> > that's where speech makes most sense as an input and output method,
> > because of the poor keyboards, small screens, and frequent hands/eyes
> > busy situations (e.g. while driving).
> > Accessibility is the other big reason for using speech.
> >
> > Some ideas for use cases:
> >
> > - Search by speaking a query
> > - Speech-to-speech translation
> > - Voice Dialing (could open a tel: URI to actually make the call)
> > - Dialog systems (e.g. the canonical pizza ordering system)
> > - Lightweight JavaScript browser extensions (e.g. Greasemonkey /
> >   Chrome extensions) for using speech with any web site, e.g. for
> >   accessibility.
> >
> > Requirements:
> >
> > - Web app developer side:
> >   - Allows both speech recognition and synthesis.
> >   - Easy to use API. Makes simple things easy and advanced things
> >     possible.
> >   - Doesn't require the web app developer to develop / run his own
> >     speech recognition / synthesis servers.
> >   - (Natural) language-neutral API.
> >   - Allows developer-defined application-specific grammars / language
> >     models.
> >   - Allows multilingual applications.
> >   - Allows easy localization of speech apps.
>
> > - Implementor side:
> >   - Easy enough to implement that it can get wide adoption in browsers.
> >   - Allows the implementor to use either client-side or server-side
> >     recognition and synthesis.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.whatwg.org/pipermail/whatwg-whatwg.org/attachments/20091203/777ea2f0/attachment.htm>
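[Editor's sketch: the requirements above suggest a small, grammar-aware recognizer object. The following is a minimal illustration of the shape such an API might take — every name here (`SpeechRecognizer`, `addGrammar`, `onresult`, `recognize`) is hypothetical and predates any standardized interface; the recognition engine is mocked so the developer-facing flow is concrete.]

```javascript
// Hypothetical sketch of a grammar-aware speech recognizer interface.
// In a real browser this would be backed by an on-device or server-side
// recognizer; here "audio" is simulated by a transcript string.
class SpeechRecognizer {
  constructor(lang) {
    this.lang = lang;     // language tag, e.g. "en-US" (multilingual apps
                          // would create one recognizer per language)
    this.grammars = [];   // developer-defined application-specific phrases
    this.onresult = null; // result callback, keeping simple things simple
  }

  addGrammar(phrases) {
    this.grammars.push(...phrases);
  }

  // Mock recognition: match the simulated audio against the registered
  // grammar, preferring the longest (most specific) phrase.
  recognize(simulatedAudio) {
    const hit = this.grammars
      .slice()
      .sort((a, b) => b.length - a.length)
      .find((p) => simulatedAudio.toLowerCase().includes(p.toLowerCase()));
    if (this.onresult) {
      this.onresult({ transcript: hit || null, confidence: hit ? 0.9 : 0 });
    }
    return hit || null;
  }
}

// Usage: the canonical pizza-ordering dialog from the thread.
const rec = new SpeechRecognizer("en-US");
rec.addGrammar(["order pizza", "call", "search"]);
rec.onresult = (r) => console.log("heard:", r.transcript);
rec.recognize("I would like to order pizza please"); // heard: order pizza
```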
Received on Thursday, 3 December 2009 05:16:25 UTC