HTML.next: Web Speech API [was: Agenda for TAG F2F 4-6 January 2012]

For the HTML.next agenda item, there's relevant additional work that's
recently been published: the HTML Speech Incubator Group's final report --
specifically the "Web Speech API proposal" part:

  http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/#speechwebapi

That part proposes new features to enable Web apps to use speech to
interact with users -- both through speech recognition and speech synthesis
(text to speech).

Specifically, it proposes the following (sketched in the examples below):

* a new <reco> element for speech input (recognition) in a user interface
* a new <tts> element for synthesized speech output (TTS audio stream)
* a Web Speech API for JavaScript control of speech recognition and synthesis
* a Web Speech Protocol (WebSockets-based) for use with remote speech services
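
To make the markup pieces concrete, here's a minimal sketch. The <reco>
and <tts> element names come straight from the report, but the specific
attributes and content shown (a label-style "for" binding on <reco>,
inline text content on <tts>) are just illustrative, not verbatim from
the proposal:

  <!-- speech input: a reco element bound to a text field;
       the "for"-style binding is an assumption -->
  <input id="city" type="text">
  <reco for="city">Speak your destination city</reco>

  <!-- speech output: a tts element whose text content gets
       synthesized; text-as-content is an assumption -->
  <tts>Welcome. Where would you like to fly today?</tts>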

The fundamental use cases: within Web apps, users can speak to fill in
forms, control page navigation, and so on (speech input), and Web apps can
speak information to users (speech output -- synthesized speech, not
pre-recorded audio).
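
As a rough script-level illustration of the form-filling case (the report
defines a SpeechInputRequest-style scripting interface, but the exact
method, event, and property names below are approximations, not verbatim):

  // create a recognition request and fill a form field with the
  // top recognition hypothesis when a result comes back
  var req = new SpeechInputRequest();
  req.onresult = function (event) {
    // "event.result.utterance" is an assumed shape for the result
    document.getElementById("city").value = event.result.utterance;
  };
  req.start();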

The actual speech behavior is provided by speech recognition and synthesis
services that the browser interacts with. The idea is that browsers can
have settings for default speech services, which may be remote services or
"local" services built into the browser or into the OS/platform/device the
browser runs on. But the APIs also give Web apps enough flexibility and
control to override those defaults by specifying alternate speech services
(any services that support the Web Speech Protocol).
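
Extending the script sketch above, overriding the default might look
something like this; the report uses a serviceURI-style property for this,
though the exact spelling here and the URL are placeholders:

  // point recognition at an alternate remote speech service
  // (one speaking the WebSockets-based Web Speech Protocol);
  // the wss: URL is a hypothetical placeholder
  var req = new SpeechInputRequest();
  req.serviceURI = "wss://speech.example.net/reco";
  req.start();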

  --Mike

-- 
Michael[tm] Smith
http://people.w3.org/mike/+
