- From: Michael[tm] Smith <mike@w3.org>
- Date: Sat, 31 Dec 2011 11:45:32 +0900
- To: Noah Mendelsohn <nrm@arcanedomain.com>
- Cc: "www-tag@w3.org" <www-tag@w3.org>, Amy van der Hiel <amy@w3.org>, Philippe Le Hegaret <plh@w3.org>, Norm Walsh <ndw@nwalsh.com>, Mark Nottingham <mnot@mnot.net>
For the HTML.next agenda item, there's relevant additional work that was recently published: the HTML Speech Incubator Group's final report -- specifically the "Web Speech API proposal" part:

  http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/#speechwebapi

That part proposes new features to enable Web apps to use speech to interact with users -- both through speech recognition and speech synthesis (text to speech). Specifically, it proposes the following:

  * a new <reco> element for speech input (recognition) in a user interface
  * a new <tts> element for synthesized speech output (TTS audio stream)
  * a Web Speech API for JS script control of speech recognition and synthesis
  * a Web Speech Protocol (WebSockets-based) for use with remote speech services

The fundamental use cases are essentially: users can speak to fill forms, control page navigation, and so on within Web apps (speech input), and Web apps can speak information to users (speech output -- synthesized speech, not pre-recorded audio).

The actual speech behavior is provided by speech recognition and synthesis services that the browser interacts with. The idea is that browsers can have settings for default speech services -- which may be remote services, or "local" services built into the browser or into the OS/platform/device the browser runs on -- but the APIs also provide enough flexibility and control that Web apps can override those defaults by specifying alternate speech services (ones that support the Web Speech Protocol).

  --Mike

--
Michael[tm] Smith
http://people.w3.org/mike/+
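To make the feature summary above concrete, here is a minimal sketch of markup and script a Web app might use under the proposal. The <reco> and <tts> element names and their roles are taken from the report summary; the attribute names ("for", "serviceuri"), the "result" event and its "result" property, and the example service URL are assumptions made for illustration, not text from the proposal.

  <!DOCTYPE html>
  <!-- Hypothetical sketch: <reco> and <tts> are the proposed elements;
       the attribute, event, and property names used here are assumed
       for illustration, not definitions quoted from the report. -->
  <form>
    <label>City: <input id="city" name="city" type="text"></label>
    <!-- Speech input: recognition results would fill the bound field. -->
    <reco for="city"></reco>
  </form>

  <!-- Speech output: a synthesized (not pre-recorded) audio stream. -->
  <tts>Your search returned three results.</tts>

  <script>
    // Script control via the proposed Web Speech API -- names assumed.
    var reco = document.querySelector("reco");
    // A page could point the element at an alternate remote speech
    // service (one speaking the WebSockets-based Web Speech Protocol)
    // instead of the browser's default service; the URL is made up.
    reco.setAttribute("serviceuri", "wss://speech.example.org/reco");
    reco.addEventListener("result", function (event) {
      // React to recognition output here: fill form fields, trigger
      // page navigation, and so on.
      console.log("recognized:", event.result);
    });
  </script>

The point of the sketch is the division of labor described above: the markup declares where speech input and output attach to the page, while the script-level API is where an app overrides the browser's default speech service with a remote one.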
Received on Saturday, 31 December 2011 02:45:39 UTC