- From: Marc Schroeder <marc.schroeder@dfki.de>
- Date: Thu, 09 Dec 2010 08:56:22 +0100
- To: Bjorn Bringert <bringert@google.com>
- CC: "Young, Milan" <Milan.Young@nuance.com>, Satish Sampath <satish@google.com>, Robert Brown <Robert.Brown@microsoft.com>, Dave Burke <daveburke@google.com>, public-xg-htmlspeech@w3.org
Hi Bjorn,

On 07.12.10 21:11, Bjorn Bringert wrote:
> The things that I hope the XG will deliver are:
>
> 1. A draft spec of a web app API for using a speech recognizer
> provided by the browser, with implementations in several browsers.
>
> 2. A draft spec of a web app API for using a speech synthesizer
> provided by the browser, with implementations in several browsers.
>
> 3. Requirements and change requests to other working groups or
> incubator groups to make sure that APIs such as Device, Audio and
> XmlHttpRequest work for network speech services. This is completely
> independent of 1 and 2. To ensure that the requested features are
> sufficient, there should be several demo systems using those APIs for
> speech.

I may be misunderstanding you, but to my mind there is an important link missing between your items 1+2 and 3: how do we make network speech services work via *the same API* as the browser's default speech service?

We have pointed out requirements which indicate that we want to allow this:

- FPR7. Web apps should be able to request speech service different from default.
- FPR12. Speech services that can be specified by web apps must include network speech services.

Now let's assume for the moment that we go for a <tts> element like you suggested, which extends HTMLMediaElement. With your items 1-3, how would I, as a web app author, use that <tts> element and tell it to get its speech from a TTS engine on the network?

In other words, in order for the web app to use a networked speech service rather than the built-in one, most of the markup / scripts should stay the same, and only the reference to the speech service should have to change (a rough sketch of what I mean is appended after my signature). I imagine the browser will have to facilitate this in some way, which would mean that we are *not* talking about a protocol just between the web app and the speech service...

Any thoughts?

Cheers,
Marc

--
Dr. Marc Schröder, Senior Researcher at DFKI GmbH
Coordinator EU FP7 Project SEMAINE http://www.semaine-project.eu
Project leader for DFKI in SSPNet http://sspnet.eu
Project leader PAVOQUE http://mary.dfki.de/pavoque
Associate Editor IEEE Trans. Affective Computing http://computer.org/tac
Editor W3C EmotionML Working Draft http://www.w3.org/TR/emotionml/
Portal Editor http://emotion-research.net
Team Leader DFKI TTS Group http://mary.dfki.de

Homepage: http://www.dfki.de/~schroed
Email: marc.schroeder@dfki.de
Phone: +49-681-85775-5303
Postal address: DFKI GmbH, Campus D3_2, Stuhlsatzenhausweg 3, D-66123 Saarbrücken, Germany

--
Official DFKI coordinates:
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Trippstadter Strasse 122, D-67663 Kaiserslautern, Germany
Geschaeftsfuehrung:
Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
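To make the point above concrete, here is a rough, purely illustrative sketch. It assumes a hypothetical serviceURI attribute on <tts> (not part of any proposal so far) and assumes <tts> extends HTMLMediaElement as suggested; variants (a) and (b) are alternatives, and only the reference to the speech service differs between them:

    <!-- (a) default: the browser's built-in synthesizer -->
    <tts id="greeting">Welcome back. Your train leaves at 14:32.</tts>

    <!-- (b) the same markup pointed at a network speech service;
         "serviceURI" is a hypothetical attribute, used here only to
         illustrate the idea that nothing else needs to change -->
    <tts id="greeting" serviceURI="https://tts.example.net/synthesize">
      Welcome back. Your train leaves at 14:32.
    </tts>

    <script>
      // Because <tts> would inherit from HTMLMediaElement, script use
      // could stay identical in both cases:
      document.getElementById("greeting").play();
    </script>

Whatever such a reference ends up looking like, the open question remains how the browser maps it onto an actual network speech service, i.e. what protocol sits between the browser and the service rather than just between the web app and the service.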
Received on Thursday, 9 December 2010 07:56:57 UTC