- From: Young, Milan <Milan.Young@nuance.com>
- Date: Thu, 18 Nov 2010 17:33:34 -0800
- To: <public-xg-htmlspeech@w3.org>
- Message-ID: <1AA381D92997964F898DF2A3AA4FF9AD09630FE6@SUN-EXCH01.nuance.com>
Hello,

On the Nov 18th conference call, I volunteered to send out proposed wording for a new requirement:

Summary - User agents and speech services are required to support at least one common protocol.

Description - A common protocol will be defined as part of the final recommendation. It will be built upon some TBD existing application-layer protocol and include support for the following:

* Streaming audio data (e.g. HTTP 1.1 chunking). This includes both audio streamed from the UA to the SS during recognition and audio streamed from the SS to the UA during synthesis.
* Bidirectional events that can occur at any time during the interaction. These events could originate either within the web app (e.g. click) or the SS (e.g. start-of-speech or mark) and must be transmitted through the UA in a timely fashion. The set of events includes both standard events defined by the final recommendation and extension events.
* Both standard and extension parameters passed from the web app to the speech service at the start of the interaction. List of standard parameters TBD.
* EMMA results passed from the SS to the web app. The syntax of this result is TBD (e.g. XML and/or JSON).
* At least one standard audio codec. UAs are permitted to advertise alternate codecs at the start of the interaction, and SSs are allowed to select any such alternate (e.g. HTTP Accept).
* Transport-layer security (e.g. HTTPS) if requested by the web app.
* A session identifier that can be used to provide continuity across multiple interactions (e.g. HTTP cookies).
* Interpretation over text (i.e. obtaining an interpretation from a text string rather than audio).
* Re-recognition using previous audio streams.

Rough, non-normative sketches of what such a protocol might look like on the wire are included as postscripts below.

Thank you
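PS - To make the streaming and event requirements more concrete, here is a rough sketch of a recognition interaction. It assumes a WebSocket transport and JSON event framing purely for illustration; the actual application-layer protocol, endpoint, parameter names, and event vocabulary are all TBD, as noted above.

```typescript
// Hypothetical wire format: interleaved binary audio frames and JSON event
// frames over a WebSocket. Everything here (endpoint, field names, events)
// is illustrative only; the real protocol is TBD.
type SpeechEvent =
  | { type: "start-of-speech"; timestamp: number }
  | { type: "mark"; name: string; timestamp: number }
  | { type: "result"; emma: string }; // EMMA payload; XML vs. JSON is TBD

function startRecognition(ws: WebSocket, audioChunks: AsyncIterable<ArrayBuffer>): void {
  ws.onopen = async () => {
    // Standard and extension parameters sent at the start of the interaction.
    ws.send(JSON.stringify({
      parameters: {
        lang: "en-US",                     // example standard parameter (list TBD)
        "x-vendor-grammar": "slots.grxml", // example extension parameter
      },
      // The UA advertises the codecs it supports; the SS may select any of them.
      codecs: ["audio/x-wav", "audio/amr"],
    }));
    // Audio is streamed while the user is still speaking, not after the fact.
    for await (const chunk of audioChunks) {
      ws.send(chunk);
    }
    ws.send(JSON.stringify({ type: "end-of-audio" }));
  };

  ws.onmessage = (msg: MessageEvent) => {
    // Events may arrive at any time during the interaction, in either direction.
    const event: SpeechEvent = JSON.parse(msg.data);
    if (event.type === "start-of-speech") {
      console.log("SS detected speech at", event.timestamp);
    } else if (event.type === "result") {
      console.log("EMMA result:", event.emma); // handed back to the web app
      ws.close();
    }
  };
}
```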
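PS2 - And a similarly hypothetical sketch of session continuity and re-recognition over plain HTTP: the web app asks the SS to re-run recognition against audio the SS retained from an earlier interaction, with an HTTP cookie supplying the session identifier. The endpoint, body format, and media types are assumptions, not proposals.

```typescript
// Hypothetical re-recognition request. The SS is assumed to retain the audio
// from a prior interaction under a session keyed by an HTTP cookie; the
// endpoint and request body shown here are illustrative only.
async function reRecognize(grammarUri: string): Promise<string> {
  const response = await fetch("https://speech.example.com/rereco", {
    method: "POST",
    credentials: "include", // browser attaches the SS session cookie (continuity)
    headers: {
      "Content-Type": "application/json",
      // The UA states which result syntax it can consume; XML vs. JSON is TBD.
      "Accept": "application/emma+xml",
    },
    body: JSON.stringify({ grammar: grammarUri }), // new grammar, old audio
  });
  return response.text(); // EMMA document with the new interpretation
}
```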
Received on Friday, 19 November 2010 01:34:08 UTC