- From: Deborah Dahl <dahl@conversational-technologies.com>
- Date: Mon, 25 Apr 2005 14:51:26 -0400
- To: <public-tt@w3.org>
- Cc: "W3C Multimodal group" <w3c-mmi-wg@w3.org>
Dear Timed Text Working Group,

In response to your request [1] for Last Call comments on "Timed Text (TT) Authoring Format 1.0 - Distribution Format Exchange Profile (DFXP)" [2], the Multimodal Interaction Working Group has reviewed the document from our perspective, in particular considering how timed text might be incorporated into multimodal applications.

The Multimodal Interaction Working Group has no objection to the Timed Text Group's last call working draft, but we do have an observation. Timed Text would be easier to use as part of multimodal interfaces if it had a means of handling external asynchronous events. Such events are the standard means of coordinating among modalities in multimodal applications.

Consider a multimodal interface that uses Timed Text and text-to-speech simultaneously to prompt the user, while using speech recognition to gather the user's response. Using ttp:timeBase, the text-to-speech output can be synchronized with the Timed Text display. However, when the user starts speaking, the multimodal interface would normally want to stop the text-to-speech play and alter, if not stop, the Timed Text display to indicate that it is now listening to the user. Obviously, the timing of the user's utterance can't be known in advance, so the normal way to handle this is to generate a 'speech-detected' or 'barge-in' event, which is then delivered to all the modalities, where it is caught by appropriate event handlers. (The event handler for text-to-speech would halt the current text-to-speech play. A corresponding handler for Timed Text might flash the display, halt it, or change its colors.) In the current specification, there is no apparent way to handle such an event in Timed Text markup. This gap does not indicate an inherent weakness in the Timed Text specification, but we think that it will limit the usefulness of Timed Text in multimodal interfaces.
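To make the scenario concrete, here is a minimal sketch of the event-dispatch pattern described above. All names (TTSModality, TimedTextModality, dispatch, on_barge_in) are hypothetical illustrations, not part of the Timed Text or MMI specifications: a single 'barge-in' event is delivered to every modality, and each modality's handler reacts in its own way.

```python
class TTSModality:
    """Hypothetical text-to-speech output component."""

    def __init__(self):
        self.playing = True  # prompt is being spoken

    def on_barge_in(self):
        # The TTS handler halts the current text-to-speech play.
        self.playing = False


class TimedTextModality:
    """Hypothetical Timed Text display component."""

    def __init__(self):
        self.state = "showing"

    def on_barge_in(self):
        # A Timed Text handler might flash the display, halt it, or
        # change its colors; here we simply mark it as halted.
        self.state = "halted"


def dispatch(event, modalities):
    """Deliver an asynchronous event to every registered modality.

    Each modality that defines an on_<event> handler catches the event.
    """
    for modality in modalities:
        handler = getattr(modality, "on_" + event.replace("-", "_"), None)
        if handler is not None:
            handler()


# The speech recognizer detects the user speaking and raises the event:
tts = TTSModality()
tt = TimedTextModality()
dispatch("barge-in", [tts, tt])
```

After dispatch, the text-to-speech play has been halted and the Timed Text display has reacted, even though the timing of the user's utterance was not known in advance. The point of the sketch is that the markup (or a host environment) needs a place to attach such handlers.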
If you would like more information about the overall multimodal architecture that we're envisioning as a potential container for timed text, you may find our MMI Architecture document useful [3]. We would be happy to discuss our observation in more detail if you have any questions or comments.

Best regards,

Debbie Dahl, MMI WG Chair

[1] Request for Last Call Comments: http://lists.w3.org/Archives/Member/chairs/2005JanMar/0118.html
[2] TT Authoring Format: http://www.w3.org/TR/2005/WD-ttaf1-dfxp-20050321/
[3] MMI Architecture: http://www.w3.org/TR/2005/WD-mmi-arch-20050422/
Received on Monday, 25 April 2005 18:51:46 UTC