- From: Glenn A. Adams <gadams@xfsi.com>
- Date: Fri, 12 Aug 2005 16:19:10 -0400
- To: "Deborah Dahl" <dahl@conversational-technologies.com>, "W3C Multimodal group" <w3c-mmi-wg@w3.org>
- Cc: "W3C Public TTWG" <public-tt@w3.org>
Dear Deborah and MMI WG,

Thank you for your comments [1] on the DFXP Last Call Working Draft [2]. The TT WG has concluded its review of your comments and has agreed upon the following responses. If you require any further follow-up, please do so no later than September 1, and please forward your follow-up to <public-tt@w3.org>.

Regards,
Glenn Adams
Chair, Timed Text Working Group

************************************************************************

Citations:

[1] http://lists.w3.org/Archives/Public/public-tt/2005Apr/0042.html
[2] http://www.w3.org/TR/2005/WD-ttaf1-dfxp-20050321/

************************************************************************

Comment - Issue #12 [1]; 25 Apr 2005 14:51:26 -0400

In response to your request [1] for Last Call comments on "Timed Text (TT) Authoring Format 1.0 - Distribution Format Exchange Profile (DFXP)", the Multimodal Interaction Working Group has reviewed the document from our perspective, in particular considering how timed text might be incorporated into multimodal applications.

The Multimodal Interaction Working Group does not have an objection, but rather an observation, to make about the Timed Text Working Group's Last Call working draft. Timed Text would be easier to use as part of multimodal interfaces if it had a means of handling external asynchronous events. Such events are the standard means of coordinating among modalities in multimodal situations.

Consider a multimodal interface that uses Timed Text and text-to-speech simultaneously to prompt the user, while using speech recognition to gather the user's response. Using ttp:timeBase, the text-to-speech output can be synchronized with the Timed Text display. However, when the user starts speaking, the multimodal interface would normally want to stop the text-to-speech playback and alter, if not stop, the Timed Text display to indicate that it is now listening to the user.
Obviously, the timing of the user's utterance cannot be known in advance, so the normal way to do this is to generate a 'speech-detected' or 'barge-in' event, which is then delivered to all the modalities, where it is caught by appropriate event handlers. (The event handler for text-to-speech would halt the current text-to-speech playback. A corresponding handler for Timed Text might flash the display, halt it, or change its colors.) In the current specification, there is no apparent way to handle this event in Timed Text markup.

This gap does not indicate an inherent weakness in the Timed Text specification, but we think that it will limit the usefulness of Timed Text in multimodal interfaces. If you would like more information about the overall multimodal architecture that we're envisioning as a potential container for timed text, you may find our MMI Architecture document useful [3]. We would be happy to discuss our observation in more detail if you have any questions or comments.

Response:

The TT WG believes this is more of a system integration or user agent behavior issue, outside of or above the timed text media type. If the clock to which timed text is synchronized were stopped by external means, then the timed text would effectively stop updating. We're therefore inclined to believe that external control of that clock is outside our scope.

************************************************************************
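The barge-in mechanism described in the comment above, a single asynchronous event broadcast to per-modality handlers, could be sketched roughly as follows. All names here (`Modality`, `dispatch`, the `"barge-in"` event string) are illustrative assumptions for this sketch, not part of DFXP or any W3C specification:

```python
# Hypothetical sketch of cross-modality barge-in handling.
# Nothing here is drawn from DFXP; it only illustrates the event
# delivery pattern the MMI WG's comment describes.

class Modality:
    """One output modality (e.g. TTS, timed-text display) with event handlers."""

    def __init__(self, name):
        self.name = name
        self.playing = True       # modality starts out presenting content
        self.handlers = {}        # event name -> handler callable

    def on(self, event, handler):
        """Register a handler for an external asynchronous event."""
        self.handlers[event] = handler

    def handle(self, event):
        """Invoke this modality's handler for the event, if one exists."""
        if event in self.handlers:
            self.handlers[event](self)


def dispatch(event, modalities):
    """Deliver one external event to every modality's registered handler."""
    for modality in modalities:
        modality.handle(event)


tts = Modality("text-to-speech")
timed_text = Modality("timed-text")

# The TTS handler halts playback; a timed-text handler might instead
# flash the display or change its colors, per the comment's example.
tts.on("barge-in", lambda m: setattr(m, "playing", False))
timed_text.on("barge-in", lambda m: setattr(m, "playing", False))

# When the recognizer detects the user speaking, the interface broadcasts
# a single 'barge-in' event; each modality reacts via its own handler.
dispatch("barge-in", [tts, timed_text])
```

The point of the pattern is that the event source (the speech recognizer) needs no knowledge of how each modality reacts; each modality supplies its own handler, which is precisely the hook the comment observes is missing from Timed Text markup.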
Received on Friday, 12 August 2005 20:19:16 UTC