- From: JOHNSTON, MICHAEL J (MICHAEL J) <johnston@research.att.com>
- Date: Thu, 2 Oct 2008 15:56:32 -0400
- To: <www-multimodal@w3.org>
- Message-ID: <0C50B346CAD5214EA8B5A7C0914CF2A401365B67@njfpsrvexg3.research.att.com>
Many thanks for your support of EMMA. The specific comments you bring up have been discussed in detail by the EMMA subgroup, and it has formulated the following responses. Could you please confirm on the public list, www-multimodal@w3.org, whether this resolution of the issues is acceptable.

2.1 Recommend clarifying the spec on the semantics of start and end times for text input

RESPONSE: We agree that this should be clarified, but would like to defer it to a later version of EMMA. A number of issues need to be considered, for example whether the semantics of timing differ for typed text, cut-and-paste text, and text input from a file.

2.2 (from updated report) Test assertion 801 is inconsistent with the specification

RESPONSE: We agree and have removed this test assertion.

best

Michael Johnston
on behalf of the EMMA subgroup

Conversational Technologies strongly supports the Extensible MultiModal Annotation 1.0 (EMMA) standard. By providing a standardized yet extensible and flexible basis for representing user input, EMMA has tremendous potential to enable a wide variety of innovative multimodal applications. In particular, EMMA provides strong support for applications based on user input in human language across many modalities, including speech, text, and handwriting, as well as visual modalities such as sign languages. EMMA also supports composite multimodal interactions in which several user inputs in two or more modalities are integrated to represent a single user intent.

The Conversational Technologies EMMA implementations are used in tutorials on commercial applications of natural language processing and spoken dialog systems. We report on two implementations. The first is an EMMA producer (NLWorkbench), which is used to illustrate statistical and grammar-based semantic analysis of speech and text inputs. The second is an EMMA consumer, specifically a viewer for EMMA documents.
The viewer can be used in the classroom to simplify examination of EMMA results, as well as potentially in commercial applications for debugging spoken dialog systems. In addition, the viewer could become the basis of an editor supporting applications such as human annotation of EMMA documents for use as input to machine learning applications. For most of the EMMA structural elements, the viewer simply provides a tree structure mirroring the XML markup. Its most useful aspects are probably the graphical representation of EMMA lattices, the display of timestamps as standard dates, and the durations computed from EMMA timestamps. The two implementations will be made available in the near future as open source software.

Deborah Dahl, Conversational Technologies
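[A minimal sketch of the date and duration conversion a viewer like the one described above might perform; this is not the viewer's actual code. The EMMA fragment and its values are invented for illustration, and the sketch assumes the emma:start and emma:end absolute timestamp attributes, which EMMA 1.0 defines as milliseconds since 1 January 1970.]

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"

# A hypothetical EMMA document. emma:start/emma:end are absolute
# timestamps in milliseconds since the epoch (here, an instant on
# 2 October 2008, with a 1.5-second input duration).
DOC = """\
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation id="int1"
      emma:start="1222977392000" emma:end="1222977393500"
      emma:medium="acoustic" emma:mode="voice">
    <answer>yes</answer>
  </emma:interpretation>
</emma:emma>
"""

def describe_timing(doc):
    """Return (start, end, duration_seconds) for the first
    emma:interpretation, with start/end as UTC datetimes."""
    root = ET.fromstring(doc)
    interp = root.find(f"{{{EMMA_NS}}}interpretation")
    start_ms = int(interp.get(f"{{{EMMA_NS}}}start"))
    end_ms = int(interp.get(f"{{{EMMA_NS}}}end"))
    start = datetime.fromtimestamp(start_ms / 1000, tz=timezone.utc)
    end = datetime.fromtimestamp(end_ms / 1000, tz=timezone.utc)
    return start, end, (end_ms - start_ms) / 1000

start, end, duration = describe_timing(DOC)
print(start.isoformat(), end.isoformat(), duration)
```

The millisecond arithmetic is the whole trick: dividing by 1000 yields standard epoch seconds, so the usual datetime machinery gives a human-readable date, and the difference of the raw values gives the duration directly.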
Received on Thursday, 2 October 2008 19:57:11 UTC