- From: JOHNSTON, MICHAEL J (MICHAEL J) <johnston@research.att.com>
- Date: Thu, 2 Oct 2008 15:56:32 -0400
- To: <www-multimodal@w3.org>
- Message-ID: <0C50B346CAD5214EA8B5A7C0914CF2A401365B67@njfpsrvexg3.research.att.com>
Many thanks for your support of EMMA. The specific comments you bring up have been discussed in detail by the EMMA subgroup, and it has formulated the following responses. Could you please confirm on the public list, www-multimodal@w3.org, whether this resolution of the issues is acceptable.

2.1 Recommend clarifying the spec on the semantics of start and end times for text input

RESPONSE: We agree that this should be clarified, but would like to defer it to a later version of EMMA. A number of issues need to be considered, for example whether the semantics of timing differ for typed text, cut-and-paste text, and text input from a file.

2.2 (from updated report) Test assertion 801 is inconsistent with the specification

RESPONSE: We agree and have removed this test assertion.

best

Michael Johnston
on behalf of the EMMA subgroup

Conversational Technologies strongly supports the Extensible MultiModal Annotation 1.0 (EMMA) standard. By providing a standardized yet extensible and flexible basis for representing user input, EMMA has tremendous potential to enable a wide variety of innovative multimodal applications. In particular, EMMA provides strong support for applications based on user input in human language across many modalities, including speech, text, and handwriting, as well as visual modalities such as sign languages. EMMA also supports composite multimodal interactions in which several user inputs in two or more modalities are integrated to represent a single user intent.

The Conversational Technologies EMMA implementations are used in tutorials on commercial applications of natural language processing and spoken dialog systems. We report on two implementations. The first is an EMMA producer (NLWorkbench), which is used to illustrate statistical and grammar-based semantic analysis of speech and text inputs. The second is an EMMA consumer, specifically a viewer for EMMA documents.
The viewer can be used in the classroom to simplify examination of EMMA results, as well as potentially in commercial applications for debugging spoken dialog systems. In addition, the viewer could become the basis of an editor supporting applications such as human annotation of EMMA documents for use as input to machine learning applications. For most of the EMMA structural elements, the viewer simply provides a tree structure mirroring the XML markup. Its most useful aspects are probably the graphical representation of EMMA lattices, the display of timestamps as standard dates, and the durations computed from EMMA timestamps. The two implementations will be made available in the near future as open source software.

Deborah Dahl, Conversational Technologies
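[A minimal sketch of the date and duration conversion a viewer like the one described above might perform; this is not the viewer's actual code. The EMMA fragment and its values are invented for illustration, and the sketch assumes the emma:start and emma:end absolute timestamp attributes, which EMMA 1.0 defines as milliseconds since 1 January 1970.]

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"

# A hypothetical EMMA document. emma:start/emma:end are absolute
# timestamps in milliseconds since the epoch (here, an instant on
# 2 October 2008, with a 1.5-second input duration).
DOC = """\
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation id="int1"
      emma:start="1222977392000" emma:end="1222977393500"
      emma:medium="acoustic" emma:mode="voice">
    <answer>yes</answer>
  </emma:interpretation>
</emma:emma>
"""

def describe_timing(doc):
    """Return (start, end, duration_seconds) for the first
    emma:interpretation, with start/end as UTC datetimes."""
    root = ET.fromstring(doc)
    interp = root.find(f"{{{EMMA_NS}}}interpretation")
    start_ms = int(interp.get(f"{{{EMMA_NS}}}start"))
    end_ms = int(interp.get(f"{{{EMMA_NS}}}end"))
    start = datetime.fromtimestamp(start_ms / 1000, tz=timezone.utc)
    end = datetime.fromtimestamp(end_ms / 1000, tz=timezone.utc)
    return start, end, (end_ms - start_ms) / 1000

start, end, duration = describe_timing(DOC)
print(start.isoformat(), end.isoformat(), duration)
```

The millisecond arithmetic is the whole trick: dividing by 1000 yields standard epoch seconds, so the usual datetime machinery gives a human-readable date, and the difference of the raw values gives the duration directly.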
Received on Thursday, 2 October 2008 19:57:11 UTC