- From: Paolo Martini <paolo.martini.relex@chello.be>
- Date: Mon, 6 Aug 2007 21:50:50 +0200
- To: <www-multimodal@w3.org>, "JOHNSTON, MICHAEL J (MICHAEL J)" <johnston@research.att.com>
- Message-Id: <79470eb885659e78d4b2bbe521ed5c44@chello.be>
Dear Michael and all,
I appreciate the intentionally well delimited scope of EMMA 1.0 and the
adequacy of the current draft to it.
In order to better clarify the description of that scope, I would
suggest rewording and elaborating the Introduction statement:
"The language is focused on annotating the interpretation information
of single and composed inputs, as opposed to (possibly identical)
information that might have been collected over the course of a dialog.
The language provides a set of elements and attributes that are focused
on accurately representing annotations on the input interpretations."
1. Is "composed inputs" the same as "composite input" defined in the
Terminology?
2. Following the Terminology at 1.2, "interpretation" appears to be
the signified and "user input" the signifier. "Annotation" seems to be
used a bit ambiguously: it is my understanding that the common meaning
of "annotation" is the association - act or product of the act - of
content, at any 'meta-' level, with a region of a signal.
"accurately representing annotations on the input interpretations"
seems therefore to refer to content associated with the signified (as
indeed do attributes like emma:cost, emma:process, etc.). Nevertheless,
EMMA also seems to provide a way to represent the interpretation itself
- mainly with literals - besides being open to including any
application-specific representation, but it is not clear whether this
is an accessory function or a foundational one.
Together with "Interpretations of user input are said to be derived
from that input, and higher levels interpretations may be derived from
lower level ones. EMMA allows you to reference the user input or
interpretation a given interpretation was derived from", the issue
becomes even more evident when comparing the expressive power of
emma:interpretation and emma:arc to represent alternative
interpretations (as advertised in "Lattices provide a compact
representation of large lists of possible recognition results or
interpretations"):
- an emma:interpretation has an "id" while an emma:arc needs "from",
"to" and the signified to be identified (i.e. difficult reification of
emma:arcs)
- an emma:interpretation has emma:tokens to refer to user input, while
an emma:arc can only refer to a time region of the input signal
I would suggest adding example lattice representations of a two-level
analysis, such as
a spoken /bo/+?+/ton/ mapped to the orthographic "Boston" and the
alternative "Bolton", and then "Boston" and "Bolton" mapped to the
semantic interpretations "BOS" and "TZR".
How does the second lattice refer to "Boston" and "Bolton" of the first
lattice? Perhaps the situation could be addressed with
emma:interpretation, but then the whole idea of the lattice as a
compact representation fails. And anyway, while emma:tokens could refer
to the orthographic "Boston" and "Bolton", I cannot find an IDREF
attribute to refer to the "id" of an "emma:interpretation".
Furthermore, an emma:interpretation can specify emma:process while
emma:arc cannot.
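The two-level case can be sketched roughly as follows. This is an
illustration only, not an example taken from the draft: the ids "orth"
and "sem" are made up, required annotations such as medium and mode are
left out for brevity, and the nesting reflects one reading of how
emma:derivation and emma:derived-from are meant to be used.

```xml
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <!-- Second level: semantic alternatives (ids are illustrative) -->
  <emma:interpretation id="sem">
    <!-- Points at the whole lower-level interpretation, not at one arc -->
    <emma:derived-from resource="#orth" composite="false"/>
    <emma:lattice initial="1" final="2">
      <emma:arc from="1" to="2">BOS</emma:arc>
      <emma:arc from="1" to="2">TZR</emma:arc>
    </emma:lattice>
  </emma:interpretation>
  <emma:derivation>
    <!-- First level: orthographic alternatives for the spoken input -->
    <emma:interpretation id="orth">
      <emma:lattice initial="1" final="2">
        <emma:arc from="1" to="2">Boston</emma:arc>
        <emma:arc from="1" to="2">Bolton</emma:arc>
      </emma:lattice>
    </emma:interpretation>
  </emma:derivation>
</emma:emma>
```

Nothing in this sketch says that "BOS" comes from "Boston" rather than
from "Bolton": the arcs carry only "from" and "to", so the link between
the two levels exists only at the level of the whole interpretation.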
3. About the attributes: "id","from","to", etc. are indicated without
"emma:", but, lacking other namespace indication, will end up
inheriting the NS of the element to which they belong, that is "emma:".
A more consistent indication could help reading the specification.
4. By the way, could "ref" be replaced by a common xlink:href?
I am sorry to be so picky - though I hope at least relevant - about what
could be minor details of this version, but I am worried that, when the
scope is extended in the future, they could force major changes or
awkward solutions to maintain backward compatibility.
I would therefore suggest the following additions:
- Allow an optional "id" attribute on emma:arc (and evaluate using it
also to replace "node-number").
- Allow an optional emma:tokens on emma:arc.
- Allow an optional emma:process on emma:arc.
- Add an attribute of type IDREF to reference emma:id and allow it
wherever useful.
- Add to emma:arc a type- or class-like attribute, defaulting to an
empty string or null value, so that all arcs without the attribute are
considered to be of the same "type". This could offer a compact
representation of the two-level analysis described above (e.g.
type="orthographic" and type="semantic"), and it would also allow
experimenting with the already discussed integration with other
annotation types without requiring modifications to consumers that
follow the current draft (assuming they ignore, as they should,
attributes they don't understand).
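A rough sketch of how these additions might look together, purely as an
illustration: "id", "type" and "derived-ref" on emma:arc are hypothetical
and not part of the current draft, and "derived-ref" is only a
placeholder name for the proposed IDREF attribute.

```xml
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation id="int1">
    <emma:lattice initial="1" final="2">
      <!-- Hypothetical "id" and "type" attributes on each arc -->
      <emma:arc id="a1" type="orthographic" from="1" to="2">Boston</emma:arc>
      <emma:arc id="a2" type="orthographic" from="1" to="2">Bolton</emma:arc>
      <!-- Hypothetical IDREF attribute linking each semantic arc to its source arc -->
      <emma:arc id="a3" type="semantic" from="1" to="2" derived-ref="a1">BOS</emma:arc>
      <emma:arc id="a4" type="semantic" from="1" to="2" derived-ref="a2">TZR</emma:arc>
    </emma:lattice>
  </emma:interpretation>
</emma:emma>
```

A consumer written against the current draft that ignores unknown
attributes would still read this as an ordinary lattice with four arcs
between nodes 1 and 2.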
Best regards,
Paolo Martini
On 3 August 2007, at 19:29, JOHNSTON, MICHAEL J (MICHAEL J) wrote:
>
> Dear Paolo Martini,
>
> Thank you for your detailed and thoughtful contributions on
> emma:lattice. The EMMA subgroup have discussed your comments in detail
> and formulated the following responses.
>
> Regarding your first point about the relative time mechanism on
> emma:node: we agree that, like the absolute timestamps, the relative
> timestamp mechanism was also not intended to apply to emma:node, and
> we will remove relative timestamps from emma:node in the specification.
>
> Regarding the three different time axes you describe (input, model,
> output), the scope of the EMMA specification addresses only the input
> axis at this point in its development. In the longer term we hope to
> extend EMMA for representation of system output as well as user inputs
> but for EMMA 1.0 we address only input. Your comments regarding the
> output time axis are particularly relevant for output representation in
> EMMA and will provide valuable input for future versions of EMMA.
>
> Regarding the connections between emma:lattice representations and
> annotation graphs such as ATLAS, again this is very good feedback. The
> initial intention behind emma:lattice is to capture and provide a
> standard representation for the graph outputs that vendors of speech
> recognition and other modality processing components currently provide
> in proprietary representations. Over the course of this work more use
> cases have come up and there is growing interest in the potential use
> of EMMA more broadly for annotation of speech corpora and other
> resources. The initial scope of EMMA is to provide a mechanism for
> communication among the components of interactive systems, such as
> spoken and multimodal dialog systems. In future versions of EMMA beyond
> 1.0 we hope to provide more support for annotation and corpus use
> cases, and your input on relations with annotation schemes such as
> ATLAS will be extremely valuable for that work.
>
> We would greatly appreciate it if you could respond within the next two
> weeks indicating whether this response addresses your concerns. Thanks
> again for such detailed feedback.
>
> best
> Michael Johnston
>
>
> On behalf of W3C Multimodal working group
>
>
Received on Monday, 6 August 2007 19:50:55 UTC