- From: Paolo Martini <paolo.martini.relex@chello.be>
- Date: Mon, 6 Aug 2007 21:50:50 +0200
- To: <www-multimodal@w3.org>, "JOHNSTON, MICHAEL J (MICHAEL J)" <johnston@research.att.com>
- Message-Id: <79470eb885659e78d4b2bbe521ed5c44@chello.be>
Dear Michael and all,

I appreciate the intentionally well-delimited scope of EMMA 1.0 and how well the current draft fits it. To clarify the description of that scope, I would suggest rewording and elaborating this statement in the Introduction:

"The language is focused on annotating the interpretation information of single and composed inputs, as opposed to (possibly identical) information that might have been collected over the course of a dialog. The language provides a set of elements and attributes that are focused on accurately representing annotations on the input interpretations."

1. Is "composed inputs" the same as "composite input" defined in the Terminology?

2. Following the Terminology at 1.2, "interpretation" appears to be the signified and "user input" the signifier. "Annotation" seems to be used somewhat ambiguously: it is my understanding that the common meaning of "annotation" is the association (the act, or the product of the act) of content, at any 'meta-' level, with a region of a signal. "accurately representing annotations on the input interpretations" therefore seems to refer to content associated with the signified (as attributes like emma:cost, emma:process, etc. indeed do). Nevertheless, EMMA also seems to provide a way to represent the interpretation itself, mainly with literals, besides being open to including any application-specific representation, but it is not clear whether this is an accessory function or a foundational one.

Together with "Interpretations of user input are said to be derived from that input, and higher levels interpretations may be derived from lower level ones.
EMMA allows you to reference the user input or interpretation a given interpretation was derived from", the issue becomes even more evident when comparing the expressive power of emma:interpretation and emma:arc for representing alternative interpretations (as advertised in "Lattices provide a compact representation of large lists of possible recognition results or interpretations"):

- an emma:interpretation has an "id", while an emma:arc can only be identified by its "from", its "to" and the signified (i.e. emma:arcs are difficult to reify)
- an emma:interpretation has emma:tokens to refer to user input, while an emma:arc can only refer to a time region of the input signal

I would suggest adding examples of the lattice representation of a two-level analysis, such as a vocal /bo/+?+/ton/ mapped to the orthographic "Boston" and the alternative "Bolton", and then "Boston" and "Bolton" mapped to the semantic interpretations "BOS" and "TZR". How does the second lattice refer to the "Boston" and "Bolton" of the first lattice? Maybe the situation could be addressed with emma:interpretation, but then the whole idea of the lattice as a compact representation fails. And anyway, while emma:tokens could refer to the orthographic "Boston" and "Bolton", I cannot find an IDREF attribute to refer to the "id" of an emma:interpretation. Furthermore, an emma:interpretation can specify emma:process while an emma:arc cannot.

3. About the attributes: "id", "from", "to", etc. are written without "emma:" but, lacking any other namespace indication, will end up inheriting the namespace of the element to which they belong, that is "emma:". A more consistent notation would make the specification easier to read.

4. By the way, could "ref" be replaced by the common xlink:href?

I am sorry to be so picky (though, I hope, at least relevant) about what could be minor details of this version, but I am worried that, when the scope is extended in the future, they could force major changes or awkward solutions to maintain backward compatibility.
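To make the two-level example from point 2 concrete, here is a rough sketch of how it might look with the lattice markup of the current draft. The ids, node numbers and arc labels are invented for illustration, the phonetic level is left out for brevity, and the use of emma:derivation and emma:derived-from is an assumption about how the two levels would be linked, not something taken from an example in the specification.

```xml
<!-- Sketch only: ids, node numbers and arc labels are illustrative,
     and the markup follows one reading of the EMMA 1.0 draft. -->
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">

  <!-- Level 2: orthographic forms mapped to semantic codes -->
  <emma:interpretation id="sem1" emma:medium="acoustic" emma:mode="voice">
    <!-- Points at the level-1 interpretation as a whole, not at one of its arcs -->
    <emma:derived-from resource="#orth1" composite="false"/>
    <emma:lattice initial="1" final="2">
      <emma:arc from="1" to="2">BOS</emma:arc>
      <emma:arc from="1" to="2">TZR</emma:arc>
    </emma:lattice>
  </emma:interpretation>

  <emma:derivation>
    <!-- Level 1: speech signal mapped to orthographic alternatives -->
    <emma:interpretation id="orth1" emma:medium="acoustic" emma:mode="voice">
      <emma:lattice initial="1" final="2">
        <emma:arc from="1" to="2">Boston</emma:arc>
        <emma:arc from="1" to="2">Bolton</emma:arc>
      </emma:lattice>
    </emma:interpretation>
  </emma:derivation>

</emma:emma>
```

In this sketch nothing ties the "BOS" arc to the "Boston" arc, or "TZR" to "Bolton": the arcs have no identifier of their own, so the correspondence between the two levels is left implicit.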
I would therefore suggest the following additions:

- Allow an optional "id" attribute on emma:arc (and evaluate also using it to replace "node-number").
- Allow optional emma:tokens on emma:arc.
- Allow optional emma:process on emma:arc.
- Add an attribute of type IDREF to reference emma:id and allow it wherever useful.
- Add to emma:arc a type- or class-like attribute, defaulting to an empty string or null value, so that all the arcs without the attribute are considered to be of the same "type".

The last addition could be a solution for a compact representation of the two-level analysis described above (e.g. type="orthographic" and type="semantic"), but it would also allow experimenting with the already discussed integration with other annotation types without requiring modifications to consumers that follow the current draft (assuming they ignore, as they should, attributes they don't understand).

Best regards,
Paolo Martini

On 3 August 2007, at 19:29, JOHNSTON, MICHAEL J (MICHAEL J) wrote:

> Dear Paolo Martini,
>
> Thank you for your detailed and thoughtful contributions on
> emma:lattice. The EMMA subgroup have discussed your comments in
> detail and formulated the following responses.
>
> Regarding your first point about the relative time mechanism on
> emma:node. We agree that like the absolute timestamps, the relative
> time stamp mechanism was also not intended to apply to emma:node and
> will remove relative timestamps from emma:node in the specification.
>
> Regarding the three different time axes you describe (input, model,
> output), the scope of the EMMA specification addresses only the input
> axis at this point in its development. In the longer term we hope to
> extend EMMA for representation of system output as well as user
> inputs but for EMMA 1.0 we address only input. Your comments
> regarding the output time axis are particularly relevant for output
> representation in EMMA and will provide valuable input for future
> versions of EMMA.
>
> Regarding the connections between emma:lattice representations and
> annotation graphs such as ATLAS, again this is a very good feedback.
> The initial intention behind emma:lattice is to capture and provide a
> standard representation for the graph outputs that vendors of speech
> recognition and other modality processing components currently
> provide in proprietary representations. Over the course of this work
> more use cases have come up and there is growing interest in the
> potential use of EMMA more broadly for annotation of speech corpora
> and other resources. The initial scope of EMMA is to provide a
> mechanism for communication among the components of interactive
> systems, such as spoken and multimodal dialog systems. In future
> versions of EMMA beyond 1.0 we hope to provide more support for
> annotation and corpus use cases, and your input on relations with
> annotation schemes such as ATLAS will be extremely valuable for that
> work.
>
> We would greatly appreciate it if you could respond within the next
> two weeks indicating whether this response addresses your concerns.
> Thanks again for such detailed feedback.
>
> best
> Michael Johnston
>
> On behalf of W3C Multimodal working group
Received on Monday, 6 August 2007 19:50:55 UTC