- From: JOHNSTON, MICHAEL J (MICHAEL J) <johnston@research.att.com>
- Date: Mon, 13 Aug 2007 16:33:38 -0400
- To: <www-multimodal@w3.org>, "Paolo Martini" <paolo.martini.relex@chello.be>
- Message-ID: <0C50B346CAD5214EA8B5A7C0914CF2A45CFCAA@njfpsrvexg3.research.att.com>
Dear Paolo Martini,

Thanks again for your detailed comments on the EMMA specification. Your suggestions are particularly thoughtful and valuable and so to the extent possible we will incorporate these suggestions into the current draft of the specification, even though they fall outside the official last call period.

Regarding your comments on the language of the scope statement in Section 1:

>> 1. Is "composed inputs" the same of "composite input" defined in the Terminology?
>> The language is focussed on annotating single inputs from users, which may be
>> either from a single mode or a composite input combining
>> information from multiple modes, as opposed to information that might have been
>> collected over multiple turns of a dialog. The language provides a set of elements
>> and attributes that are focused on enabling annotations
>> on user inputs and interpretations of those inputs.
>>
>> 2. Following the Terminology at 1.2, "interpretation" appears to be signified
>> and "user input" the signifier. "annotation" seems to be used a bit ambiguously:
>> it is my understanding that the common significance of "annotation" is the association
>> - act or product of the act - of content, at any 'meta-' level, to a region of a signal.
>> "accurately representing annotations on the input interpretations" seems therefore to
>> refer to content associated to the signified (as indeed do attributes like emma:cost,
>> emma:process, etc.). Nevertheless, EMMA seems to provide also a way to represent the
>> interpretation itself - mainly with literals - besides being open to include any
>> application specific representation, but it is not clear if it is an accessory
>> function or a foundation one. Together with "Interpretations of user input are said to
>> be derived from that input, and higher levels interpretations may be derived from lower
>> level ones.
>> EMMA allows you to reference the user input or interpretation a given
>> interpretation was derived from", the issue becomes even more evident when comparing
>> the expressive power of emma:interpretation and emma:arc to represent alternative
>> interpretations (as advertised in "Lattices provide a compact representation of large
>> lists of possible recognition results or interpretations"):
>> - an emma:interpretation has an "id" while an emma:arc needs "from", "to" and the
>> signified to be identified (i.e. difficult reification of emma:arcs)
>> - an emma:interpretation has emma:tokens to refer to user input, while an emma:arc
>> can only refer to a time region of the input signal

Thank you for your feedback. We agree that the use of EMMA to annotate aspects of both the original signal and the interpretation can be made clearer, as can what we mean by 'composed inputs'. We have incorporated your suggestions and revised the introductory informative paragraph in Section 1 as follows:

"The language is focused on annotating single inputs from users, which may be either from a single mode or a composite input combining information from multiple modes, as opposed to information that might have been collected over multiple turns of a dialog. The language provides a set of elements and attributes that are focused on enabling annotations on both user inputs and interpretations of those inputs."

Regarding the addition of further lattice examples:

>> I would suggest adding examples lattice representation of two-level analysis, like
>> a vocal /bo/+?+/ton/ to orthographic "Boston" and alternative "Bolton"
>> and then a "Boston" and "Bolton" to the semantic interpretations "BOS" and "TZR"
>> How does the second lattice refer to "Boston" and "Bolton" of the first lattice?
>> Maybe, the situation could be addressed with emma:interpretation, but then the
>> whole idea of the lattice as a compact representation fails.
>> And anyway, while
>> emma:tokens could refer to the orthographic "Boston" and "Bolton", I cannot
>> find an IDREF attribute to refer to the "id" of an "emma:interpretation".

Over the years that the specification has been developed we have not received any requests or use cases from vendors or developers for the ability to cross reference from portions of one lattice to another. We agree that this capability could potentially be useful, especially for more complex data annotation use cases. However, for the following three reasons we have chosen not to incorporate this suggestion into the current draft:

1. More detailed use cases from multiple vendors are required in order to motivate the addition of this functionality.

2. A longer, more detailed, and more comprehensive study of how the EMMA lattice representation can be extended to enable more complex kinds of annotation use cases is required before incorporation into the standard. This work goes beyond the scope of the current EMMA effort and should be postponed for consideration in future versions of EMMA.

3. Since emma:arc can contain application namespace data and emma:info is available for application and vendor specific annotations, nothing in the current specification actively prevents this kind of annotation.

Regarding the specification of the emma: namespace prefix on attributes which are specific to EMMA elements, such as emma:arc, emma:derived-from:

>> 3. About the attributes: "id","from","to", etc. are indicated without "emma:", but,
>> lacking other namespace indication, will end up inheriting the NS of the element to
>> which they belong, that is "emma:". A more consistent indication could help reading the specification.

It is well-supported practice within XML and W3C specifications not to repeat the namespace prefix on attributes that are specific to an element within a namespace.
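
A small sketch, not taken from the EMMA specification, may help make the namespace point concrete. Under Namespaces in XML, an unprefixed attribute is in no namespace: it is interpreted relative to the element it appears on rather than taking on that element's namespace, whereas a prefixed attribute such as emma:confidence is namespace-qualified. The Python below parses a hypothetical lattice fragment with the standard-library xml.etree.ElementTree module and reports the namespace of each attribute. The fragment, its values, and the helper name attribute_namespaces are assumptions made for this illustration only.

```python
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"

# Hypothetical fragment written for this illustration, not copied from the spec.
DOC = f"""
<emma:emma version="1.0" xmlns:emma="{EMMA_NS}">
  <emma:interpretation id="int1" emma:confidence="0.8">
    <emma:lattice initial="1" final="2">
      <emma:arc from="1" to="2">Boston</emma:arc>
    </emma:lattice>
  </emma:interpretation>
</emma:emma>
"""


def attribute_namespaces(element):
    """Map each attribute's local name to its namespace URI (None if unqualified)."""
    result = {}
    for key in element.attrib:
        if key.startswith("{"):
            # ElementTree spells qualified names as "{uri}local".
            uri, local = key[1:].split("}", 1)
            result[local] = uri
        else:
            result[key] = None
    return result


root = ET.fromstring(DOC.strip())
interp = root.find(f"{{{EMMA_NS}}}interpretation")
arc = next(root.iter(f"{{{EMMA_NS}}}arc"))

# Unprefixed "id", "from" and "to" carry no namespace; emma:confidence does.
print(attribute_namespaces(interp))
print(attribute_namespaces(arc))
```

A consumer that looks up "from" and "to" on emma:arc would therefore use the bare attribute names, while annotation attributes written with the emma: prefix need the qualified name.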
In the interests of reducing the verbosity and increasing the readability of the examples in the EMMA specification, we consistently omit the EMMA namespace prefix on these attributes.

Regarding xlink:href:

> 4. By the way, could "ref" be replaced by a common xlink:href ?

A number of different EMMA elements use a "ref" attribute to reference a URI (emma:grammar, emma:model). In order to capture the specific semantics and constraints on EMMA markup we use an element specific "ref" attribute rather than xlink:href. For example, emma:model requires either "ref" or an inline specification of the model.

Regarding the suggested extensions to the emma:arc element:

>> I am sorry to be so picky - though I hope at least relevant - on what could be minor details of
>> this version, but I am worried that, at the moment of future extensions of the scope, they could
>> force major changes or awkward solutions to maintain backward compatibility.
>>
>> I would therefore suggest the following additions:
>>
>> - Allow optional "id" attribute to emma:arc (and evaluate using it also to replace "node-number").
>>
>> - Allow optional emma:tokens to emma:arc.
>>
>> - Allow optional emma:process to emma:arc.
>> - Add an attribute of type IDREF to reference emma:id and allow it wherever useful.
>>
>> Add to emma:arc a type- or class-like attribute, with default to an empty string
>> or null value allowing to consider of the same "type" all the arcs without
>> the attribute. This could be a solution to a compact representation of the two-level
>> analysis previously described (ex. type="orthographic" and type="semantic") but it would
>> allow also experimenting on the already discussed integration with other annotation types
>> without asking modifications to consumers following the current draft (assuming they ignore,
>> as they should, attributes they don't understand).

Thanks for your detailed feedback.
These are excellent suggestions for a richer and more powerful lattice annotation mechanism. However, given the lack of detailed use cases and of requests from vendors for this capability, and the need for a far longer and more detailed investigation of how to use lattice representations for more complex annotation tasks, we defer consideration of these additions to a future version of the specification.

It is important to note that much of what is described here can currently be captured using annotations within emma:info (which is permitted within emma:arc) and by placing annotations within the application namespace data. For example, in the case of a word lattice from speech recognition there is no need for emma:tokens, since the label on the arc will be the token. In the case of a meaning lattice, EMMA attributes and annotations in other namespaces can appear directly on the application namespace data. If specific vendors wish to use a type classification to identify particular sets of arcs, this can be achieved using annotations in emma:info, on application namespace markup within emma:arc, or using an attribute in another namespace on emma:arc itself.

Thanks again for all of your comments. We think that a number of aspects of the specification have been clarified and improved as a result of your feedback. We would greatly appreciate it if you could respond within the next week indicating whether this response addresses your concerns, so that we can move forward with the specification.

best
Michael Johnston
On behalf of W3C Multimodal working group
Received on Monday, 13 August 2007 20:34:52 UTC