
Re: emma:node anchoring on signal time axis

From: Paolo Martini <paolo.martini.relex@chello.be>
Date: Mon, 13 Aug 2007 23:10:23 +0200
Message-Id: <6fd5fca419fb9513f4ae4042c98bd5f0@chello.be>
To: <www-multimodal@w3.org>, "JOHNSTON, MICHAEL J (MICHAEL J)" <johnston@research.att.com>
Dear Michael and all,
as I do not represent the interests of any particular vendor or
organisation, I am satisfied with your acknowledgment of the different
points and am pleased if my feedback has been of any use. I will follow
the evolution of the specification with interest.
Best regards,

   Paolo Martini

On 13 Aug 2007, at 22:33, JOHNSTON, MICHAEL J (MICHAEL J) wrote:

> Dear Paolo Martini,
> Thanks again for your detailed comments on the EMMA specification.
> Your suggestions are particularly thoughtful and valuable and so to the
> extent possible we will incorporate these suggestions into the current
> draft of the specification, even though they fall outside the official 
> last call period.
> Regarding your comments on the language of the scope statement in
> Section 1:
> >>  1. Is "composed inputs" the same as "composite input" defined in the
> >>  Terminology?
> >>  The language is focussed on annotating single inputs from users, which
> >>  may be either from a single mode or a composite input combining
> >>  information from multiple modes, as opposed to information that might
> >>  have been collected over multiple turns of a dialog. The language
> >>  provides a set of elements and attributes that are focused on enabling
> >>  annotations on user inputs and interpretations of those inputs.
> >>
> >>  2. Following the Terminology at 1.2, "interpretation" appears to be the
> >>  signified and "user input" the signifier. "annotation" seems to be used
> >>  a bit ambiguously: it is my understanding that the common meaning of
> >>  "annotation" is the association - act or product of the act - of
> >>  content, at any 'meta-' level, to a region of a signal. "accurately
> >>  representing annotations on the input interpretations" seems therefore
> >>  to refer to content associated with the signified (as indeed do
> >>  attributes like emma:cost, emma:process, etc.). Nevertheless, EMMA also
> >>  seems to provide a way to represent the interpretation itself - mainly
> >>  with literals - besides being open to include any application specific
> >>  representation, but it is not clear if it is an accessory function or
> >>  a foundational one. Together with "Interpretations of user input are
> >>  said to be derived from that input, and higher levels interpretations
> >>  may be derived from lower level ones. EMMA allows you to reference the
> >>  user input or interpretation a given interpretation was derived from",
> >>  the issue becomes even more evident when comparing the expressive
> >>  power of emma:interpretation and emma:arc to represent alternative
> >>  interpretations (as advertised in "Lattices provide a compact
> >>  representation of large lists of possible recognition results or
> >>  interpretations"):
> >>  - an emma:interpretation has an "id" while an emma:arc needs "from",
> >>  "to" and the signified to be identified (i.e. difficult reification of
> >>  emma:arcs)
> >>  - an emma:interpretation has emma:tokens to refer to user input, while
> >>  an emma:arc can only refer to a time region of the input signal
> Thank you for your feedback. We agree that the use of EMMA to annotate
> aspects of both the original signal and the interpretation can be made
> clearer, as can what we mean by 'composed inputs'. We have incorporated
> your suggestions and revised the introductory informative paragraph in
> Section 1 as follows:
> "The language is focused on annotating single inputs from users, which
> may be either from a single mode or a composite input combining
> information from multiple modes, as opposed to information that might
> have been collected over multiple turns of a dialog. The language
> provides a set of elements and attributes that are focused on enabling
> annotations on both user inputs and interpretations of those inputs."
> Regarding the addition of further lattice examples:
> >>  I would suggest adding examples of lattice representation of two-level
> >>  analysis, like a vocal /bo/+?+/ton/ to orthographic "Boston" and
> >>  alternative "Bolton", and then a "Boston" and "Bolton" to the semantic
> >>  interpretations "BOS" and "TZR".
> >>  How does the second lattice refer to "Boston" and "Bolton" of the
> >>  first lattice?
> >>  Maybe the situation could be addressed with emma:interpretation, but
> >>  then the whole idea of the lattice as a compact representation fails.
> >>  And anyway, while emma:tokens could refer to the orthographic "Boston"
> >>  and "Bolton", I cannot find an IDREF attribute to refer to the "id" of
> >>  an "emma:interpretation".
> Over the years that the specification has been developed we have not
> received any requests or use cases from vendors or developers for the
> ability to cross-reference from portions of one lattice to another. We
> agree that this capability could potentially be useful, especially for
> more complex data annotation use cases; however, for the following three
> reasons we choose not to incorporate this suggestion into the current
> draft:
> 1. More detailed use cases from multiple vendors are required in order
> to motivate the addition of this functionality.
> 2. A longer, more detailed and comprehensive study of how the EMMA
> lattice representation can be extended to enable more complex kinds of
> annotation use cases is required before incorporation into the standard.
> This work goes beyond the scope of the current EMMA effort and should be
> postponed for consideration for future versions of EMMA.
> 3. Since emma:arc can contain application namespace data and emma:info
> is available for application and vendor specific annotations, nothing
> in the current specification actively prevents this kind of annotation.
> Regarding the specification of the emma: namespace prefix on attributes
> which are specific to EMMA elements, such as emma:arc, emma:derived-from:
> >>  3. About the attributes: "id", "from", "to", etc. are indicated
> >>  without "emma:", but, lacking other namespace indication, will end up
> >>  inheriting the NS of the element to which they belong, that is "emma:".
> >>  A more consistent indication could help reading the specification.
> It is well-supported practice within XML and W3C specifications not to
> repeat the namespace over and over for attributes specific to an element
> within a namespace. In the interests of reducing the verbosity and
> increasing the readability of the examples in the EMMA specification,
> we are consistently not using the EMMA namespace for attributes.
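
As a small illustration of how a namespace-aware parser reports the attributes discussed above, the following Python sketch parses a made-up lattice fragment with the standard library's xml.etree.ElementTree. Under the Namespaces in XML recommendation, an unprefixed attribute has no namespace name of its own, so "from" and "to" come back as plain names while a prefixed attribute is namespace-qualified. The arc label, node numbers and confidence value are assumptions for illustration, and the fragment has not been validated against the EMMA schema.

```python
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"

# Illustrative fragment only: label, node numbers and confidence are made up.
doc = f"""
<emma:lattice xmlns:emma="{EMMA_NS}" initial="1" final="2">
  <emma:arc from="1" to="2" emma:confidence="0.8">Boston</emma:arc>
</emma:lattice>
"""

root = ET.fromstring(doc)
arc = root.find(f"{{{EMMA_NS}}}arc")

# The element name is qualified with the EMMA namespace.
print(arc.tag)

# Unprefixed attributes ("from", "to") are reported without a namespace;
# only the explicitly prefixed attribute is qualified.
print(sorted(arc.attrib))
```

A consumer that looks up `arc.get("from")` therefore finds the attribute without needing the EMMA namespace URI.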
> Regarding xlink:href:
> > 4. By the way, could "ref" be replaced by a common xlink:href ?
> A number of different EMMA elements use a "ref" attribute to reference a
> URI (emma:grammar, emma:model). In order to capture the specific
> semantics and constraints on EMMA markup we use an element-specific
> "ref" attribute rather than xlink:href. For example, emma:model requires
> either "ref" or an inline specification of the model.
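
The constraint on emma:model mentioned above could be checked along the following lines. This is only a sketch: the helper name `has_ref_or_inline`, the example.com URLs and the inline `city-model` element are hypothetical, the placement of emma:model is not validated against the EMMA schema, and "either ... or" is read here as "at least one of the two" (whether both may appear together is not settled by the text above).

```python
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"


def has_ref_or_inline(model: ET.Element) -> bool:
    """Return True if the element has a non-empty "ref" or any inline content."""
    has_ref = bool(model.get("ref"))
    has_inline = len(model) > 0 or bool((model.text or "").strip())
    return has_ref or has_inline


# Hypothetical document: m1 uses "ref", m2 is inline, m3 has neither.
doc = f"""
<emma:emma xmlns:emma="{EMMA_NS}" version="1.0">
  <emma:model id="m1" ref="http://example.com/models/city.xml"/>
  <emma:model id="m2"><city-model xmlns="http://example.com/ns"/></emma:model>
  <emma:model id="m3"/>
</emma:emma>
"""

root = ET.fromstring(doc)
results = {
    m.get("id"): has_ref_or_inline(m)
    for m in root.findall(f"{{{EMMA_NS}}}model")
}
print(results)
```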
> Regarding the suggested extensions to the emma:arc element:
> >>  I am sorry to be so picky - though I hope at least relevant - on what
> >>  could be minor details of this version, but I am worried that, at the
> >>  moment of future extensions of the scope, they could force major
> >>  changes or awkward solutions to maintain backward compatibility.
> >>
> >>  I would therefore suggest the following additions:
> >>
> >>  - Allow optional "id" attribute to emma:arc (and evaluate using it
> >>  also to replace "node-number").
> >>
> >>  - Allow optional emma:tokens to emma:arc.
> >>
> >>  - Allow optional emma:process to emma:arc.
> >>
> >>  - Add an attribute of type IDREF to reference emma:id and allow it
> >>  wherever useful.
> >>
> >>  - Add to emma:arc a type- or class-like attribute, defaulting to an
> >>  empty string or null value, so that all arcs without the attribute are
> >>  considered to be of the same "type". This could be a solution to a
> >>  compact representation of the two-level analysis previously described
> >>  (ex. type="orthographic" and type="semantic") but it would also allow
> >>  experimenting on the already discussed integration with other
> >>  annotation types without requiring modifications to consumers
> >>  following the current draft (assuming they ignore, as they should,
> >>  attributes they don't understand).
> Thanks for your detailed feedback. These are excellent suggestions for a
> richer and more powerful lattice annotation mechanism. However, given the
> lack of detailed use cases and of requests from vendors for this
> capability, and the need for a far longer and more detailed investigation
> of how to use lattice representations for more complex annotation tasks,
> we defer consideration of these additions to a future version of the
> specification. It is important to note that much of what is described
> here can currently be captured using annotations within emma:info (which
> is permitted within emma:arc) and the ability to place annotations within
> the application namespace data. For example, in the case of a word
> lattice from speech recognition there is no need for emma:tokens since
> the label on the arc will be the token. In the case of a meaning lattice,
> EMMA attributes and annotations in other namespaces can appear directly
> on the application namespace data. If specific vendors wish to use a type
> classification to identify particular sets of arcs, this can be achieved
> using annotations in emma:info, on application namespace markup within
> emma:arc, or using an attribute in another namespace on emma:arc itself.
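
The last option mentioned above, an attribute in another namespace on emma:arc itself, could look roughly like the following sketch. Everything outside the EMMA namespace here is hypothetical: the `ex` prefix and its URI, the `type` attribute name, the `arcs_by_type` helper and the untyped "airport" arc. The "Boston"/"Bolton" and "BOS"/"TZR" labels are taken from the example earlier in this thread, the node numbering is made up, and the fragment has not been validated against the EMMA schema. Arcs without the attribute fall into the empty-string group, mirroring the default proposed in the suggestion.

```python
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"
EX_NS = "http://example.com/lattice-ext"  # hypothetical vendor namespace

# Illustrative lattice: arcs are tagged with a vendor-specific ex:type.
doc = f"""
<emma:lattice xmlns:emma="{EMMA_NS}" xmlns:ex="{EX_NS}" initial="1" final="3">
  <emma:arc from="1" to="2" ex:type="orthographic">Boston</emma:arc>
  <emma:arc from="1" to="2" ex:type="orthographic">Bolton</emma:arc>
  <emma:arc from="1" to="2" ex:type="semantic">BOS</emma:arc>
  <emma:arc from="1" to="2" ex:type="semantic">TZR</emma:arc>
  <emma:arc from="2" to="3">airport</emma:arc>
</emma:lattice>
"""


def arcs_by_type(lattice: ET.Element) -> dict:
    """Group arc labels by the hypothetical ex:type attribute ("" if absent)."""
    groups = {}
    for arc in lattice.findall(f"{{{EMMA_NS}}}arc"):
        arc_type = arc.get(f"{{{EX_NS}}}type", "")
        groups.setdefault(arc_type, []).append((arc.text or "").strip())
    return groups


groups = arcs_by_type(ET.fromstring(doc))
print(groups)
```

A consumer written against the current draft that ignores attributes it does not understand would still see five ordinary arcs.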
> Thanks again for all of your comments. We think that a number of aspects
> of the specification have been clarified and improved as a result of
> your feedback.
> We would greatly appreciate it if you could respond within the next
> week indicating if this response addresses your concerns so that we
> can move forward with the specification.
> best
> Michael Johnston
> On behalf of W3C Multimodal working group
Received on Monday, 13 August 2007 21:11:01 UTC
