- From: <johnston@research.att.com>
- Date: Tue, 26 Jun 2007 08:31:10 -0400
- To: <public-i18n-core@w3.org>, <www-multimodal@w3.org>
- Message-ID: <0C50B346CAD5214EA8B5A7C0914CF2A448575B@njfpsrvexg3.research.att.com>
I18N-2 ACCEPT
====================================================================
http://lists.w3.org/Archives/Public/www-multimodal/2007May/0005.html
SUBSTANTIVE (xml:lang vs emma:lang)
Comment from the i18n review of:
http://www.w3.org/TR/2007/WD-emma-20070409/
Comment 2
At http://www.w3.org/International/reviews/0704-emma/
Editorial/substantive: S
Owner: RI
Location in reviewed document:
4.2.5 [http://www.w3.org/TR/2007/WD-emma-20070409/#s4.2.5]
Comment:
It's not at all clear to us what the difference is between
emma:lang and xml:lang, the relationship between them, or
when we should use which. (It might help to create examples
that show the use of xml:lang as well as emma:lang.)
[[In order handle inputs involving multiple languages, such as through
code switching, the emma:lang tag MAY contain several language
identifiers separated by spaces.]]
This is definitely something you cannot do with xml:lang, but we are
wondering what is the value of doing it anyway. We are not sure what
benefit it would provide.
RESPONSE:
We address each of these two points in turn:
Point 1: ACCEPT Clarification of emma:lang vs xml:lang function
The W3C multimodal working group accepts that it is important to
make clear the differences between the xml:lang and emma:lang
attributes and plans to add clarifying text to the emma:lang
section in the next draft of the EMMA specification. The
xml:lang and emma:lang attributes serve distinct and
equally important purposes. The role of xml:lang is to
indicate the language used for content in an XML element or
document. In contrast, the emma:lang attribute is used to
indicate the language employed by a user when entering an
input into a spoken or multimodal dialog system. Critically,
emma:lang annotates the language of the signal originating
from the user rather than the specific tokens used at a
particular stage of processing. This is most clearly
illustrated by an example involving multiple stages of
processing of a user input, which is the primary use of
EMMA markup. Consider the following scenario:
EMMA is being used to represent three stages in the
processing of a spoken input to a system for ordering
products. The user input is in Italian. After speech
recognition, the input is first translated into English;
a natural language understanding system then converts
the English translation into a product ID (which is not in any
particular language). Since the input signal is a user
speaking Italian, emma:lang will be "it" at all of
these stages of processing. The xml:lang attribute, in contrast,
will initially be "it"; after translation it will be "en",
and after language understanding "zxx", assuming the
use of "zxx" to indicate non-linguistic content.
The following table illustrates the relation between the
content in the EMMA document, the emma:lang and the xml:lang:
---------------------------------------------------------------------------------------
CONTENT          emma:lang       xml:lang        processing stage
---------------------------------------------------------------------------------------
condizionatore   emma:lang="it"  xml:lang="it"   result from speech recognition
air conditioner  emma:lang="it"  xml:lang="en"   result from machine translation
id1456           emma:lang="it"  xml:lang="zxx"  result from natural language understanding
---------------------------------------------------------------------------------------
The following are examples of EMMA documents corresponding to these
three processing stages, abbreviated to show the attributes critical
to the discussion here. Note that <transcription>, <translation>, and
<understanding> are application namespace elements, not part of the
EMMA markup.
<emma:emma>
  <emma:interpretation emma:lang="it" emma:mode="voice"
      emma:medium="acoustic">
    <transcription xml:lang="it">condizionatore</transcription>
  </emma:interpretation>
</emma:emma>

<emma:emma>
  <emma:interpretation emma:lang="it" emma:mode="voice"
      emma:medium="acoustic">
    <translation xml:lang="en">air conditioner</translation>
  </emma:interpretation>
</emma:emma>

<emma:emma>
  <emma:interpretation emma:lang="it" emma:mode="voice"
      emma:medium="acoustic">
    <understanding xml:lang="zxx">id1456</understanding>
  </emma:interpretation>
</emma:emma>
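As a rough illustration of how a consumer might read the two attributes
side by side, here is a minimal Python sketch using the standard library.
It is not part of the EMMA specification. It assumes the emma prefix is
bound to the EMMA namespace URI http://www.w3.org/2003/04/emma (the
abbreviated examples above omit the declaration), and the names TEMPLATE,
STAGES, language_annotations and annotate_stages are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the emma prefix is bound to this URI. The abbreviated examples
# in the message do not declare it, so it is added here to allow parsing.
EMMA_NS = "http://www.w3.org/2003/04/emma"
# The xml prefix is always bound to this URI by the XML Namespaces rules.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical template reproducing the three abbreviated documents.
TEMPLATE = (
    '<emma:emma xmlns:emma="{ns}">'
    '<emma:interpretation emma:lang="it" emma:mode="voice" emma:medium="acoustic">'
    '<{tag} xml:lang="{lang}">{text}</{tag}>'
    '</emma:interpretation>'
    '</emma:emma>'
)

# (application element, xml:lang, content) for each processing stage.
STAGES = [
    ("transcription", "it", "condizionatore"),
    ("translation", "en", "air conditioner"),
    ("understanding", "zxx", "id1456"),
]


def language_annotations(document):
    """Return (emma:lang, xml:lang, content) for a one-interpretation document."""
    root = ET.fromstring(document)
    interpretation = root.find(f"{{{EMMA_NS}}}interpretation")
    payload = interpretation[0]  # the application-namespace child element
    return (
        interpretation.get(f"{{{EMMA_NS}}}lang"),
        payload.get(f"{{{XML_NS}}}lang"),
        payload.text,
    )


def annotate_stages():
    """Build the three stage documents and extract their language annotations."""
    docs = [
        TEMPLATE.format(ns=EMMA_NS, tag=tag, lang=lang, text=text)
        for tag, lang, text in STAGES
    ]
    return [language_annotations(doc) for doc in docs]


if __name__ == "__main__":
    for emma_lang, xml_lang, content in annotate_stages():
        print(f"{content!r}: emma:lang={emma_lang} xml:lang={xml_lang}")
```

In this sketch emma:lang is read from the interpretation and stays the
same across stages, while xml:lang is read from the application element
and follows the language of the content at each stage.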
In order to make clear these differences we will add clarifying text and
examples to the specification.
Point 2: Clarification, multiple values in emma:lang:
-----------------------------------------------------
In call center and other applications, multilingual users provide
inputs in which they switch language mid-utterance. The
emma:lang in these cases needs to indicate that the input
involves more than one language, e.g.
"quisiera hacer una collect call"
The emma:lang in this case would have the value "es en":
<emma:emma>
  <emma:interpretation emma:lang="es en" emma:mode="voice"
      emma:medium="acoustic">
    <transcription>quisiera hacer una collect call</transcription>
  </emma:interpretation>
</emma:emma>
In order to use xml:lang in this example, perhaps an additional element
could be used, e.g. <span>. Would this work?
<emma:emma>
  <emma:interpretation emma:lang="es en" emma:mode="voice"
      emma:medium="acoustic">
    <transcription xml:lang="es">quisiera hacer una
      <span xml:lang="en">collect call</span></transcription>
  </emma:interpretation>
</emma:emma>
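One way to explore the <span> question is to check how a generic XML
consumer would see the mixed-language transcription. The Python sketch
below is only an illustration under stated assumptions: it binds the emma
prefix to http://www.w3.org/2003/04/emma, writes Spanish as "es" (its
ISO 639-1 code), and uses hypothetical helper names (input_languages,
text_runs, analyse). It splits the space-separated emma:lang value and
applies ordinary xml:lang inheritance to each run of text.

```python
import xml.etree.ElementTree as ET

# Assumption: the emma prefix is bound to this URI (not declared in the
# abbreviated example).
EMMA_NS = "http://www.w3.org/2003/04/emma"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

DOCUMENT = (
    f'<emma:emma xmlns:emma="{EMMA_NS}">'
    '<emma:interpretation emma:lang="es en" emma:mode="voice" emma:medium="acoustic">'
    '<transcription xml:lang="es">quisiera hacer una '
    '<span xml:lang="en">collect call</span></transcription>'
    '</emma:interpretation>'
    '</emma:emma>'
)


def input_languages(interpretation):
    """Split the space-separated emma:lang value into a list of language tags."""
    return interpretation.get(f"{{{EMMA_NS}}}lang", "").split()


def text_runs(element, inherited=None):
    """Yield (language, text) pairs, letting children inherit xml:lang."""
    lang = element.get(XML_LANG, inherited)
    if element.text and element.text.strip():
        yield lang, element.text.strip()
    for child in element:
        yield from text_runs(child, lang)
        # Text after a child's end tag is in the parent element's language.
        if child.tail and child.tail.strip():
            yield lang, child.tail.strip()


def analyse(document=DOCUMENT):
    """Return (languages of the user's input, per-run content languages)."""
    root = ET.fromstring(document)
    interpretation = root.find(f"{{{EMMA_NS}}}interpretation")
    return input_languages(interpretation), list(text_runs(interpretation[0]))


if __name__ == "__main__":
    languages, runs = analyse()
    print("emma:lang:", languages)
    for lang, text in runs:
        print(f"xml:lang={lang}: {text}")
```

Under these assumptions the two annotations stay complementary: emma:lang
lists every language in the user's signal, while nested xml:lang values
mark which stretch of the transcription is in which language.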
Received on Tuesday, 26 June 2007 12:33:20 UTC