- From: <johnston@research.att.com>
- Date: Tue, 26 Jun 2007 08:31:10 -0400
- To: <public-i18n-core@w3.org>, <www-multimodal@w3.org>
- Message-ID: <0C50B346CAD5214EA8B5A7C0914CF2A448575B@njfpsrvexg3.research.att.com>
I18N-2 ACCEPT
====================================================================
http://lists.w3.org/Archives/Public/www-multimodal/2007May/0005.html

SUBSTANTIVE (xml:lang vs emma:lang)

Comment from the i18n review of:
http://www.w3.org/TR/2007/WD-emma-20070409/

Comment 2
At http://www.w3.org/International/reviews/0704-emma/
Editorial/substantive: S
Owner: RI

Location in reviewed document:
4.2.5 [http://www.w3.org/TR/2007/WD-emma-20070409/#s4.2.5]

Comment:
It's not at all clear to us what the difference is between emma:lang and xml:lang, the relationship between them, or when we should use which. (It might help to create examples that show the use of xml:lang as well as emma:lang.)

[[In order handle inputs involving multiple languages, such as through code switching, the emma:lang tag MAY contain several language identifiers separated by spaces.]]

This is definitely something you cannot do with xml:lang, but we are wondering what is the value of doing it anyway. We are not sure what benefit it would provide.

RESPONSE:

We address each of these two points in turn:

Point 1: ACCEPT Clarification of emma:lang vs xml:lang function

The W3C multimodal working group accepts that it is important to make clear the differences between the xml:lang and emma:lang attributes and plans to add clarifying text to the emma:lang section in the next draft of the EMMA specification.

The xml:lang and emma:lang attributes serve distinct and equally important purposes. The role of xml:lang is to indicate the language used for content in an XML element or document. In contrast, the emma:lang attribute is used to indicate the language employed by a user when entering an input into a spoken or multimodal dialog system. Critically, emma:lang annotates the language of the signal originating from the user rather than the specific tokens used at a particular stage of processing.

This is most clearly illustrated by an example involving multiple stages of processing of a user input, which is the primary use of EMMA markup. Consider the following scenario: EMMA is being used to represent three stages in the processing of a spoken input to a system for ordering products. The user input is in Italian. After speech recognition, the user input is first translated into English; a natural language understanding system then converts the English translation into a product ID (which is not in any particular language). Since the input signal is a user speaking Italian, the emma:lang will be emma:lang="it" on all of these stages of processing. The xml:lang attribute, in contrast, will initially be "it"; after translation the xml:lang will be "en", and after language understanding "zxx", assuming the use of "zxx" to indicate non-linguistic content.

The following table illustrates the relation between the content in the EMMA document, the emma:lang and the xml:lang:

--------------------------------------------------------------------------------------
CONTENT          emma:lang       xml:lang        processing stage
--------------------------------------------------------------------------------------
condizionatore   emma:lang="it"  xml:lang="it"   result from speech recognition
air conditioner  emma:lang="it"  xml:lang="en"   result from machine translation
id1456           emma:lang="it"  xml:lang="zxx"  result from natural language understanding

The following are examples of EMMA documents corresponding to these three processing stages. They are abbreviated to show only the attributes critical to the discussion here. Note that <transcription>, <translation>, and <understanding> are application namespace elements, not part of the EMMA markup.

<emma:emma>
  <emma:interpretation emma:lang="it" emma:mode="voice" emma:medium="acoustic">
    <transcription xml:lang="it">condizionatore</transcription>
  </emma:interpretation>
</emma:emma>

<emma:emma>
  <emma:interpretation emma:lang="it" emma:mode="voice" emma:medium="acoustic">
    <translation xml:lang="en">air conditioner</translation>
  </emma:interpretation>
</emma:emma>

<emma:emma>
  <emma:interpretation emma:lang="it" emma:mode="voice" emma:medium="acoustic">
    <understanding xml:lang="zxx">id1456</understanding>
  </emma:interpretation>
</emma:emma>

In order to make these differences clear we will add clarifying text and examples to the specification.

Point 2: Clarification, multiple values in emma:lang:
-----------------------------------------------------

In call center and other applications, multilingual users provide inputs in which they switch language mid-utterance. The emma:lang in these cases needs to indicate that the input involved more than one language, e.g. "quisiera hacer una collect call". The emma:lang in this case would have the value "es en":

<emma:emma>
  <emma:interpretation emma:lang="es en" emma:mode="voice" emma:medium="acoustic">
    <transcription>quisiera hacer una collect call</transcription>
  </emma:interpretation>
</emma:emma>

In order to use xml:lang in this example, perhaps an additional element could be used, e.g. <span>. Would this work?

<emma:emma>
  <emma:interpretation emma:lang="es en" emma:mode="voice" emma:medium="acoustic">
    <transcription xml:lang="es">quisiera hacer una <span xml:lang="en">collect call</span></transcription>
  </emma:interpretation>
</emma:emma>
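
For anyone who wants to check the two attributes side by side, the following is a minimal Python sketch, not part of the EMMA specification. It reads emma:lang (the language of the user's signal, possibly several space-separated identifiers) and xml:lang (the language of each piece of content) from documents like the ones above. Several things here are assumptions: the xmlns:emma declaration is added because the abbreviated examples omit it, the namespace URI is the one we believe EMMA uses, the function name signal_and_content_languages is purely illustrative, and Spanish is written with the registered subtag "es".

```python
import xml.etree.ElementTree as ET

# Assumed EMMA namespace URI; the abbreviated examples do not declare it.
EMMA_NS = "http://www.w3.org/2003/04/emma"
# The xml: prefix is always bound to this namespace.
XML_NS = "http://www.w3.org/XML/1998/namespace"

STAGE_DOC = """
<emma:emma xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation emma:lang="it" emma:mode="voice" emma:medium="acoustic">
    <translation xml:lang="en">air conditioner</translation>
  </emma:interpretation>
</emma:emma>
"""

MIXED_DOC = """
<emma:emma xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation emma:lang="es en" emma:mode="voice" emma:medium="acoustic">
    <transcription xml:lang="es">quisiera hacer una <span xml:lang="en">collect call</span></transcription>
  </emma:interpretation>
</emma:emma>
"""


def signal_and_content_languages(doc):
    """Return (signal_langs, content_lang, text) for every element that
    carries xml:lang inside an emma:interpretation (hypothetical helper)."""
    root = ET.fromstring(doc)
    rows = []
    for interp in root.iter(f"{{{EMMA_NS}}}interpretation"):
        # emma:lang may hold several identifiers separated by spaces.
        signal_langs = interp.get(f"{{{EMMA_NS}}}lang", "").split()
        for child in interp:
            # Walk the child and any nested elements such as <span>.
            for element in child.iter():
                content_lang = element.get(f"{{{XML_NS}}}lang")
                if content_lang is not None:
                    text = "".join(element.itertext())
                    rows.append((signal_langs, content_lang, text))
    return rows


if __name__ == "__main__":
    for doc in (STAGE_DOC, MIXED_DOC):
        for row in signal_and_content_languages(doc):
            print(row)
```

For the translation stage the sketch reports a signal language of "it" against a content language of "en". For the code-switching input it reports both signal languages on the interpretation, plus one xml:lang value for the transcription and one for the nested span.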
Received on Tuesday, 26 June 2007 12:33:30 UTC