Re: [EMMA] i18n comment: Use of xml:lang=""

I18N-4:  ACCEPT with modification

==================================================================================

http://lists.w3.org/Archives/Public/www-multimodal/2007May/0002.html
SUBSTANTIVE

 

Comment from the i18n review of:

http://www.w3.org/TR/2007/WD-emma-20070409/

 

Comment 4

At http://www.w3.org/International/reviews/0704-emma/

Editorial/substantive: S

Owner: RI

 

Location in reviewed document:

4.2.5 [http://www.w3.org/TR/2007/WD-emma-20070409/#s4.2.5]

 

Comment: 

In XML 1.0 you can indicate the lack of language information using
xml:lang="". How does EMMA allow for that with xml:lang and emma:lang?
We feel it ought to. See
http://www.w3.org/International/questions/qa-no-language

 

 

Response:

 

ACCEPT (with modification)

 

Thank you for raising this important issue. In addressing it and
reading related documents such as
http://www.w3.org/International/questions/qa-no-language, we determined
that, in addition to the use of emma:lang="", we should also address
the use of emma:lang="zxx". Below we address each in turn:

 

1. Non-linguistic input (emma:lang="zxx"):
------------------------------------------

Given the use of EMMA for capturing multimodal input, including input
using pen/ink, sensors, computer vision, etc., there are many EMMA
results that capture non-linguistic input. Examples include drawing
areas, arrows, etc. on maps, and music input for tune recognition. This
raises the question of how non-linguistic inputs should be annotated
for emma:lang. Following on from the use in xml:lang, we propose that
non-linguistic input should be marked using the value "zxx". Since we
already refer to BCP 47 and use the values from the IANA subtag
registry for emma:lang values, this does not require revision of the
EMMA markup. We will, however, add an example and clarifying text to
the EMMA specification indicating the use of emma:lang="zxx" for
non-linguistic inputs.

 

To illustrate the difference between emma:lang and xml:lang for this
kind of case: hummed input to a tune recognition application would be
emma:lang="zxx", since the input is not in a human language, but if the
result was a song title in English, that would be marked as
xml:lang="en":

 

<emma:emma>
  <emma:interpretation emma:lang="zxx" emma:mode="tune"
                       emma:medium="acoustic">
    <songtitle xml:lang="en">another one bites the dust</songtitle>
  </emma:interpretation>
</emma:emma>
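
As a rough sketch of how a consumer might read the two annotations
separately, the Python fragment below parses the tune example and
reports the language of the source input (emma:lang) and the language
of the result content (xml:lang). The xmlns:emma declaration, the
EMMA_NS constant and the describe() helper are assumptions added for
illustration; the example in this message omits the namespace
declaration, which a namespace-aware parser requires.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the emma: prefix (not shown in the example above).
EMMA_NS = "http://www.w3.org/2003/04/emma"
# Namespace that the reserved xml: prefix is always bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"

DOC = """\
<emma:emma xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation emma:lang="zxx" emma:mode="tune"
                       emma:medium="acoustic">
    <songtitle xml:lang="en">another one bites the dust</songtitle>
  </emma:interpretation>
</emma:emma>
"""


def describe(doc):
    """Return the input language and the result language of the tune example."""
    root = ET.fromstring(doc)
    interp = root.find(f"{{{EMMA_NS}}}interpretation")
    title = interp.find("songtitle")
    return {
        # Language of what the user produced (humming: not a human language).
        "input_lang": interp.get(f"{{{EMMA_NS}}}lang"),
        # Language of the application's result text (an English song title).
        "result_lang": title.get(f"{{{XML_NS}}}lang"),
        "result_text": title.text,
    }


if __name__ == "__main__":
    print(describe(DOC))
```

The two attributes are read from different elements and different
namespaces, so one does not override the other in this sketch.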

 

 

2. Non-specification (emma:lang="")
-----------------------------------

Parallel to your suggested usage for xml:lang
(http://www.w3.org/International/questions/qa-no-language), cases in
which there is no information about whether the source input is in a
particular human language, and if so which language, are annotated as
emma:lang="".

 

Furthermore, in cases where there is no explicit emma:lang annotation,
and none is inherited from a higher element in the document, the
default value for emma:lang is "", meaning that there is no information
about whether the source input is in a language and, if so, which
language.
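
The defaulting rule described above could be sketched roughly as
follows in Python. This is an illustration under stated assumptions,
not normative processing: the namespace URI, the effective_langs() and
classify() helpers, and the sample document are hypothetical, the
sample is simplified and not schema-valid EMMA, and the sketch assumes
that any ancestor element may supply the inherited value.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the emma: prefix.
EMMA_NS = "http://www.w3.org/2003/04/emma"
EMMA_LANG = f"{{{EMMA_NS}}}lang"

# Simplified sample, not schema-valid EMMA (ids and other attributes omitted).
DOC = """\
<emma:emma xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:one-of emma:lang="fr">
    <emma:interpretation/>
    <emma:interpretation emma:lang="zxx"/>
    <emma:interpretation emma:lang=""/>
  </emma:one-of>
  <emma:interpretation/>
</emma:emma>
"""


def effective_langs(elem, inherited=""):
    """List (local name, effective emma:lang) for elem and its descendants.

    An explicit emma:lang (including the empty string) wins; otherwise the
    value is inherited from the nearest ancestor; with neither, it is "".
    """
    lang = elem.get(EMMA_LANG, inherited)
    result = [(elem.tag.split("}")[-1], lang)]
    for child in elem:
        result.extend(effective_langs(child, lang))
    return result


def classify(lang):
    """Map an effective emma:lang value to the cases discussed above."""
    if lang == "":
        return "no language information"
    if lang == "zxx":
        return "non-linguistic input"
    return "linguistic input: " + lang


if __name__ == "__main__":
    for name, lang in effective_langs(ET.fromstring(DOC)):
        print(name, repr(lang), classify(lang))
```

In this sketch an explicit emma:lang="" inside an annotated parent
resets the value to "no information" rather than inheriting the
parent's language, mirroring the xml:lang="" usage raised in the
comment.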

Received on Tuesday, 26 June 2007 12:37:43 UTC