Re: XML Language Identifier (fwd)

Jane Hunter <jane@dstc.edu.au> wrote:

>The MPEG-7 people want to be able to describe the language of the MPEG-7 
>description using:
> - Language Code - code from ISO 639, RFC 1766
> - Country Code - code from ISO 3166
> - Character Set - IANA identifier

RFC 1766 language tags are made up of an ISO 639 language code plus an 
optional ISO 3166 country code, e.g. "en-us", "en-gb", etc.
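
As a rough illustration of that structure, here is a minimal Python 
sketch that splits a simple tag into its two parts.  The function name 
split_language_tag is made up for this example, and the sketch assumes 
only the common "language" or "language-country" forms; it treats the 
first subtag as a country code only when it is two letters long, and 
it does not handle the "i-" or "x-" prefixes that RFC 1766 also allows.

```python
def split_language_tag(tag):
    """Split a simple RFC 1766 tag into (language, country).

    Hypothetical helper for illustration only: it covers tags such as
    "en" or "en-us", not IANA-registered ("i-") or private ("x-") tags.
    """
    subtags = tag.split("-")
    language = subtags[0].lower()
    country = None
    # A two-letter first subtag is read as an ISO 3166 country code.
    if len(subtags) > 1 and len(subtags[1]) == 2 and subtags[1].isalpha():
        country = subtags[1].upper()
    return language, country


for tag in ("en-us", "en-gb", "fr"):
    print(tag, split_language_tag(tag))
```

The upper and lower casing here is only a display convention; the tags 
themselves are not case sensitive.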

>Ideally this would be covered using the XML Language identifier:
>http://www.w3.org/TR/REC-xml#sec-lang-tag

The XML Language identifier (xml:lang) uses RFC 1766 language tags.

>But when I read through the XML spec it appears that you can have either the 
>2-letter language code or the IANA character set code - not both. Can you 
>confirm this and if it's correct, why have they done this?

You are confusing two quite different things:

1.  Language.  Each sentence, or word, or even character, of an XML 
    document may have a different language, indicated using an xml:lang 
    attribute, see:
       http://www.w3.org/TR/REC-xml#sec-lang-tag

2.  Character set encoding.  Each XML entity (for a simple document, 
    the entire document) must be encoded the same way throughout.  
    This is indicated using an encoding declaration, see:
       http://www.w3.org/TR/REC-xml.html#NT-EncodingDecl
    If the XML document is encoded using UTF-8 or UTF-16 then the 
    encoding declaration may be omitted.
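
To show the two mechanisms side by side, here is a small Python sketch 
using the standard xml.etree.ElementTree module.  The element names 
(description, title) and the sample text are invented for the example 
and are not taken from MPEG-7.  The encoding is declared once, at the 
top of the document, while xml:lang is set on individual elements.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document: one encoding declaration for the whole file,
# but a separate xml:lang value wherever the language changes.
document = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    '<description xml:lang="en-gb">\n'
    '  <title xml:lang="en-us">Color film</title>\n'
    '  <title xml:lang="fr">R\u00e9sum\u00e9</title>\n'
    '</description>\n'
).encode("iso-8859-1")

root = ET.fromstring(document)
root_lang = root.get(XML_LANG)
titles = [(t.get(XML_LANG), t.text) for t in root.iter("title")]

print(root_lang)
for lang, text in titles:
    print(lang, text)
```

Note that ElementTree only reports an xml:lang attribute on the element 
that carries it; working out the inherited language of a child element 
is left to the application.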

>Is it possible to 
>define all three attributes using xml:lang 

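No.  xml:lang carries only the language code and the optional country 
code; the character encoding is declared separately.  See above.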

> or do we need to define a new 
>structure?

No.  xml:lang and the encoding declaration together already cover all 
three items.  See above.

If you want more information, you may do any of the following:

-  Mail the W3C's public Internationalisation mailing list 
   (www-international@w3.org).

-  If you are employed by a member of the W3C, join the W3C's 
   Internationalisation Interest Group by mailing the W3C I18N IG 
   Chair, Martin Dürst (duerst@w3.org).

-  If you are employed by a member of the W3C, join the W3C's 
   Internationalisation Working Group by getting your W3C Advisory 
   Committee representative to mail the W3C I18N WG Chair, 
   Misha Wolf (misha.wolf@reuters.com).

Misha

[This mail was written using voice recognition software]


Received on Wednesday, 10 November 1999 11:20:35 UTC