XML spec error -- Reference to IANA

From: Misha Wolf <misha.wolf@reuters.com>
Date: Thu, 11 Nov 1999 18:33:23 +0000 (GMT)
Message-Id: <199911111833.NAA27952@tux.w3.org>
To: xml-editor@w3.org
Cc: Jane Hunter <jane@dstc.edu.au>, w3c i18n ig <w3c-i18n-ig@w3.org>

Jane Hunter has pointed out the following problem in the XML specification:

Section 2.12, Language Identification, contains:

   The Langcode may be any of the following:

   -  ...

   -  a language identifier registered with the Internet Assigned Numbers 
      Authority [IANA] ...

   -  ...

The string "[IANA]" links to:

   http://www.w3.org/TR/REC-xml#IANA

which contains text related to character sets, not language codes:

   IANA
      (Internet Assigned Numbers Authority) Official Names for Character 
      Sets, ed. Keld Simonsen et al.  See 
      ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets.

Misha

[This mail was written using voice recognition software]

-----

> Thanks Misha,
> 
> One reason why this is so confusing is that the link in the XML Spec to IANA 
> language identifiers takes you to IANA character sets:
> 
> http://www.w3.org/TR/REC-xml#NT-LanguageID
> http://www.w3.org/TR/REC-xml#IANA
> 
> jane
> 
> > >But when I read through the XML spec it appears that you can have either the 
> > >2-letter language code or the IANA character set code - not both. Can you 
> > >confirm this and, if it's correct, why have they done this?
> > 
> > You are confusing two quite different things:
> > 
> > 1.  Language.  Each sentence, or word, or even character, of an XML 
> >     document may have a different language, indicated using an xml:lang 
> >     attribute, see:
> >        http://www.w3.org/TR/REC-xml#sec-lang-tag
> > 
> > 2.  Character set encoding.  An entire XML document must be encoded the 
> >     same way.  This is indicated using an encoding declaration, see:
> >        http://www.w3.org/TR/REC-xml.html#NT-EncodingDecl
> >     If the XML document is encoded using UTF-8 or UTF-16 then the 
> >     encoding declaration may be omitted.
> > 
> > >Is it possible to 
> > >define all three attributes using xml:lang 
> > 
> > No.  See above.
> > 
> > > or do we need to define a new 
> > >structure?
> > 
> > No.  See above.
> > 
> > If you want more information, you may do any of the following:
> > 
> > -  Mail the W3C's public Internationalisation mailing list 
> >    (www-international@w3.org).
> > 
> > -  If you are employed by a member of the W3C, join the W3C's 
> >    Internationalisation Interest Group by mailing the W3C I18N IG 
> >    Chair, Martin Dürst (duerst@w3.org).
> > 
> > -  If you are employed by a member of the W3C, join the W3C's 
> >    Internationalisation Working Group by getting your W3C Advisory 
> >    Committee representative to mail the W3C I18N WG Chair, 
> >    Misha Wolf (misha.wolf@reuters.com).
> > 
> > Misha
> > 
> > -----------------------------------------------------------------
> >         Visit our Internet site at http://www.reuters.com
> > 
> > Any views expressed in this message are those of  the  individual
> > sender,  except  where  the sender specifically states them to be
> > the views of Reuters Ltd.
> > 
> 
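
The distinction drawn in the quoted reply (one encoding declaration for the
whole document, but an xml:lang value that can change from element to element)
can be sketched in Python. This is only an illustration under stated
assumptions: the sample document, the element names and the helper
languages() are hypothetical and do not come from the thread or from the
XML specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (not from the thread). The encoding
# declaration applies to the entire document; xml:lang is set per element.
DOC = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    '<greetings xml:lang="en">\n'
    '  <p>Hello</p>\n'
    '  <p xml:lang="fr">Café</p>\n'
    '</greetings>\n'
).encode("iso-8859-1")

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def languages(data):
    """Return (tag, effective language) pairs in document order.

    An element without its own xml:lang inherits the value of its
    nearest ancestor that has one.
    """
    result = []

    def walk(elem, inherited):
        lang = elem.get(XML_LANG, inherited)
        result.append((elem.tag, lang))
        for child in elem:
            walk(child, lang)

    walk(ET.fromstring(data), None)
    return result


for tag, lang in languages(DOC):
    print(tag, lang)
```

The parser decodes the bytes once, using the single encoding declaration,
while the language reported for each element depends on where xml:lang is
set. Neither mechanism can stand in for the other.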

Received on Thursday, 11 November 1999 13:33:58 GMT