[ESW Wiki] Update of "geoFAQxmllang" by 64.105.174.187

From: <w3t-archive+esw-wiki@w3.org>
Date: Wed, 20 Jul 2005 15:46:41 -0000
To: w3t-archive+esw-wiki@w3.org
Message-ID: <20050720154641.17892.85467@localhost.localdomain>
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "ESW Wiki" for change notification.

The following page has been changed by 64.105.174.187:
http://esw.w3.org/topic/geoFAQxmllang


The comment on the change is:
initial edit to put the changes from GEO call into the document

------------------------------------------------------------------------------
  
   When should I use ''xml:lang'' and when should I define my own element or attribute for passing language values in an XML document schema (DTD)?
  
- == Answer ==
+ == Background ==
  
- Sometimes documents contain or reference different types of natural language content. Other times they need to describe natural language as a value in a data structure. Because there are different ways of referencing natural languages in XML documents, users are sometimes confused whether they should use ''xml:lang'' or define their own language-related element or attribute. 
+ Sometimes documents contain or reference different types of natural language content. Other times they need to describe natural language as a value in a data structure. Because there are different ways of referencing natural languages in XML documents, schema designers are sometimes unsure whether they should use ''xml:lang'' or define their own language-related element or attribute. 
- 
- === When to use ''xml:lang'' ===
- 
- XML 1.0 defines a common attribute ''xml:lang'' which identifies the language of text or other content (including embedded objects such as an image or sound file) contained by the element in which it appears. The ''xml:lang'' value applies to any sub-elements contained by the element and attribute values associated with the element and its descendant elements also are associated with the ''xml:lang'' (though using natural language in attributes is '''not''' best practice). The value of the ''xml:lang'' attribute is a language tag defined by RFC 3066 or its successor.
  
  For example, in XHTML 1.0, there is an ''hreflang'' attribute in the <a> element and also an ''xml:lang'' (or ''lang'' attribute, in the case of HTML 4.0) for the content of the <a> element:
  
@@ -30, +26 @@

     Click for German
  </a>}}}
  
- In this example, the text inside the <a> element is identified as being in English (xml:lang="en"). A different example from XHTML 1.0 shows how ''xml:lang'' applies to an attribute:
+ The ''xml:lang'' attribute describes the language contained by the ''<a>'' element ("Click for German"), while the ''hreflang'' attribute is meta-data, in this case describing the language of some content external to this Web page.
  
- {{{<abbr title="radio detection and ranging" xml:lang="en">
-    RADAR
+ == Answer ==
+ 
+ === When to use ''xml:lang'' ===
+ 
+ Content directly associated with the XML document (either contained within the document directly or considered part of the document when it is processed or rendered) should use the ''xml:lang'' attribute to indicate the language. ''xml:lang'' should be reserved for content authors to directly label any natural language content they may have.
+ 
+ ''xml:lang'' is defined by XML 1.0 as a common attribute that can be used to indicate the language of any element's contents. This includes any human readable text, as well as other content (such as embedded objects like images or sound files) contained by the element in which it appears. The ''xml:lang'' value applies to any sub-elements contained by the element. It also applies to attribute values associated with the element and sub-elements (though using natural language in attributes is '''not''' best practice). The value of the ''xml:lang'' attribute is a language tag defined by RFC 3066 or its successor.
+ 
+ For example, here is ''xml:lang'' on an element {{{<t>}}}:
+ 
+ {{{<t xml:lang="en">
+    This is some text contained by the &lt;t&gt; element. The use
+    of the xml:lang attribute indicates the language so that, for
+    example, the correct font could be applied when rendered or
+    the correct spell-checker could be used when proofing the
+    document. If we didn't have xml:lang, we might have problems
+    with embedded content, such as the phrase <span xml:lang="fr">
+    C'est la vie</span>, which is in another language.
+ </t>}}}
+ 
+ This example from XHTML 1.0 shows how ''xml:lang'' applies to an attribute:
+ 
+ {{{<abbr title="simple object access protocol" xml:lang="en">
+    SOAP
  </abbr>}}}
  
- This is a good example of why applying ''xml:lang'' to an attribute is not desirable: there is no way to supply more than one language of the {{{title}}} attribute!
+ Applying ''xml:lang'' to an attribute is not desirable: there is no way to supply more than one language for the {{{title}}} attribute, or to separate the language used in the attribute from that used in the element. Consider:
+ 
+ {{{
+ <p xml:lang="fr"><span title="anglais"><a href="qa-css-charset.en.html" lang="en" xml:lang="en">English</a></span></p>
+ }}}
  
  === When to use your own element or attribute ===
  
- Sometimes you need to convey a language as an information item of its own. For example, if you created an XML document describing your DVD collection, you might want an element to indicate what languages are available on the soundtrack portion of each disc. Or if you were creating a customer database, you might have a field for the customer's language preference. In these cases you want to store the language ''value'' as an element or attribute. You still want to use RFC 3066 (or its successor) to form the value, but you should define an element or attribute of your own with a different name and not use the ''xml:lang'' attribute. This is because the item you are describing is not a piece of content in or referenced directly by the XML document. Instead it is data or meta-data related to that item.
+ When the language value is really an attribute of or metadata about some external content, then ''xml:lang'' is not an appropriate choice. In these cases you want to store language information, but the language doesn't refer directly to the content of the XML document (or to external content, such as images, which is processed as part of the document). In this case you should define an element or attribute of your own using a different name and not use the ''xml:lang'' attribute. The value of the element or attribute should use RFC 3066 (or its successor), just like ''xml:lang''.
+ 
+ Some examples of this might include:
+   * an element in an XML document describing your DVD collection to indicate which languages are available on the soundtrack
+   * an element in a customer database with a field for the customer's language preference
+   * an attribute of a link element (such as {{{<a>}}} in XHTML) pointing to a translation of this document into another language
+ 
+ The reason you would choose to create your own element (or attribute) is to convey the language as a value--as part of a data structure or as meta-data about an external document--rather than to indicate the language of a specific piece of content. Reserving ''xml:lang'' for labeling content avoids creating problems for content authors who need to label content for processing purposes.
  
  For example, an XML document might look like this:
  
- {{{<item>
+ {{{<item type="DVD">
-   <title xml:lang="en">Casablanca</title> <!-- indicates the language of of the text 'Casablanca' -->
+   <title xml:lang="en">Casablanca</title>    <!-- indicates the language of the text 'Casablanca' -->
-   <runningTime value="137" /> <!-- not language affected -->
+   <runningTime value="137" />                <!-- not language affected -->
-   <dialogue language="zh-HK" /> <!-- indicates a language value (attribute) of the dialogue element -->
-   <subtitles track="1" language="zh-Hant" />
+   <dialogue>zh-HK</dialogue>                 <!-- indicates the language of the dialogue -->
+   <subtitles track="1" language="zh-Hant" /> <!-- this track contains Traditional Chinese subtitles -->
    <subtitles track="2" language="zh-Hans" /> 
  </item>}}}
  
- In this example, the ''xml:lang'' attribute conveys information about the natural language of text appearing in this document. The ''language'' attribute is defined in the XML document schema for the elements <dialogue> and <subtitles> and conveys a natural language ''value'' associated with these elements. For example, it conveys the information that the subtitles on Track #1 are written or displayed in Traditional Chinese ("zh-Hant").
+ In this example, the ''xml:lang'' attribute conveys information about the natural language of text appearing in this document. The ''dialogue'' element and the ''language'' attribute of the ''subtitles'' element are defined in the XML document schema and convey a natural language value associated with these items. For example, the ''language'' attribute on the first ''subtitles'' element conveys the information that the subtitles on Track #1 are written or displayed in Traditional Chinese ("zh-Hant").
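
A consumer of such a document might read the two kinds of language information differently. The following is only a minimal sketch in Python using the standard library's xml.etree.ElementTree; the variable names are illustrative assumptions, and the element names come from the example above.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

DOC = """<item type="DVD">
  <title xml:lang="en">Casablanca</title>
  <runningTime value="137" />
  <dialogue>zh-HK</dialogue>
  <subtitles track="1" language="zh-Hant" />
  <subtitles track="2" language="zh-Hans" />
</item>"""

item = ET.fromstring(DOC)

# xml:lang labels the language of text contained in the document itself.
title_text_language = item.find("title").get(XML_LANG)

# The schema-defined element and attribute carry language *values* (data).
dialogue_language = item.find("dialogue").text.strip()
subtitle_languages = {
    s.get("track"): s.get("language") for s in item.findall("subtitles")
}

print(title_text_language, dialogue_language, subtitle_languages)
```

Note that the ''dialogue'' element carries no ''xml:lang'' of its own: its content is a language tag used as data, not text written in Hong Kong Chinese.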
- 
- In addition, while it is possible to define your own formats for all the various values that you need, it is sometimes helps interoperability to define formats using a shared vocabulary, such as XML Schema. XML Schema provides a type for language values (xsi:language) which is defined using RFC 3066.
- 
  
  == By the way ==
  
- It's important to remember that ''xml:lang'' has scope. This can be used to identify the language for a lot of content (without having redundant language tags on every element). For example, it is good practice to put ''xml:lang'' into your {{{<html>}}} element at the start of an XHTML document:
+ It's important to remember that ''xml:lang'' has scope. This can be used to identify the language for a lot of content (without having redundant language tags on every element). For example, it is good practice to put ''xml:lang'' into your {{{<html>}}} element at the start of an XHTML document. For more information, see [http://www.w3.org/International/articles/language-tags/].
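
To illustrate scope, here is a minimal sketch of how a processor might compute the language in effect for each element. It assumes Python's standard xml.etree.ElementTree; the function name effective_languages and the sample element names are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_languages(element, inherited=None, out=None):
    """Map each element tag to the language in scope for it.

    Keyed by tag for brevity, so this sketch assumes unique tag names.
    """
    if out is None:
        out = {}
    # An element's own xml:lang overrides the inherited value;
    # otherwise the value from the nearest ancestor applies.
    lang = element.get(XML_LANG, inherited)
    out[element.tag] = lang
    for child in element:
        effective_languages(child, lang, out)
    return out


doc = ET.fromstring(
    '<a xml:lang="en"><b>example 1</b><c xml:lang="">example 2</c>'
    '<f xml:lang="de">example 5</f></a>'
)
langs = effective_languages(doc)
print(langs)
```

Here ''b'' inherits "en" from ''a'', ''f'' overrides it with "de", and the empty value on ''c'' unsets the inherited language.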
  
- {{{<html
-   xmlns="http://www.w3.org/1999/xhtml" 
-   lang="en"
-   xml:lang="en">
- }}}
- 
- Consider the following example document:
- 
- {{{
-   <a xml:lang="en">
-      <b>example 1</b>
-      <c xml:lang="">example 2</c>
-      <d>example 3</d>
-      <e attr="example 4">en-US</e>
-      <f xml:lang="de">example 5</f>
-   </a>
- }}}
- 
- In the example, the contents of elements <b> and <d> and the attribute ''attr'' of element <e> are all tagged as being in English ('en'). The content of element <c> is tagged with the "empty language" (that is, the language is specifically not identified or the text is not in a natural language: this is an example of unsetting the value of ''xml:lang''). The content of element <f> is tagged as being in German ('de'). Element <e> itself contains what appears to be a language tag ("en-US"): in that case its content is not in English, but rather conveys the value of "U.S. English".
- 
Received on Wednesday, 20 July 2005 18:32:44 GMT
