RE: Fw: FYI .. New article for REVIEW: xml:lang in XML document schemas

From: Addison Phillips <addison.phillips@quest.com>
Date: Thu, 18 Aug 2005 21:26:37 -0700
Message-ID: <634978A7DF025A40BFEF33EB191E13BC0C816DAE@irvmbxw01.quest.com>
To: "Felix Sasaki" <fsasaki@w3.org>, "Uma Umamaheswaran" <umavs@ca.ibm.com>, <www-international@w3.org>
Cc: <ishida@w3.org>, "Sandy Gao" <sandygao@ca.ibm.com>, <duerst@it.aoyama.ac.jp>
No. It may be that the word "interpreted" is the wrong one to use here. The actual text from XML 1.0 (Third Edition) is:

--
The intent declared with xml:lang is considered to apply to all attributes and content of the element where it is specified, unless overridden with an instance of xml:lang on another element within that content. In particular, the empty value of xml:lang is used on an element B to override a specification of xml:lang on an enclosing element A, without specifying another language. Within B, it is considered that there is no language information available, just as if xml:lang had not been specified on B or any of its ancestors.
--

Read the word "intent" in the first sentence above carefully. It refers to the interpretation of the language tag itself, not to whether that value applies to the element's contents. That value, the text says plainly, applies to all attributes and content of the element, *including* any contained elements. The description in the FAQ accurately reflects the above.

Sandy Gao's analysis is also accurate, though: the Infoset spec and related specs say nothing about xml:lang. Apparently, while a processor should treat xml:lang as a "normal" attribute, its meaning is well established and it does indeed have scope. That has awkward implications, which are dealt with imperfectly. The I18N WG has commented on this to groups such as XQuery in the past. For example, see:

http://www.w3.org/International/2005/02/xq-xt-datamodel-review.html


and also comment 7 in:

http://lists.w3.org/Archives/Member/w3c-i18n-ig/2003Jul/0035.html


Another way to interpret this is exactly as Sandy Gao states: a processor capable of interpreting xml:lang in some useful way should apply it over the entire scope of the element, but "normal" XML processing is not affected.
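
To make that distinction concrete, here is a minimal sketch of what an xml:lang-aware processor might do on top of an ordinary parse. It assumes Python's standard xml.etree.ElementTree; the function name effective_langs and the sample document are made up for illustration and come from no spec. The parser reports xml:lang only on the elements that declare it, and the scoping (including the empty-value override) is added by the extra walk.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def effective_langs(root):
    """Map each element to the language in scope for it (hypothetical helper).

    A declared xml:lang overrides the enclosing value; an empty value
    means "no language information", as described in XML 1.0.
    """
    result = {}

    def walk(elem, inherited):
        declared = elem.get(XML_LANG)
        if declared is None:
            lang = inherited      # nothing declared here: enclosing value applies
        elif declared == "":
            lang = None           # empty value cancels the enclosing language
        else:
            lang = declared       # explicit override
        result[elem] = lang
        for child in elem:
            walk(child, lang)

    walk(root, None)
    return result

# Illustrative document only.
doc = ET.fromstring(
    '<a xml:lang="en"><b>hello</b><c xml:lang="fr"><d/></c>'
    '<e xml:lang=""><f/></e></a>'
)
langs = {el.tag: lang for el, lang in effective_langs(doc).items()}

# "Normal" processing: <b> carries no xml:lang attribute of its own.
b_attr = doc.find("b").get(XML_LANG)
```

Here langs gives "en" for b, "fr" for d, and no language for e and f, while b_attr is None: the scope exists only for the processor that chooses to compute it.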

Indeed, the point of this FAQ is that xml:lang is precisely the wrong vehicle for carrying language-as-a-value. It is metadata about content that may be used to affect natural language processing and presentation.
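
A small sketch of the difference, again assuming Python's xml.etree.ElementTree. The element names user, preferredLanguage and note are hypothetical and are not taken from the FAQ: the first child carries a language as a data value, while xml:lang only labels the language in which the note text is written.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical record: <preferredLanguage> holds a language as data;
# xml:lang on <note> is metadata about the note's own text.
record = ET.fromstring(
    '<user>'
    '<preferredLanguage>fr</preferredLanguage>'
    '<note xml:lang="en">Prefers to be contacted in French.</note>'
    '</user>'
)

preferred = record.findtext("preferredLanguage")  # the value being transmitted
note_lang = record.find("note").get(XML_LANG)     # language of the note content
```

The two values differ ("fr" versus "en") and answer different questions, which is why they are kept in different places.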

Best Regards,

Addison

Addison P. Phillips
Globalization Architect, Quest Software
Chair, W3C Internationalization Core Working Group

Internationalization is not a feature.
It is an architecture. 

> -----Original Message-----
> From: www-international-request@w3.org [mailto:www-international-request@w3.org]
> On Behalf Of Felix Sasaki
> Sent: 18 August 2005 19:55
> To: Uma Umamaheswaran; www-international@w3.org
> Cc: ishida@w3.org; Sandy Gao; duerst@w3.org
> Subject: Re: Fw: FYI .. New article for REVIEW: xml:lang in XML document
> schemas
> 
> 
> Hi Sandy, hi all,
> 
> 
> >> This could be of interest to our XML schema folks ..
> >
> > Schema (and XML Infoset) currently have no special treatment for
> > xml:lang: it's just a normal attribute and appears in the infoset in
> > the same way as other attributes.
> 
> I guess the critical part of the article is:
> 
> "The xml:lang value applies to any sub-elements contained by the element.
> It also applies to attribute values associated with the element and
> sub-elements (though using natural language in attributes is not best
> practice). "
> 
> This could be changed to
> 
> "The xml:lang value can be interpreted as applying to any sub-elements
> contained by the element. It also can be interpreted as applying to
> attribute values associated with the element and sub-elements (though
> using natural language in attributes is not best practice). This
> interpretation is not provided by any XML data model (xml infoset, xml
> schema, XPath 2.0 data model), and it must be verified by additional
> processing. A facility for such processing might be the lang() function of
> XQuery/XSLT 2.0 [1], which uses the XPath expression
> (ancestor-or-self::*/@xml:lang)[last()] to gather language values from
> ancestor element nodes or the current element node."
> 
> This is quite lengthy, but would that address your concern about the topic?
> 
> Regards, Felix
> 
> [1] http://www.w3.org/TR/2005/WD-xpath-functions-20050404/#func-lang

> 
> >
> > This article suggests that "xml:lang" should be used to specify the
> > language in which the XML is written, while other language stuff should
> > be
> > used as part of the value being transmitted.
> >
> > If the world adopts this suggestion, then a bunch of things could/should
> > happen (to treat xml:lang specially, similar to the treatment to
> > namespace
> > declarations):
> > - Special treatment for xml:lang in infoset (some special property?)
> > - Special validation rule for xml:lang in schema
> > - Special treatment in data binding specs (to ignore xml:lang)
> > - ...
> >
> > I don't think any of these will happen easily. For schema, we've been
> > discussing treating all/some of xml: attribute specially. We are
> > currently
> > leaning towards not to do that.
> >
> > This article also mentions "inheriting xml:lang", which is somewhat
> > misleading. The XML spec says that xml:lang's "intent is
> > considered
> > to apply to the sub-tree". My reading of this is that xml:lang aware
> > processor can use such info to do useful things. But that doesn't imply
> > that xml:lang attribute is inherited by sub-elements in XML. The infoset
> > spec certainly has no mention of inheriting any xml:lang related infoset
> > property.
> >
> > Thanks,
> > Sandy Gao
> > XML Parser Development, IBM Canada
> > (1-905) 413-3255
> > sandygao@ca.ibm.com
> >
> >
> 
> 


Received on Friday, 19 August 2005 04:26:50 GMT
