
Re: Fw: FYI .. New article for REVIEW: xml:lang in XML document schemas

From: Felix Sasaki <fsasaki@w3.org>
Date: Fri, 19 Aug 2005 13:59:36 +0900
To: "Addison Phillips" <addison.phillips@quest.com>, "Uma Umamaheswaran" <umavs@ca.ibm.com>, www-international@w3.org
Cc: ishida@w3.org, "Sandy Gao" <sandygao@ca.ibm.com>, duerst@it.aoyama.ac.jp
Message-ID: <op.svqn5mzlx1753t@ibm-60d333fc0ec>

I always thought that the problem the I18N WG addressed was the word
"intent" in the XML 1.0 spec and the fact that this intent is not realized
in the infoset spec(s). I know that the FAQ is about the fact that xml:lang
is not supposed to be used as a language-as-a-value mechanism, but it
might be worth noting the XML 1.0 vs. infoset problem, with a reference
to the two specs. That is what Sandy's comment (I guess) and my
(obviously unclear) attempt at a rewording were about. A reference to the
lang() function of XQuery/XSLT 2.0 (QT) is of course no solution for the
language-as-a-value use of lang values, so it seems to be misleading.
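
To make the XML 1.0 vs. infoset gap concrete, here is a minimal Python
sketch (an illustration only, not taken from either spec). The parser
exposes xml:lang only on the element where it is written, so the language
in scope has to be computed by walking up the ancestors, in the spirit of
(ancestor-or-self::*/@xml:lang)[last()]. The names effective_lang,
lang_matches and the sample document are hypothetical, and lang_matches is
only a rough approximation of lang().

```python
import xml.etree.ElementTree as ET

# ElementTree stores xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_lang(root, target):
    """Return the xml:lang value in scope for target, or None if there is none."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        if XML_LANG in node.attrib:
            # An empty value overrides outer declarations: no language info.
            return node.attrib[XML_LANG] or None
        node = parents.get(node)
    return None


def lang_matches(root, target, testlang):
    """Approximate lang(): case-insensitive match on the tag or its prefix."""
    value = effective_lang(root, target)
    if value is None:
        return False
    value, testlang = value.lower(), testlang.lower()
    return value == testlang or value.startswith(testlang + "-")


# Hypothetical sample document.
DOC = """<doc xml:lang="en">
  <p id="a">inherits en</p>
  <p id="b" xml:lang="fr-CA"><em id="c">inherits fr-CA</em></p>
  <p id="d" xml:lang=""><em id="e">no language information</em></p>
</doc>"""

root = ET.fromstring(DOC)
by_id = {el.get("id"): el for el in root.iter() if el.get("id")}

print(by_id["c"].get(XML_LANG))            # attribute as the data model sees it
print(effective_lang(root, by_id["c"]))    # language in scope per XML 1.0
print(effective_lang(root, by_id["e"]))    # empty xml:lang resets the scope
print(lang_matches(root, by_id["c"], "fr"))
```

The first lookup finds no attribute on the nested element, while the
scoped lookup resolves to the value declared on its parent. Neither
function turns xml:lang into a carrier for language-as-a-value; they only
recover the scope that XML 1.0 describes.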

-- Felix

> On Fri, 19 Aug 2005 13:26:37 +0900, Addison Phillips  
> <addison.phillips@quest.com> wrote:

> No, it might be that the word "interpreted" is wrongly used here. The  
> actual quote from XML 1.0e3 is:
> --
> The intent declared with xml:lang is considered to apply to all  
> attributes and content of the element where it is specified, unless  
> overridden with an instance of xml:lang on another element within that  
> content. In particular, the empty value of xml:lang is used on an  
> element B to override a specification of xml:lang on an enclosing  
> element A, without specifying another language. Within B, it is  
> considered that there is no language information available, just as if  
> xml:lang had not been specified on B or any of its ancestors.
> --
> Read the meaning of the word "intent" in the first sentence above  
> carefully. It applies to the interpretation of the language tag itself,  
> not the application of that value to element contents. That value, it  
> says plainly, applies to all attributes and contents of the element,  
> *including* any contained elements. It is very clear that the  
> description in the FAQ is an accurate reflection of the above.
> Sandy Gao's analysis is also accurate, though: the Infosets spec, etc.  
> don't say anything about xml:lang. Apparently, while xml:lang should be  
> considered as a "normal" attribute from the point of view of a  
> processor, xml:lang's meaning is well established and it does indeed  
> have scope, which has snarky implications, which are imperfectly dealt  
> with. The I18N WG has commented on this to groups such as XQuery and so  
> forth in the past. For example, see:
> http://www.w3.org/International/2005/02/xq-xt-datamodel-review.html
> and also comment 7 in:
> http://lists.w3.org/Archives/Member/w3c-i18n-ig/2003Jul/0035.html
> Another way to interpret this is exactly as Sandy Gao states, which is  
> that a processor (capable of) interpreting xml:lang in some useful way  
> should apply it over the entire scope of the element, but that "normal"  
> XML processing is not affected.
> Indeed, this FAQ is to point out that xml:lang is precisely the wrong  
> vehicle for carrying language-as-a-value. It is metadata about content  
> that may be used to affect natural language processing and presentation.
> Best Regards,
> Addison
> Addison P. Phillips
> Globalization Architect, Quest Software
> Chair, W3C Internationalization Core Working Group
> Internationalization is not a feature.
> It is an architecture.
>> -----Original Message-----
>> From: www-international-request@w3.org
>> [mailto:www-international-request@w3.org] On Behalf Of Felix Sasaki
>> Sent: 18 August 2005 19:55
>> To: Uma Umamaheswaran; www-international@w3.org
>> Cc: ishida@w3.org; Sandy Gao; duerst@w3.org
>> Subject: Re: Fw: FYI .. New article for REVIEW: xml:lang in XML document schemas
>> Hi Sandy, hi all,
>> >> This could be of interest to our XML schema folks ..
>> >
>> > Schema (and XML Infoset) currently have no special treatment for
>> > xml:lang: it's just a normal attribute and appears in the infoset in
>> > the same way as other attributes.
>> I guess the critical part of the article is:
>> "The xml:lang value applies to any sub-elements contained by the
>> element. It also applies to attribute values associated with the element
>> and sub-elements (though using natural language in attributes is not
>> best practice)."
>> This could be changed to
>> "The xml:lang value can be interpreted as applying to any sub-elements
>> contained by the element. It also can be interpreted as applying to
>> attribute values associated with the element and sub-elements (though
>> using natural language in attributes is not best practice). This
>> interpretation is not provided by any XML data model (XML Infoset, XML
>> Schema, XPath 2.0 data model), and it must be verified by additional
>> processing. A facility for such processing might be the lang() function
>> of XQuery/XSLT 2.0 [1], which uses the XPath expression
>> (ancestor-or-self::*/@xml:lang)[last()] to gather language values from
>> ancestor element nodes or the current element node."
>> This is quite lengthy, but would that address your concern about the
>> topic?
>> Regards, Felix
>> [1] http://www.w3.org/TR/2005/WD-xpath-functions-20050404/#func-lang
>> >
>> > This article suggests that "xml:lang" should be used to specify the
>> > language in which the XML is written, while other language stuff
>> > should be used as part of the value being transmitted.
>> >
>> > If the world adopts this suggestion, then a bunch of things
>> > could/should happen (to treat xml:lang specially, similar to the
>> > treatment of namespace declarations):
>> > - Special treatment for xml:lang in infoset (some special property?)
>> > - Special validation rule for xml:lang in schema
>> > - Special treatment in data binding specs (to ignore xml:lang)
>> > - ...
>> >
>> > I don't think any of these will happen easily. For schema, we've been
>> > discussing treating all/some of the xml: attributes specially. We are
>> > currently leaning towards not doing that.
>> >
>> > This article also mentions "inheriting xml:lang", which is somewhat
>> > misleading. The XML spec says that xml:lang's "intent is considered
>> > to apply to the sub-tree". My reading of this is that an xml:lang-aware
>> > processor can use such info to do useful things. But that doesn't imply
>> > that the xml:lang attribute is inherited by sub-elements in XML. The
>> > infoset spec certainly has no mention of inheriting any
>> > xml:lang-related infoset property.
>> >
>> > Thanks,
>> > Sandy Gao
>> > XML Parser Development, IBM Canada
>> > (1-905) 413-3255
>> > sandygao@ca.ibm.com
>> >
>> >
Received on Friday, 19 August 2005 04:59:53 UTC
