Re: Fw: FYI .. New article for REVIEW: xml:lang in XML document schemas

From: Sandy Gao <sandygao@ca.ibm.com>
Date: Sun, 21 Aug 2005 09:59:49 -0400
To: "Felix Sasaki" <fsasaki@w3.org>
Cc: "Addison Phillips" <addison.phillips@quest.com>, duerst@it.aoyama.ac.jp, ishida@w3.org, Uma Umamaheswaran <umavs@ca.ibm.com>, www-international@w3.org
Message-ID: <OFF8656202.F9C4F29B-ON85257064.001390A4-85257064.004CE6E0@ca.ibm.com>
Thanks to Felix and Addison for your responses.

I am in no way an expert on this specific issue and I don't have the 
necessary background around it, so please forgive my naive questions.

One thing I wasn't clear about was the real intent of this article. 
Does it solely target schema/XML authors, to help them use xml:lang 
wisely, or is this the first step towards helping the industry (including 
existing standards) move to a state where xml:lang is *properly* handled?

If it's the former (and the article conveys such intent clearly), then any 
wording would seem OK to me (given my background). But my fear is that 
it's not that simple. From your responses, I get the feeling that you are 
not entirely happy about what's going on in Infoset (and possibly 
schema?). From reading the article, I get the feeling that some readers 
may wish more to be done. So I think it's important to clarify the 
intent/scope of this article, as well as the intent/scope 
Sorry I can't provide an alternative wording. (I'm not even an expert in 
English. :p) But something along the lines of what Addison said would be a 
good start:

> that a processor (capable of) interpreting xml:lang in some useful way 
> should apply it over the entire scope of the element, but that "normal" 
> XML processing is not affected.

(and will/should not be affected.) So the essence is that xml:lang is 
dealt with at a higher level than infoset.

BTW, my other comment was about the "inheritance" of xml:lang. I think 
this needs some clarification as well. To accurately reflect the current 
situation, we should make it clear that what's inherited is the semantics 
of xml:lang, not the attribute (an AII in infoset) itself.
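
To illustrate the distinction, here is a minimal Python sketch using the 
standard library's xml.etree.ElementTree. The sample document, the element 
ids and the helper name effective_lang are made up for illustration. It 
shows only one way an xml:lang-aware application might layer the scoping 
rule on top of what a plain parser reports; none of the specs prescribe it.

```python
import xml.etree.ElementTree as ET

# ElementTree reports xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample document, for illustration only.
DOC = """<doc xml:lang="en">
  <p id="a">Hello</p>
  <p id="b" xml:lang="fr">Bonjour <em id="c">monde</em></p>
  <p id="d" xml:lang=""><span id="e">???</span></p>
</doc>"""


def effective_lang(elem, parents):
    """Return the language in scope for elem, or None if there is none.

    Walks from elem up through its ancestors and uses the nearest
    xml:lang declaration. An empty value means "no language information".
    """
    node = elem
    while node is not None:
        value = node.get(XML_LANG)
        if value is not None:
            return value or None
        node = parents.get(node)
    return None


root = ET.fromstring(DOC)
# ElementTree keeps no parent pointers, so build a child -> parent map.
parents = {child: parent for parent in root.iter() for child in parent}
by_id = {e.get("id"): e for e in root.iter() if e.get("id")}

# The attribute itself is present only where it was written ...
print(by_id["a"].get(XML_LANG))             # None
# ... while its meaning is scoped over the subtree.
print(effective_lang(by_id["a"], parents))  # en
print(effective_lang(by_id["c"], parents))  # fr
print(effective_lang(by_id["e"], parents))  # None (reset by xml:lang="")
```

The parser-level view (the first print) is what the infoset describes. The 
other three lines are application-level interpretation.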

Thanks,
Sandy Gao
XML Parser Development, IBM Canada
(1-905) 413-3255
sandygao@ca.ibm.com




"Felix Sasaki" <fsasaki@w3.org> 
08/19/2005 12:59 AM

To
"Addison Phillips" <addison.phillips@quest.com>, Uma 
Umamaheswaran/Toronto/IBM@IBMCA, www-international@w3.org
cc
ishida@w3.org, Sandy Gao/Toronto/IBM@IBMCA, duerst@it.aoyama.ac.jp
Subject
Re: Fw: FYI .. New article for REVIEW: xml:lang in XML document schemas






I always thought that the problem the I18N WG addressed is the word 
"intent" in the XML 1.0 spec and the non-realization of this intention in 
the infoset spec(s). I know that the FAQ is about the fact that xml:lang 
is not supposed to be used as a language-as-a-value mechanism, but it 
might be worth noting the XML 1.0 vs. infoset problem, with a reference 
to the two specs. That is what Sandy's comment (I guess) and my 
(obviously unclear) attempt at a rewording were about. A reference to the 
lang() function of QT is of course no solution to the language-as-a-value 
purpose of lang values, so it seems to be misleading.
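
As a rough illustration of why lang() does not serve the language-as-a-value 
case: it is a boolean test against the language in scope, matching a tag 
or any of its sub-tags, rather than a way to carry or retrieve a value. 
The Python sketch below approximates only that matching rule. The function 
name lang_matches is hypothetical, and the in-scope tag is assumed to have 
been worked out already by the application.

```python
def lang_matches(testlang, in_scope):
    """Approximate the matching rule of the XPath lang() function.

    True if in_scope equals testlang ignoring case, or is a sub-tag of it
    (testlang followed by "-" and more). An absent or empty in-scope
    language never matches.
    """
    if not in_scope:
        return False
    test = testlang.lower()
    scope = in_scope.lower()
    return scope == test or scope.startswith(test + "-")


print(lang_matches("en", "en-US"))  # True
print(lang_matches("en-US", "en"))  # False
print(lang_matches("en", None))     # False
```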

-- Felix

> On Fri, 19 Aug 2005 13:26:37 +0900, Addison Phillips 
> <addison.phillips@quest.com> wrote:

> No, it might be that the word "interpreted" is wrongly used here. The 
> actual quote from XML 1.0e3 is:
>
> --
> The intent declared with xml:lang is considered to apply to all 
> attributes and content of the element where it is specified, unless 
> overridden with an instance of xml:lang on another element within that 
> content. In particular, the empty value of xml:lang is used on an 
> element B to override a specification of xml:lang on an enclosing 
> element A, without specifying another language. Within B, it is 
> considered that there is no language information available, just as if 
> xml:lang had not been specified on B or any of its ancestors.
> --
>
> Read the meaning of the word "intent" in the first sentence above 
> carefully. It applies to the interpretation of the language tag itself, 
> not the application of that value to element contents. That value, it 
> says plainly, applies to all attributes and contents of the element, 
> *including* any contained elements. It is very clear that the 
> description in the FAQ is an accurate reflection of the above.
>
> Sandy Gao's analysis is also accurate, though: the Infosets spec, etc. 
> don't say anything about xml:lang. Apparently, while xml:lang should be 
> considered as a "normal" attribute from the point of view of a 
> processor, xml:lang's meaning is well established and it does indeed 
> have scope, which has snarky implications, which are imperfectly dealt 
> with. The I18N WG has commented on this to groups such as XQuery and so 
> forth in the past. For example, see:
>
> http://www.w3.org/International/2005/02/xq-xt-datamodel-review.html
>
> and also comment 7 in:
>
> http://lists.w3.org/Archives/Member/w3c-i18n-ig/2003Jul/0035.html
>
> Another way to interpret this is exactly as Sandy Gao states, which is 
> that a processor (capable of) interpreting xml:lang in some useful way 
> should apply it over the entire scope of the element, but that "normal" 
> XML processing is not affected.
>
> Indeed, this FAQ is to point out that xml:lang is precisely the wrong 
> vehicle for carrying language-as-a-value. It is metadata about content 
> that may be used to affect natural language processing and presentation.
>
> Best Regards,
>
> Addison
>
> Addison P. Phillips
> Globalization Architect, Quest Software
> Chair, W3C Internationalization Core Working Group
>
> Internationalization is not a feature.
> It is an architecture.
>
>> -----Original Message-----
>> From: www-international-request@w3.org [mailto:www-international-
>> request@w3.org] On Behalf Of Felix Sasaki
>> Sent: 18 August 2005 19:55
>> To: Uma Umamaheswaran; www-international@w3.org
>> Cc: ishida@w3.org; Sandy Gao; duerst@w3.org
>> Subject: Re: Fw: FYI .. New article for REVIEW: xml:lang in XML document
>> schemas
>>
>>
>> Hi Sandy, hi all,
>>
>>
>> >> This could be of interest to our XML schema folks ..
>> >
>> > Schema (and XML Infoset) currently have no special treatment for
>> > xml:lang: it's just a normal attribute and appears in the infoset in
>> > the same way as other attributes.
>>
>> I guess the critical part of the article is:
>>
>> "The xml:lang value applies to any sub-elements contained by the 
>> element.
>> It also applies to attribute values associated with the element and
>> sub-elements (though using natural language in attributes is not best
>> practice). "
>>
>> This could be changed to
>>
>> "The xml:lang value can be interpreted as applying to any sub-elements
>> contained by the element. It also can be interpreted as applying to
>> attribute values associated with the element and sub-elements (though
>> using natural language in attributes is not best practice). This
>> interpretation is not provided by any XML data model (xml infoset, xml
>> schema, XPath 2.0 data model), and it must be verified by additional
>> processing. A facility for such processing might be the lang() function
>> of XQuery/XSLT 2.0 [1], which uses the XPath expression
>> (ancestor-or-self::*/@xml:lang)[last()] to gather language values from
>> ancestor element nodes or the current element node."
>>
>> This is quite lengthy, but would that address your concern about the 
>> topic?
>>
>> Regards, Felix
>>
>> [1] http://www.w3.org/TR/2005/WD-xpath-functions-20050404/#func-lang
>>
>> >
>> > This article suggests that "xml:lang" should be used to specify the
>> > language in which the XML is written, while other language stuff
>> > should be used as part of the value being transmitted.
>> >
>> > If the world adopts this suggestion, then a bunch of things
>> > could/should happen (to treat xml:lang specially, similar to the
>> > treatment of namespace declarations):
>> > - Special treatment for xml:lang in infoset (some special property?)
>> > - Special validation rule for xml:lang in schema
>> > - Special treatment in data binding specs (to ignore xml:lang)
>> > - ...
>> >
>> > I don't think any of these will happen easily. For schema, we've been
>> > discussing treating all/some of the xml: attributes specially. We are
>> > currently leaning towards not doing that.
>> >
>> > This article also mentions "inheriting xml:lang", which is somewhat
>> > misleading. The XML spec says that xml:lang's "intent is considered
>> > to apply to the sub-tree". My reading of this is that an xml:lang-aware
>> > processor can use such info to do useful things. But that doesn't
>> > imply that the xml:lang attribute is inherited by sub-elements in XML.
>> > The infoset spec certainly has no mention of inheriting any
>> > xml:lang-related infoset property.
>> >
>> > Thanks,
>> > Sandy Gao
>> > XML Parser Development, IBM Canada
>> > (1-905) 413-3255
>> > sandygao@ca.ibm.com
>> >
>> >
>>
>>
>
>
Received on Sunday, 21 August 2005 14:00:11 GMT