difference in semantics of lang attribute between XML and HTML

While reviewing the WS i18n usage scenario document, I
noticed that there is a semantic difference between the XML and
HTML language specs.

In Section 8.1 of the HTML 4.01 spec at
there is an example that reads:
	<P><Q lang="en">Her super-powers were the result of
	&gamma;-radiation,</Q> he explained.</P>

<HTML lang="fr">
...Interpreted as French...
<P lang="es">...Interpreted as Spanish...
<P>...Interpreted as French again...
<P>...French text interrupted by<EM lang="ja">some
         Japanese</EM>French begins here again...

From these examples and the accompanying explanation, the lang
attribute merely helps the client system do a better job by
giving it a "hint".  Elements with different lang attributes
are *not* mutually exclusive.  All of them are
rendered.  This HTML fragment:

	<p lang="en-GB">What colour is it?</p>
	<p lang="en-US">What color is it?</p>

will result in both sentences appearing on screen.

But in XML, the lang attribute provides mutually
exclusive choice semantics, according to the description
and examples given in Section 2.12 of XML 1.0:
The screen would be empty if the above fragment
(substituting xml:lang for lang) were interpreted by an XML
parser running in a non-English locale (or even in
the en-CA locale!).
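
As an illustration only, the two readings can be sketched in Python.
The names render_all and select_exact, the list-of-pairs model of the
fragment, and the exact-match rule are all assumptions made for this
sketch; none of them is taken from the HTML or XML specs.

```python
# Hypothetical model of the two-paragraph fragment: (language tag, text) pairs.
FRAGMENT = [
    ("en-GB", "What colour is it?"),
    ("en-US", "What color is it?"),
]


def render_all(fragment):
    """Hint-style reading: every element is rendered, whatever its tag."""
    return [text for _tag, text in fragment]


def select_exact(fragment, locale):
    """Choice-style reading (assumed): keep only exact, case-insensitive matches."""
    wanted = locale.lower()
    return [text for tag, text in fragment if tag.lower() == wanted]


if __name__ == "__main__":
    print(render_all(FRAGMENT))             # both sentences
    print(select_exact(FRAGMENT, "en-US"))  # only the en-US sentence
    print(select_exact(FRAGMENT, "en-CA"))  # empty list: nothing is selected
```

Under the assumed exact-match rule, an en-CA reader gets nothing even
though both alternatives are English.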

I wonder whether this deviation in semantics is a conscious
decision made for XML.

I also wonder if there should be a mechanism to
provide a fallback lang tag so that at least one
element is selected when none of the
alternative elements is chosen because of a
mismatch with the lang attribute.  Has this been
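
One possible shape for such a fallback, sketched in Python purely as
an illustration: the function select_with_fallback, its fallback_tag
parameter, and the exact-match comparison are hypothetical and do not
correspond to any existing XML mechanism.

```python
def select_with_fallback(alternatives, locale, fallback_tag):
    """Return the texts tagged with `locale`; if none match, use `fallback_tag`.

    `alternatives` is a list of (language tag, text) pairs. The exact,
    case-insensitive comparison and the `fallback_tag` parameter are
    assumptions made for this illustration.
    """
    def matching(tag):
        wanted = tag.lower()
        return [text for t, text in alternatives if t.lower() == wanted]

    chosen = matching(locale)
    if chosen:
        return chosen
    # Nothing matched the reader's locale, so fall back to the designated tag.
    return matching(fallback_tag)


if __name__ == "__main__":
    alternatives = [
        ("en-GB", "What colour is it?"),
        ("en-US", "What color is it?"),
    ]
    # No en-CA alternative exists, so the en-US one is used instead.
    print(select_with_fallback(alternatives, "en-CA", "en-US"))
```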

T. "Kuro" Kurosaka, Internationalization Architect
IONA Technologies, Santa Clara, CA USA / +1 408 350-9684 

Received on Thursday, 2 January 2003 15:44:33 UTC