
Re: XML Core WG needs input on xml:lang=""

From: Al Gilman <asgilman@iamdigex.net>
Date: Fri, 02 Aug 2002 11:02:58 -0400
Message-Id: <>
To: John Cowan <jcowan@reutershealth.com>
Cc: jcowan@reutershealth.com (John Cowan), w3c-xml-plenary@w3.org, w3c-i18n-ig@w3.org, xml-editor@w3.org, w3c-xml-core-wg@w3.org

At 10:28 AM 2002-08-02, John Cowan wrote:
>It has the same semantics as not using an xml:lang tag at all.

Let me see if I understand.

The intended sense of xml:lang="" is to clear any hereditary value for
this property and leave the scope in question with "no statement" as to
language.  That is IIRC the same semantics as 'nil' in XQuery.
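To make that reading concrete, here is a minimal sketch of how a processor might resolve the language in effect for an element, assuming the interpretation above: the nearest xml:lang on the element or an ancestor wins, and an empty value means "no statement".  The function name effective_lang and the sample document are made up for illustration and come from no specification.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_lang(root, target):
    """Return the language in effect at `target`, or None for "no statement".

    Assumption (the reading discussed in this thread): xml:lang="" clears
    whatever value would otherwise be inherited from an ancestor.
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        value = node.get(XML_LANG)
        if value is not None:
            # An empty string stops the search and reports "no statement".
            return value or None
        node = parents.get(node)
    return None


doc = ET.fromstring(
    '<doc xml:lang="en"><p>Some prose.</p>'
    '<code xml:lang="">x := y</code></doc>'
)
p, code = doc.find("p"), doc.find("code")
print(effective_lang(doc, p))     # inherited from <doc>
print(effective_lang(doc, code))  # cleared by xml:lang=""
```

Under this sketch the <p> element inherits "en" while the <code> element ends up with no language at all, which is the same result it would get in a document that never used xml:lang.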

>In any case, "und" is a side issue.

I don't yet understand this.  If a code fragment not to be interpreted
as English is embedded in a scope where xml:lang="en", and the code fragment
is wrapped in an element where the attribute xml:lang="und" is set, I don't
see why we need to define another value to achieve the desired effect.

What is the use case for the distinction between setting

xml:lang="und"  -- as one already can, and
xml:lang=""     -- the proposed innovation


If, as you claim, xml:lang="und" applies in the case where the language is 
unknown, i.e. 'nil' semantics, then we don't necessarily need an 
indication different from this in order to escape from an 
enclosing/hereditary natural language property.
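A small sketch of the point being argued here, under the assumption that a processor only needs to know whether to keep applying the natural-language rules of the enclosing scope.  The function name follows_language_rules and the policy it encodes are hypothetical; they illustrate the argument and are not behaviour defined by XML, RFC 3066, or ISO 639-2.

```python
def follows_language_rules(effective_lang):
    """Decide whether natural-language processing applies to a span.

    Hypothetical policy: no statement (None), the empty string, and "und"
    all mean the processor should not apply any specific language's rules.
    """
    return effective_lang not in (None, "", "und")


# For the purpose of escaping an enclosing xml:lang="en" scope,
# "und" and "" lead the processor to the same decision.
for value in ("en", "und", ""):
    print(repr(value), follows_language_rules(value))
```

If the two values do differ, the difference would have to show up somewhere other than in this yes/no decision, which is the use case being asked for.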


>Al Gilman scripsit:
>> This assertion is fatuous.  Un-enforceably vague.
>Note that I corrected this paragraph in a follow-up posting.
>> The 'und' mark at least is well posed, if it means "one of the defined
>> language labels applies, but we don't know which."  This is a union type.
>No, it may also mean that no existing tag applies because the language is
>not known.  For example, the writing system Linear A records an unknown
>language.  Similarly, the language of an audio recording may not be known
>for a variety of reasons.
>> Distinguishing between 
>> a) a natural language for which there is no label registered
>> b) "not a natural language"
>> has no portable definition among different agents applying 'lang' attribute
>> values, and hence should not be presumed known by these agents.
>In any case, "und" is a side issue.
>> However, for practical purposes a 'nil' on 'lang' inside a natural-language
>> context will be sufficient to disabuse the processor of following the rules
>> of the natural language in the enclosing scope.
>The code "nil" is not currently assigned, but it is within the scope of
>the ISO 639-2 registration authority to assign it, so it cannot be
>used.  The code "" cannot be assigned by ISO 639-2.
>> Process question --
>> who defines the 'und' token?  Is this a meta-value defined in the IETF RFC,
>> or is this an invention of XSD Types or of XML?
>The ISO 639-2 registration authority, which underlies all the others
>you mention.
>> Introducing the suggested sense for the null string would appear to be a bad
>> idea on the grounds that the sense bound to this sign is ill-posed, not
>> interoperable.  So don't go there.
>It has the same semantics as not using an xml:lang tag at all.
>John Cowan                              <jcowan@reutershealth.com>
>http://www.reutershealth.com            http://www.ccil.org/~cowan
>                .e'osai ko sarji la lojban.
>                Please support Lojban!          http://www.lojban.org 
Received on Friday, 2 August 2002 11:03:04 UTC
