
Re: XML Core WG needs input on xml:lang=""

From: Al Gilman <asgilman@iamdigex.net>
Date: Sat, 03 Aug 2002 10:43:22 -0400
To: John Cowan <jcowan@reutershealth.com>
Cc: jcowan@reutershealth.com (John Cowan), w3c-xml-plenary@w3.org, w3c-i18n-ig@w3.org, xml-editor@w3.org, w3c-xml-core-wg@w3.org

At 12:11 PM 2002-08-02, John Cowan wrote:

>"Und" makes a statement
>that we are ignorant, but "" makes no statement at all.

Perhaps we have reached a point where we should ask the people who 
control the vocabulary wherein 'und' is an established entry.

It is not yet clear to me that it is legitimate to distinguish
between the knowledge states after observing a) no xml:lang attribute
or b) an xml:lang="und" attribute.  The XML markup usually only tells
us what it is that the markup tells us.  'und' is perhaps like "this space
intentionally left blank."  It tells us explicitly that it is telling
us nothing, or so it would seem.

On the other hand, the argument that Chris raises, that it is easier
to program the comparison if "no statement" is denoted by a null string
rather than by a reserved special string, is interesting.

On the other hand again, if the ISO vocabulary makes 'und' synonymous
with "no statement," if indeed 'und' is_a 'nil,' we are not honoring 
their semantics by failing to provide special case handling for 'und' 
in the comparison program.
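
To make the trade-off concrete, here is a minimal Python sketch of the two
conventions. All function names are hypothetical, the prefix-matching rule
is a simplification, and treating 'und' as equivalent to "no statement" is
exactly the open question above, not something the ISO vocabulary is known
to say.

```python
def matches(tag: str, wanted: str) -> bool:
    """Simplified prefix match: 'en-US' matches 'en', 'english' does not."""
    tag, wanted = tag.lower(), wanted.lower()
    return tag == wanted or tag.startswith(wanted + "-")


def matches_with_empty_as_nil(tag: str, wanted: str) -> bool:
    """Convention A: the null string means "no statement" and matches nothing."""
    if not tag:  # generic emptiness test, no reserved word needed
        return False
    return matches(tag, wanted)


def matches_with_und_as_nil(tag: str, wanted: str) -> bool:
    """Convention B (assumption): 'und' is_a 'nil', so it needs a special case."""
    if tag.lower() == "und":  # reserved special string
        return False
    return matches(tag, wanted)
```

Under convention A the "nil" test is an ordinary emptiness check; under
convention B the comparison program has to know about one reserved value.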

Rick's point about taking a systematic approach to scoping and heredity 
is well taken.

A small version of this argument is that "Unsetting attributes is a generic
requirement for attributes that set hereditary properties."
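
As an illustration only, the following Python sketch resolves the language
in effect for an element by walking up its ancestors, and treats
xml:lang="" as "unset". The helper name effective_lang and the sample
document are hypothetical; the clearing behaviour is the proposal under
discussion, not settled XML semantics.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# ElementTree exposes the predefined xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_lang(root: ET.Element, target: ET.Element) -> Optional[str]:
    """Return the nearest xml:lang value in scope, or None for "no statement"."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node: Optional[ET.Element] = target
    while node is not None:
        value = node.get(XML_LANG)
        if value is not None:
            # Assumption: "" clears the inherited value instead of naming a language.
            return value or None
        node = parents.get(node)
    return None


doc = ET.fromstring(
    '<p xml:lang="en">text <span xml:lang="">boilerplate <b>inner</b></span>'
    " <i>more</i></p>"
)
inner = doc.find("span/b")   # inside the cleared scope
sibling = doc.find("i")      # still inherits from <p>
```

Here effective_lang(doc, inner) yields None while effective_lang(doc, sibling)
yields "en". Note that this sketch reports an absent attribute and a cleared
one identically at the top of the tree, which is only acceptable because ""
is not a meaningful language value.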

A null string is not a generic solution for a nil meta-value for attributes.
It cannot be used in this way for string attributes where the "" indication 
of a zero-length string is a legal string value.  Note that in the case of 
the string attribute html:img.alt an absent attribute and a null string 
attribute value are two distinct cases to be handled differently sometimes.  
See "2.8 No repair text" in



>Al Gilman scripsit:
>> The intended sense of xml:lang="" is to clear any hereditary value for
>> this property and leave the scope in question with "no statement" as to
>> language.  That is IIRC the same semantics as 'nil' in XQuery.
>> >In any case, "und" is a side issue.
>> I don't yet understand this.  If a code fragment not to be interpreted
>> as English is embedded in a scope where xml:lang="en" and the code fragment
>> is wrapped in an entity where the attribute xml:lang="und" is set I don't 
>> see why we need to define another value to achieve the desired effect.
>> What is the use case for the distinction between setting
>> xml:lang="und"  -- as one already can, and
>> xml:lang=""     -- the proposed innovation
><p xml:lang="en">Here's some text contrasting the "und" and ""
>language tags:
>  <span xml:lang="und">Yakka foob mog.  Grug pubbawup zink wattoom gazork.
>    Chumble spuzz.</span>, which is Calvin's version of Newton's First Law, and
>   <span xml:lang=""><xi:include href="http://example.com/boilerplate"/></span>,
>   which is boilerplate text we are including.  If the boilerplate lacks a
>   language tag, we don't want to force it to be English.</p>
>> If, as you claim, xml:lang="und" applies in the case where the language is 
>> unknown, i.e. 'nil' semantics, then we don't necessarily need an 
>> indication different from this in order to escape from an 
>> enclosing/hereditary natural language property.
>"Und" stands for "undetermined", which may mean that the language is
>completely unidentified (e.g. the text is in Linear A), or that some
>process has not as yet determined what the language is (a book that
>nobody has read yet, e.g.).  "", on the other hand, simply means
>that we aren't providing language information, either because it doesn't
>exist (the text is not in a natural language), or because we don't want
>to impose a default when we don't know if it's right, or because our source
>provided no information, or for any other reason.  "Und" makes a statement
>that we are ignorant, but "" makes no statement at all.
>There is / One art                      John Cowan <jcowan@reutershealth.com>
>No more / No less                       http://www.reutershealth.com
>To do / All things                      http://www.ccil.org/~cowan
>With art- / Lessness                     -- Piet Hein
Received on Saturday, 3 August 2002 10:43:27 UTC
