Re: XML Core WG needs input on xml:lang=""

From: John Cowan <jcowan@reutershealth.com>
Date: Fri, 2 Aug 2002 12:11:41 -0400 (EDT)
Message-Id: <200208021624.MAA08472@mail2.reutershealth.com>
To: asgilman@iamdigex.net (Al Gilman)
Cc: jcowan@reutershealth.com (John Cowan), w3c-xml-plenary@w3.org, w3c-i18n-ig@w3.org, xml-editor@w3.org, w3c-xml-core-wg@w3.org

Al Gilman scripsit:

> The intended sense of xml:lang="" is to clear any hereditary value for
> this property and leave the scope in question with "no statement" as to
> language.  That is IIRC the same semantics as 'nil' in XQuery.

Correct.

> >In any case, "und" is a side issue.
> 
> I don't yet understand this.  If a code fragment not to be interpreted
> as English is embedded in a scope where xml:lang="en" and the code fragment
> is wrapped in an entity where the attribute xml:lang="und" is set I don't 
> see why we need to define another value to achieve the desired effect.
> 
> What is the use case for the distinction between setting
> 
> xml:lang="und"  -- as one already can, and
> xml:lang=""     -- the proposed innovation

<p xml:lang="en">Here's some text contrasting the "und" and ""
language tags:
  <span xml:lang="und">Yakka foob mog.  Grug pubbawup zink wattoom gazork.
    Chumble spuzz.</span>, which is Calvin's version of Newton's First Law, and
   <span xml:lang=""><xi:include href="http://example.com/boilerplate"/></span>,
   which is boilerplate text we are including.  If the boilerplate lacks a
   language tag, we don't want to force it to be English.</p>

> If, as you claim, xml:lang="und" applies in the case where the language is 
> unknown, i.e. 'nil' semantics, then we don't necessarily need an 
> indication different from this in order to escape from an 
> enclosing/hereditary natural language property.

"Und" stands for "undetermined", which may mean that the language is
completely unidentified (e.g. the text is in Linear A), or that some
process has not as yet determined what the language is (e.g. a book that
nobody has read yet).  "", on the other hand, simply means
that we aren't providing language information, either because it doesn't
exist (the text is not in a natural language), or because we don't want
to impose a default when we don't know if it's right, or because our source
provided no information, or for any other reason.  "Und" makes a statement
that we are ignorant, but "" makes no statement at all.
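
To make the distinction concrete, here is a minimal sketch of how a processor
might compute the effective language of each element. It assumes that
xml:lang="" clears the inherited value, as described above, and that "no
statement" is modelled as None. The function name effective_langs and the
sample document are illustrative only, not part of any specification.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined "xml" prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_langs(elem, inherited=None, out=None):
    """Return (tag, effective language) pairs in document order.

    None means "no statement about language" (hypothetical convention).
    """
    if out is None:
        out = []
    declared = elem.get(XML_LANG)
    if declared is None:
        lang = inherited      # no attribute: inherit from the parent
    elif declared == "":
        lang = None           # xml:lang="": clear the inherited value
    else:
        lang = declared       # any other tag, "und" included, is a statement
    out.append((elem.tag, lang))
    for child in elem:
        effective_langs(child, lang, out)
    return out


doc = ET.fromstring(
    '<p xml:lang="en">Some English text, '
    '<span xml:lang="und">Yakka foob mog.</span> '
    '<span xml:lang=""><q>included boilerplate</q></span> '
    '<em>more English</em></p>'
)

for tag, lang in effective_langs(doc):
    print(tag, repr(lang))
```

Under these assumptions the "und" span carries an explicit statement of
ignorance, while the "" span and everything included beneath it carry no
language statement at all, rather than being forced to "en".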

-- 
There is / One art                      John Cowan <jcowan@reutershealth.com>
No more / No less                       http://www.reutershealth.com
To do / All things                      http://www.ccil.org/~cowan
With art- / Lessness                     -- Piet Hein
Received on Friday, 2 August 2002 12:14:28 GMT
