Re: XML Core WG needs input on xml:lang=""

Al Gilman scripsit:

> Perhaps we have reached a point where we should ask the people who 
> control the vocabulary wherein 'und' is an established entry.

As you command, so shall it be.  The (U.S.) Library of Congress, the
registration authority for ISO 639-2, spake thus:

# The language code "und" is used "if the language associated with an
# item cannot be determined" or "for works having textual content
# consisting of arbitrary syllables, humming or other human-produced
# sounds for which a language cannot be specified."--from MARC Code
# List for Languages.

So in order for "und" to apply we must have a language, or at least
human-produced sounds of some sort.  C is out.  (The MARC list is the
direct ancestor of the ISO 639-2 list.)

> It is not yet clear to me that it is legitimate to distinguish
> between the knowledge states after observing a) no xml:lang attribute
> or b) an xml:lang="und" attribute.  The XML markup usually only tells
> us what it is that the markup tells us.  'und' is perhaps like "this space
> intentionally left blank."  It tells us explicitly that it is telling
> us nothing, or so it would seem.

No, it tells us that we have here language or paralanguage, but not
(to quote the MARC list again for the kinds of things to which language
codes are not applicable):

# instrumental or electronic music; sound recordings consisting of
# nonverbal sounds; audiovisual materials with no narration, printed titles,
# or subtitles; machine-readable data files consisting of machine languages
# or character codes.
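
To make the distinction concrete, here is a minimal sketch of how a
processor might keep the knowledge states apart.  The function names and
state labels are hypothetical, and treating xml:lang="" as a separate
"empty" state is only an assumption for illustration, since its meaning
is the very thing under discussion.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def classify(lang):
    """Map an effective xml:lang value to a (hypothetical) knowledge state."""
    if lang is None:
        return "unstated"        # no xml:lang in scope: the markup says nothing
    if lang == "":
        return "empty"           # xml:lang="": meaning assumed, not settled
    if lang.lower() == "und":
        return "undetermined"    # language or paralanguage, but not identified
    return "language:" + lang

def effective_langs(elem, inherited=None):
    """Yield (tag, state) pairs, inheriting xml:lang from ancestors."""
    lang = elem.get(XML_LANG, inherited)
    yield elem.tag, classify(lang)
    for child in elem:
        yield from effective_langs(child, lang)

doc = ET.fromstring(
    '<doc>'
    '<a xml:lang="en">text<b/></a>'
    '<c xml:lang="und">la la la</c>'
    '<d xml:lang=""/>'
    '</doc>'
)
states = dict(effective_langs(doc))
```

Under these assumptions <doc> comes out as "unstated" and <c> as
"undetermined", so the two cases are not the same knowledge state: the
second asserts that there is language-like content, the first asserts
nothing at all.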

-- 
John Cowan                                <jcowan@reutershealth.com>     
http://www.reutershealth.com              http://www.ccil.org/~cowan
Yakka foob mog.  Grug pubbawup zink wattoom gazork.  Chumble spuzz.
    -- Calvin, giving Newton's First Law "in his own words"

Received on Saturday, 3 August 2002 11:48:21 UTC