- From: Martin Duerst <duerst@w3.org>
- Date: Sun, 04 Aug 2002 21:28:59 +0900
- To: Al Gilman <asgilman@iamdigex.net>, John Cowan <jcowan@reutershealth.com>
- Cc: w3c-xml-plenary@w3.org, w3c-i18n-ig@w3.org, xml-editor@w3.org, w3c-xml-core-wg@w3.org
At 15:05 02/08/03 -0400, Al Gilman wrote:
>At 11:45 AM 2002-08-03, John Cowan wrote:
> >Al Gilman scripsit:
> >
> >> Perhaps we have reached a point where we should ask the people who
> >> control the vocabulary wherein 'und' is an established entry.
> >
> >As you command, so shall it be. The (U.S.) Library of Congress, the
> >registration authority for ISO 639-2, spake thus:
> >
> ># The language code "und" is used "if the language associated with an
> ># item cannot be determined" or "for works having textual content
> ># consisting of arbitrary syllables, humming or other human-produced
> ># sounds for which a language cannot be specified."--from MARC Code
> ># List for Languages.
> >
> >So in order for "und" to apply we must have a language, or at least
> >human-produced sounds of some sort.
>
>I cannot concur with this interpretation of the quote.
>
>This quote is fine as far as it goes, which is to say that when the
>language cannot be determined, and a language label is required, the
>'und' label MUST be selected.
>
>But it doesn't resolve the issue.

Hello Al,

Please check out
http://lists.w3.org/Archives/Member/w3c-i18n-ig/2002Apr/0112.html
where it says:

>The MARC21 system which uses the same three-letter language codes as
>ISO-632 has a provision that a blank value is used "when the item has no
>sung, spoken, or written textual content." Examples given include
>instrumental music or data files consisting of machine languages. The
>language code "und" is used "if the language associated with an item
>cannot be determined" or "for works having textual content consisting of
>arbitrary syllables, humming or other human-produced sounds for which a
>language cannot be specified."--from MARC Code List for Languages.
>
>Milicent Wewerka, Library of Congress

> ># instrumental or electronic music; sound recordings consisting of
> ># nonverbal sounds; audiovisual materials with no narration, printed titles,
> ># or subtitles; machine-readable data files consisting of machine languages
> ># or character codes.
>
>The purpose of that quote would appear to be to keep people from requesting
>token assignments for machine languages. Not to keep people from applying
>an 'und' mark to unknown situations, where the range of possibilities includes
>machine language.

The phrase 'has a provision that a blank value is used' clearly shows
that an empty value is used in practice.

>In this case reading the historical document is not a full replacement for
>asking the current maintainers of the vocabulary.

I have asked, and they have replied.

Anyway, the purpose of this discussion is not to determine the use of
xml:lang="und" and xml:lang="", for which the XML Core WG, based
on the recommendations of the I18N WG, has already made a decision.

The question posed here is: Should this change be an erratum to
XML 1.0, or part of XML 1.1? My personal answer is that I would
prefer an erratum.

Regards, Martin.
Received on Sunday, 4 August 2002 08:56:15 UTC