- From: Martin J. Duerst <duerst@w3.org>
- Date: Wed, 22 Dec 1999 13:00:03 +0900
- To: Joseph Reagle <reagle@alum.mit.edu>
- Cc: Michel Bazieu <michel.bazieu@CNEN.DE.EdF.Fr>, www-international@w3.org
Forwarded by the list maintainer.

At 17:45 1999/12/21 -0500, Joseph Reagle wrote:
> [I've cc'd this to www-international@w3.org since people on that list may
> be better able to address XML schema internationalization issues that are
> at the bottom of this message.]
>
> At 19:38 99/12/20 +0100, Michel Bazieu wrote:
> >Hello,
> >I read your note with great interest, as the use of
> >XML/XSL/RDF/DOM/etc.. for web applications is a subject that puzzles
> >many of my clients.(I'm a web developer for dynamic content sites)
>
> Hi Michel, thanks for the comments!
>
> >The title of your note makes reference to the Eskimo language (though
> >Inuit is the polite term)
>
> I've been in conversations about this before! <smile> The use of the term
> "Eskimo" can be a bit contentious, but in writing my paper I am quite
> confident that my diction is correct and non-offensive. In linguistics, the
> term Eskimo is used to refer to a family of languages, of which Inuit is
> one, "The Eskimo languages proper can be divided into two branches, Yupik
> and Inuit or Inupiaq." [1] The issue of what is considered offensive is
> briefly explained at [2], but there is more extensive (and interesting)
> discussion on the use of these words and "Eskimo Snow" in general at [3].
> My use of the terms is in the context of the term as seminally used in
> Geoff Pullum's "The Great Eskimo Vocabulary Hoax."
>
> [1] http://www.alaskool.org/language/inupiaqhb/Inupiaq_Handbook.htm#The Eskimo-Aleut language family
> [2] http://www.uaf.edu/anlc/inuitoreskimo.html
> [3] http://linguistlist.org/issues/5/5-1239.html
>
> >but I have the feeling that when you write
> >about natural language, it means *english* language.
>
> I unfortunately only speak English. However, I was careful to use 'natural
> language' as defined in the technical sense:
>
>    natural language
>    <application> A language spoken or written by humans, as
>    opposed to a language used to program or communicate with
>    computers.
>    Natural language understanding is one of the hardest
>    problems of artificial intelligence due to the complexity,
>    irregularity and diversity of human language and the philosophical
>    problems of meaning.
>    http://www.nightflight.com/cgi-bin/foldoc.cgi?query=natural+language
>
> >It seems to me that the uniformization not only of technical syntax
> >(desirable!) but also of semantics through the publication of consensus
> >(translate: english) vocabularies is a potentially dangerous step.
> >Even more so as it seems that these vocabularies will be created and
> >controlled (like the w3c) solely by american corporations.
> >Will the non-english sites be able to publish their content in their
> >language thru the use of XML tags with any chance of being understood by
> >english-speaking users (with the help of some concept translation
> >device) or is XML/etc.. the utmost in unfair imperial business practice?
>
> Your points are well taken. However, I believe you can capture semantics in
> alternative schema definitions. I'm not an expert in internationalization
> issues, but the W3C works very actively in this domain (and I've cc'd
> Martin who is our point of contact on this topic and may have some
> thoughts.) Content negotiation capabilities are part of HTTP, though they
> are used infrequently I suspect. This sort of capability was also supported
> by PICS:
>
>    be available in multiple languages, either through an existing
>    negotiation mechanism or through links to alternate language versions; ...
>    Unlike the name and description strings, transmission names are
>    language-independent. That is, if a rating system is offered in several
>    languages, the transmission names must be the same in all of them.
>    http://www.w3.org/TR/REC-PICS-services
>
> In XML, one can use the xml:lang attribute [4] to present alternative
> natural language declarations of an element's content in an XML instance.
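
As a rough illustration of the xml:lang point above, here is a minimal Python sketch of how a consumer might choose among alternative-language versions of an element's content. The <rating> and <description> element names, the sample strings, and the pick_description helper are all hypothetical, invented for this example; only the xml:lang attribute and the XML namespace URI come from the XML Recommendation.

```python
import xml.etree.ElementTree as ET

# The "xml" prefix is predefined and always bound to this namespace URI.
XML_NS = "http://www.w3.org/XML/1998/namespace"
LANG_ATTR = f"{{{XML_NS}}}lang"

# Hypothetical instance: one element type, content offered in two languages.
DOC = """<rating>
  <description xml:lang="en">Suitable for all ages</description>
  <description xml:lang="fr">Convient à tous les âges</description>
</rating>"""


def pick_description(xml_text, preferred, fallback="en"):
    """Return the <description> text tagged with the preferred language,
    or the fallback language's text if the preferred one is absent."""
    root = ET.fromstring(xml_text)
    by_lang = {el.get(LANG_ATTR): el.text for el in root.iter("description")}
    return by_lang.get(preferred, by_lang.get(fallback))


print(pick_description(DOC, "fr"))
print(pick_description(DOC, "de"))
```

Only the element content varies by language in this sketch; the element names themselves stay fixed, which is the limitation the message goes on to raise.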
> However, I am unsure of how (or if others think it beneficial) to have
> alternative language schemas and element types. (Such that a Chinese
> author won't have to learn what <meta> means.) One could have the
> alternative schema and use XSLT to translate back and forth I suppose.
> Perhaps others on the www-international@w3.org can answer this better
> than me.
>
> [4] http://www.w3.org/TR/REC-xml#sec-lang-tag
> ___
> Sincerely,  http://goatee.net
> NrrrdBoy    "They shall not make baldness upon their head, neither
>             shall they shave off the corner of their beard" - Leviticus 21:5.
>
>

#-#-# Martin J. Du"rst, World Wide Web Consortium
#-#-# mailto:duerst@w3.org   http://www.w3.org
Received on Tuesday, 21 December 1999 22:58:45 UTC