- From: Michele Diodati <michele.diodati@gmail.com>
- Date: Wed, 29 Dec 2004 00:09:07 +0100
- To: w3c-wai-gl@w3.org
On Mon, 27 Dec 2004 17:55:44 -0600, John M Slatin <john_slatin@austin.utexas.edu> wrote:
>
> Many documents contain words, phrases, or longer passages that are
> in a different language than the language of the document as a whole.
> The language of each "foreign" word, phrase, or longer passage must
> be identified so that user agents, including assistive technology, can
> present the text appropriately.

I think this requirement is largely inapplicable, for many reasons.

1. Web developers are not linguists.

2. Web developers very often aren't the authors of the text in the web pages they publish; they simply put together content received from different sources.

3. Web developers may have no idea which natural language a run of text, or even a single word, belongs to.

4. Natural languages are very complex phenomena: they are by no means reducible to a set of mathematical equations in which you can always say, surely and perfectly, "this is only English", "this is only Italian", "this is only Hebrew", etc.

5. Even when you can specify unmistakably the natural language of a run of text, doing so could make accessibility even worse.

Here are some examples, taken from the Italian language, of that complexity and of the ambiguous consequences of Guideline 3.1 L2 SC.

a. The note at the bottom of Guideline 3.1 L2 SC3 currently says: "This does not include use of foreign words in text where such usage is a standard extension of the language". Italian vocabularies and dictionaries include many widely used foreign words. "File", taken from English, is one of them. According to the above note, Italian web developers should not mark the word "file" as an English word, because it is a standard extension of the Italian language (if I understand what "a standard extension" means). However, "file" (=document) is a homograph of "file" (=rows, lines).
The latter is pronounced according to the classic Italian phonetic rules, while the former is pronounced according to rules that try to imitate the English pronunciation of the word "file". How can I get a speech synthesizer to pronounce this word correctly, given that it is "a standard extension" of the Italian language (a speech synthesizer will pronounce "file" according to the traditional phonetic rules)?

b. Every natural language has phonetic rules of its own. Many words and phrases written in English (or in French) can be understood by Italian listeners _only if_ they are pronounced according to phonetic rules that differ from traditional Italian phonetic rules, but that are often also _very different_ from English phonetic rules. It is in effect a third language, different from both Italian and English. If we mark these words and short sentences as foreign text, i.e. as English, a compliant speech synthesizer will pronounce them in a way that an Italian listener will very likely be unable to understand. I think only assistive technologies can improve accessibility in situations like these. The requirement in Guideline 3.1 L2 SC3 could be a remedy worse than the disease.

Another issue, and not a secondary one: the great majority of Italian authors and web developers are totally unaware of the difference between the actual foreign pronunciation of many foreign words used in Italian and the adapted pronunciation of the same words used by native Italian speakers. When they use foreign words in their web pages (which happens very often in technical writing), they are really thinking of the adapted, Italian pronunciation of those words.

c. Some proper nouns are adapted transcriptions into Latin characters from foreign alphabets. These nouns become meaningful to an Italian listener only if they are pronounced according to phonetic rules different from typical Italian phonetic rules.
For example, "Sharon" and "Shimon Peres", names of very famous Israeli politicians, contain the group "sh", a phoneme not used in Italian. How should a web developer mark those names? They are clearly not Italian words. But they are not Hebrew words either. They are rather adaptations of foreign words into Latin characters: a strange mixture, for which it isn't clear whether or not we need a rule to improve their accessibility. My opinion is that we don't need such a rule. Assistive technologies have to incorporate, for each natural language, ever larger dictionaries of adapted foreign pronunciations: this is the only way in which isolated foreign words can be made understandable.

In conclusion, I think the Level 2 Success Criteria for Guideline 3.1 should be removed, at least as far as pronunciation is concerned. It is inconceivable that Web developers can manage such an ambiguous and complex matter. It would be much more useful and pragmatic if the task of managing changes in the natural language of the content were delegated entirely to assistive technologies.

Hoping this can help.

Michele Diodati
----------------------------------
http://www.diodati.org
----------------------------------
Received on Tuesday, 28 December 2004 23:09:39 UTC