- From: Michael Cooper <michaelc@watchfire.com>
- Date: Fri, 5 Sep 2003 10:37:02 -0400
- To: w3c-wai-gl@w3.org
First to respond to the question about why identifying the primary language of the document has less emphasis than identifying changes: I've wondered about that myself a lot. On a Thursday call sometime in the past couple months the explanation was given that, if your screenreader doesn't pick up the primary language it will just start reading gibberish, and you'll know something is wrong and do something about it, like leave the page or change your language settings; but if most of the page is reading correctly and suddenly you encounter a gibberish phrase, you know you're missing content on a page that was otherwise accessible to you. However, on a Wednesday call a about a month ago, Richard Ishida of the Internationalization group said he considered identifying the primary language equally important. He was going to send a review note about this so it will be put on the agenda for discussion at some point. As far as the prevalence of non-Dutch (e.g., English) words in Dutch, like "website", here is an opinion which may or may not be considered accurate by people with linguistic backgrounds, but it's something to throw out there. I would propose that the incorporation of words like "website" into Dutch is simply an example of the borrowing of words between languages that happens all the time. I've studied French, German, and Latin and been amazed by how many words from those languages are found in English - sometimes retaining their original meaning, sometimes with a related but different meaning. Yet we consider them English words now, because they've been part of the language long enough that English speakers recognize them. The only real difference with your example is recency - words like "website" were imported during the lifetime of current speakers, so they may view them as foreign words - but the meaning of them is understood all the same. I would say that for all practical purposes those words can be considered part of the Dutch language, since in a few generations people will no longer see them as non-Dutch anyway. Under this logic, such words would be exempt from the requirement to be marked up specially, since they are not 'foreign' words in the sense that you have to have an understanding of the foreign language to understand the phrase. The question of pronunciation still exists and may nullify my argument. I don't know how "website" is pronounced in Dutch - is it pronounced the way it is in English, or is it pronounced the way a screenreader might applying Dutch pronunciation rules? If the screenreader pronounces it one way but the common pronunciation is different, there might still be a problem. Yet I would say first that there are plenty of words screenreaders mispronounce, because of homonyms, regional accents, etc., and people deal with it. Second, if by my logic above the word is effectively part of the Dutch language, the screenreader ideally should come already designed with the ability to pronounce it correctly, if only via an exceptions dictionary. This is similar to the argument that only words that are not in an unabridged dictionary for the language fall under this rule. The dictionary authors may not have caught up to common practice, but it's common practice that is important for people's ability to access the content. Of course that makes testability more difficult... Michael > -----Original Message----- > From: Y.P. Hoitink [mailto:y.p.hoitink@heritas.nl] > Sent: Thursday, September 04, 2003 6:22 PM > To: w3c-wai-gl@w3.org > Subject: Checkpoint 3.1 - Identifying language > > > > Hello everyone, > > I have been going through the internal working draft of the WCAG 2.0 > guidelines <http://www.w3.org/WAI/GL/WCAG20/> and came across core > checkpoint 3.1 (Language of content can be programmatically > determined). > > The success criteria for checkpoint 3.1 reads: > "passages or fragments of text occurring within the content > that are written > in a language other than the primary natural language of the > content as a > whole, are identified, including specification of the language of the > passage or fragment. " > > Meeting this success criteria is a heavy burden for people writing in > languages which use a lot of foreign words. For example, my > native language > Dutch uses a lot of English words and phrases , like > "website", "software > engineering", "content management", "on the job training" > etc. On a typical > Dutch page, especially on a technical or management subject, several > percents of the words can easily be foreign. To meet this > success criteria, > all of these would have to be labelled with their original language. > > This isn't a problem in English so much, since there aren't > so many foreign > words. I guess this is why we hear the "je ne sais quoi" > example all the > time :-) But we're writing these guidelines for the entire > internet, not > just the English speaking community. > > I understand that labelling foreign phrases with their language would > increase accessibility. If English words are spoken out loud > with Dutch > pronounciation by speech synthesizers, they are hard to > understand because > the words sound quite different. > > But it is quite a lot to ask to label everything for people writing in > languages which use a lot of foreign words. Also, even though they are > harder to understand, the phrases can still be heard so it's > not _totally_ > inaccessible. I don't think this issue should be on the same level of > requirements as, for example, the need for text equivalents > for non-text > content. > > What I'm trying to say is this: Shouldn't this be a best > practice instead of > a required success criteria? > > On the other hand, I was surprised to find that the > identification of the > _main_ language of the document is only a best practice, and > not a success > criteria. So we are requiring to label every foreign phrase, > but we don't > even know what the main language is? Couldn't this lead to > the unwanted > situation that 99% of the document is read with the wrong > pronounciation, > except for the few foreign words whose language _was_ identified? > > I have been trying to find past discussions about this in the > mail archives > but haven't seen these issues. If I missed them, I would > appreciate it if > someone could send me the links. > > Yvette Hoitink > >
Received on Friday, 5 September 2003 10:42:41 UTC