RE: Examples of language changes in websites from Richard Ishida on 2003-12-04 (w3c-wai-gl@w3.org from October to December 2003)

From: Richard Ishida <ishida@w3.org>
Date: Thu, 4 Dec 2003 13:04:11 -0000
To: "'Yvette P. Hoitink'" <y.p.hoitink@heritas.nl>, "'Jens Meiert'" <jens.meiert@erde3.com>
Cc: <w3c-wai-gl@w3.org>
Message-ID: <004001c3ba67$1cc44b10$2982fea9@w3c40upc3ma3j2>
Hello Yvette,

Interesting article.  Given the, I would say, unusual linguistic dexterity and penchant for English of the Dutch people, I think you have picked a source of examples that presents an uncommon degree of subtlety and pervasiveness - but this is certainly a phenomenon that occurs in all languages to a lesser or greater extent.  It re-raises in my mind a question I've been mulling over for some time.

When tagging language variations in text, where do you draw the line between what is clearly a foreign language phrase, what is a neologism drawn from a foreign source, and what is one of the latter that has become an adopted part of the language?  

For (another) example, in English we talk about 'a certain je ne sais quoi'.  I would think that a large proportion of people in the UK would understand this half English half French phrase these days, due to the influence of recent advertising and cooking programs, etc., though I'd say that is a fairly recent phenomenon.  Is it English or French?

I particularly like your example of 'cadeaushoppen' - which is French, English and Dutch all at the same time.

I'm not clear what the answer would be.  Do you have any suggestions?  

I suppose one pragmatic approach might be to say that, to support accessibility, anything that's not in the dictionaries of the text to speech processors should be marked up.  But there appear to be a number of problems with that approach in my mind: different dictionaries presumably contain different entries; how does the average content author find out (quickly, while creating content) what words or phrases are likely to be problematic?; this looks at the problem only from the point of view of the text to speech processor; are the dictionaries being kept sufficiently up to date?; there'd presumably be neologisms in the native language that fall into this category too, where to draw the line; etc.

Another approach could be to provide phonetic transcriptions for ambiguous phrases, which could even propose a pronunciation that uses only native phonemes.  I just saw a nice example of this in the SSML spec, where they render 'La vita è bella' as '<phoneme alphabet="ipa" ph="&#x2C8;l&#x251;&#x2C8;vi&#x2D0;&#x27E;&#x259; &#x2C8;&#x294;e&#x26A;&#x2C8;b&#x25B;l&#x259;">La vita è bella </phoneme>'  [lɑ ˈviːɾə ˈʔeɪ ˈbɛlə for those of you who can read the IPA in this mail  **]  The phonemic transcription here is American (rather than British) English (uses a flap for t in vita) and not true to the actual Italian pronunciation (long ei for è, lack of lengthening for l in bella).  Unfortunately, XHTML doesn't yet provide the markup for us to do that.  Should it?  That would solve some other problems in my mind, such as how to pronounce "Caius College" (pronounced "keys college") and President Tito (pronounced "sutto"), the president of the Republic of Kiribati (pronounced "kiribass").  Or even how to pronounce acronyms like "CCAT" (sea-cat) or "NATO" (nei-toe).

But now we're getting way ahead of WCAG 2.0...

Does anyone have any other suggestions?


RI










** You can convert the NCRs to characters at http://people.w3.org/rishida/scripts/uniview/conversion.htm



============
Richard Ishida
W3C

contact info: http://www.w3.org/People/Ishida/ 

http://www.w3.org/International/ 
http://www.w3.org/International/geo/ 

W3C Internationalization FAQs
http://www.w3.org/International/questions.html
RSS feed: http://www.w3.org/International/questions.rss



> -----Original Message-----
> From: w3c-wai-gl-request@w3.org 
> [mailto:w3c-wai-gl-request@w3.org] On Behalf Of Yvette P. Hoitink
> Sent: 04 December 2003 10:21
> To: 'Jens Meiert'
> Cc: w3c-wai-gl@w3.org
> Subject: RE: Examples of language changes in websites
> 
> 
> 
> Hello Jens and list,
> 
> Thanks for your comments about my article. Judging by your 
> remarks, I haven't explained well enough what was my goal 
> with the article. A few weeks ago, guideline 3.1 was 
> discussed in the telecon. Specifically, we were talking about 
> the need to identify the language, including language 
> changes, since not doing so could lead to serious 
> accessibility problems. Wendy and other people mentioned they 
> would like some real world examples of situations where 
> changes in language which were not identified could lead to 
> accessibility problems. 
> 
> That's what I tried to do in this article. I simply meant to 
> describe different examples of language switches, and briefly 
> touch the accessibility problems that result from those. I 
> never meant to ban English words from Dutch (I would be left 
> with a decimated vocabulary myself). I totally agree with you 
> that that has nothing to do with WAI. I also didn't mean that 
> authors using a lot of English words should reword their 
> websites. I just meant to show the accessibility problems 
> that can occur because of this, which _is_ a WAI issue.
> 
> I tried to describe what happens in real life, so WCAG can 
> come up with guidelines to help the authors of that site make 
> their page accessible, English phrases and all. I did not 
> mean that people had to rewrite their pages or ban English 
> words. Making a page such as the LogicaCMG one accessible 
> would require, in my opinion, that each English phrases is 
> identified as such. 
> 
> Also, these examples can be used to see how much effort a 
> certain guideline would mean for some pages. Identifying the 
> languages of the LogicaCMG site would require a lot more 
> effort than the Shell example, for instance. 
> 
> I will go into some of the points you made in more detail. 
> 
> 
> > > http://www.heritas.nl/wcag/language.html
> > 
> > first of all, I'm not sure if 'Cadeaushoppen' is an English
> > term (CMIIW), sounds either Dutch or like a made-up word to 
> > me. 
> 
> Yes, you're right. That's what I tried to explain below that 
> list of English
> words:
> "Another Dutch phenomenon can be seen here, the hybrid word 
> "cadeaushoppen" where a Dutch word is glued to an English 
> word, apparently to make it sound more dynamic. " 
> 
> FYI: The Dutch language has a powerful mechanism for creating 
> new words, just like other languages such as German do. You 
> can simply glue together two words and make a new one. 
> "Giftshopping" would be the corresponding English word if 
> English had the same mechanism. Such words can be constructed 
> on-the-fly and because of this, many can not be found in 
> dictionaries. With a hybrid word such as 'cadeaushoppen', the 
> correct pronounciation would be French for 'cadeau' and 
> English for 'shoppen'. Tag that in your HTML :-)
> 
> > And maybe you started a biased analysis,
> > scrutinizing only international companies.
> 
> My task as I saw it was to come up with examples where 
> language changes would lead to accessibility problems. I 
> agree with you that international companies are more likely 
> to use English words and phrases and that this is not a 
> representative sample of Dutch websites. But that was not my goal. 
> 
> > PS.
> > Please excuse the provocative tenor in this post, but I claim
> > there are above all several constructive assertions.
> 
> No offense taken, I appreciate any comments about my article. 
> Your post helps me to realize I have not succeeded in 
> describing well enough what the purpose of the article was. I 
> will wait for further comments and then rewrite parts of it.
> 
> Yvette Hoitink
> CEO Heritas, Enschede, The Netherlands
> E-mail: y.p.hoitink@heritas.nl
>
Received on Thursday, 4 December 2003 08:04:25 UTC