RE: simple language testable thing from Charles McCathieNevile on 2004-02-04 (w3c-wai-gl@w3.org from January to March 2004)

From: Charles McCathieNevile <charles@w3.org>
Date: Wed, 4 Feb 2004 13:53:52 -0500 (EST)
To: Mike Barta <mikba@microsoft.com>
Cc: lisa seeman <seeman@netvision.net.il>, Jens Meiert <jens.meiert@erde3.com>, y.p.hoitink@heritas.nl, w3c-wai-gl@w3.org
Message-ID: <Pine.LNX.4.55.0402041340410.14689@homer.w3.org>

On Wed, 4 Feb 2004, Mike Barta wrote:

>
>so I would need to translate serbian within a hrvatskii page but not latin in an english page?

No, I think Lisa's original point was that this is easy for hebrew (since the
only other language I know of written in the hebrew alphabet is yiddish. On
the other hand, maybe that's more common a use case than we suspect - I don't
know since I can't read either of them).

>personally I can see cases where use of a foreign bon mot, even though
>readers may not know the meaning, or a foreign acronym, e.g. CERN, is
>appropriate without translation.  in such cases the author must decide what
>they want to do and whether the use is appropriate for their audience.

I think this issue is related to the question of what is clear language. I
think there is a fair argument that "bon mot" is an english phrase in the
rich english of literature (or the literary english of the rich, perhaps).

But it isn't simple vocabulary one can expect of everyone. I think the
solution technique is the same as for complex vocabulary - being able to do a
glossary lookup, or even run the document through a proxy that does one
automatically, perhaps giving a result like this (but not this one - this
isn't baked enough):

... use of a foreign <a href="http://example.com/k-7glossary?bon_mot">bon
mot</a>, even though...

or even adding a helpful style sheet and giving the following

... use of a foreign <ruby class="coolGloss"><rb>bon mot</rb><rt>clever word
or two</rt></ruby>, even though...

This sort of thing is done automatically by systems such as the idea of smart
links that was floated by Microsoft a while ago, or WikiWords which are
automatically identified by Wiki systems. The Microsoft system I think ran
from a glossary, whereas WikiWords are triggered by a (slightly) more
powerful pattern match. I believe that industrial text translation
support software does this sort of thing routinely, but I haven't tried it.

> I agree with the intent of your suggestion but the impact of it could be to
>ban nearly all english literature from the web due to the many uses of
>foreign phrases and obscure words.  this issue is fraught with subjective
>calls.

Well, not ban. Just state (if we adopt the proposal) that according to WCAG,
lots of literature is not accessible to everyone who speaks the base
language it was written in. Which strikes me as uncontroversial.

Cheers

Chaals

>Lisa:
>Words written in a different alphabet to that of the primary natural
>language of the plain are foreign words and should have a translation
>provided...
>
>> Lisa, earlier:
>> > In Hebrew (for once ) this is easy.
>> > A foreign word is written in a different character set.
>>Jens:
>> CMIIW, but since the UCS (Universal Character Set, often
>> referred to as
>> Unicode) is the document character set for HTML/XML, they
>> (foreign words) ain't written in a different character set.
>>
>> Again referring to to John (see my last post [1]) I claim
>> this is an issue where unimpaired users are affected as well.
>> Also, I don't see any need for ruling language use by the WAI
>> WG (there already was such a discussion a few months ago [2] ;).

Received on Wednesday, 4 February 2004 13:54:05 UTC