RE: Checkpoint 3.1 - Identifying language (authoring burden) from Y.P. Hoitink on 2003-09-05 (w3c-wai-gl@w3.org from July to September 2003)

From: Y.P. Hoitink <y.p.hoitink@heritas.nl>
Date: Fri, 5 Sep 2003 16:53:45 +0200
To: "'Al Gilman'" <asgilman@iamdigex.net>, <w3c-wai-gl@w3.org>
Message-ID: <000201c373bd$85044a90$7b00a8c0@Adder>

> -----Original Message-----
> From: w3c-wai-gl-request@w3.org
> [mailto:w3c-wai-gl-request@w3.org] On Behalf Of Al Gilman
>
> At 06:19 PM 2003-09-04, Y.P. Hoitink wrote:
> [snip]
> > [Some languages use a lot of foreign words]
> >To meet this success
> >criteria, all [instances] of these [foreign words] would have to be 
> >labelled with their original language [which is a heavy burden on
> >authors].
> 

[Al: UA can use lexicon with pronunciation for normally mis-pronounced
terms]

I see the benefit of this development for users, but it doesn't relieve
authors of the obligation to identify the language of each foreign
phrase. Your remark is user-oriented, whereas I'm focussing on the authors of
the web content here. The fact that some users have user agents that know how
to pronounce a foreign phrase doesn't mean authors no longer have to meet this
success criterion. Therefore, I still think this success criterion is too
strict.

[Al: Are these English words in the Dutch dictionary?]
There are two types of foreign words and phrases used in Dutch:

* Those that have become incorporated into the Dutch language to such an
extent that they have been included in the Dutch dictionary. These are
mostly words, not phrases. Words or combined terms like "webmaster",
"website", "self-fulfilling prophecy" and "contentmanager" can be found in a
good Dutch dictionary.

* English phrases that are used as-is, without them becoming integrated into
the Dutch language. These include phrases like "on the job training",
"management by walking around", "server-side scripting", etc. These are not
likely to be included in Dutch dictionaries since they are not words but
phrases. 

For the first group, you can argue that they don't have to be labelled with
a foreign language since the words have been incorporated into Dutch.
However, the Dutch pronunciation is totally different, so labelling these
words with their original language does help accessibility a lot. I
personally think it's a best practice to identify the language of these
types of foreign words as well.

The second group poses the most problems. Because they tend to be longer,
it's harder to make sense of the meaning from the context, and they can't be
found in a Dutch dictionary. But we're getting into cognitive aspects of
foreign texts here, whereas my original observation was about identifying
the language to benefit screen readers. 

What occurs to me now because of your remarks is that another benefit of
identifying the language of a fragment is that it allows a user to know
which dictionary to use when looking up the meaning of that phrase. Of
course, user agents would have to enable the user to find out the language
to do that. This benefit isn't listed in 3.1, and perhaps could be.

Yvette Hoitink

Received on Friday, 5 September 2003 10:54:06 UTC