RE: English words in Hebrew text RE: Report for ISOC IL FTF from lisa seeman on 2003-12-24 (w3c-wai-gl@w3.org from October to December 2003)

From: lisa seeman <seeman@netvision.net.il>
Date: Wed, 24 Dec 2003 10:23:07 +0200
To: "'Charles McCathieNevile'" <charles@w3.org>, "'Richard Ishida'" <ishida@w3.org>
Cc: "'WAI-GL'" <w3c-wai-gl@w3.org>, "'Martin J. Durst'" <duerst@w3.org>
Message-ID: <024701c3c9f7$2be37d60$ad00000a@patirsrv.patir.com>
English words in Hebrew text aren't considered Hebrew. It is just an
example of typical usage.

I do not see it as a problem for internationalization. The Lang tag is
correct from the internationalization standpoint, in the main because no
one is pushing for legislation to require correct encoding of HTML
because of internationalization issues.
Accessibility is a different story. I think part of the thinking for
having the requirement for the Lang tag at minimal accessibility was
that for some pages it is very important and it is not too
burdensome. However, in Israel it is burdensome (unless people are
using transcoding - like SWAP :) ) and, on the other hand, the importance
to accessibility is questionable. Assistive technologies and user agents
used in Israel default to English where the encoding is in the Latin
character set - unless directed otherwise. That may not be ideal, and it
is not because English is that prevalent generally (there are more
Russian and Arabic speakers), but because of the strong English-speaking
influence on net culture and the user agents' need to work with the
reality of what is out there.
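
To make the default behaviour described above concrete, here is a minimal
Python sketch of that kind of heuristic: split text into runs by script and
assume Hebrew script means Hebrew and Latin script means English. The names
script_of, DEFAULT_LANG and guess_runs are made up for this illustration and
do not describe how any particular assistive technology is implemented.

```python
import unicodedata

def script_of(ch):
    """Return a coarse script label for one character, based on its Unicode name."""
    if not ch.isalpha():
        return None  # spaces, digits and punctuation carry no script of their own
    name = unicodedata.name(ch, "")
    if name.startswith("HEBREW"):
        return "Hebrew"
    if name.startswith("LATIN"):
        return "Latin"
    return "Other"

# Assumption for this sketch: Latin script is read as English unless told otherwise.
DEFAULT_LANG = {"Hebrew": "he", "Latin": "en"}

def guess_runs(text):
    """Group whitespace-separated words into (guessed_language, text) runs."""
    runs = []
    for word in text.split():
        # The first letter in the word decides the script of the whole word.
        script = next((s for s in map(script_of, word) if s), None)
        lang = DEFAULT_LANG.get(script, "und")  # "und" = undetermined
        if runs and runs[-1][0] == lang:
            runs[-1] = (lang, runs[-1][1] + " " + word)
        else:
            runs.append((lang, word))
    return runs

# Hebrew text with an embedded Latin-script acronym (escapes avoid bidi display issues).
print(guess_runs("\u05e9\u05dc\u05d5\u05dd W3C \u05e2\u05d5\u05dc\u05dd"))
# Latin-script text that is not English still gets guessed as "en".
print(guess_runs("selamat pagi"))
```

The second call shows the limit of the heuristic: the script identifies the
alphabet, not the language, so any non-English Latin-script text would be
voiced as English unless it is marked up explicitly.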

So we have a strange situation where, for Israel, we have a P1 success
criterion that will affect most pages in Hebrew, is considered
burdensome, and is not particularly helpful to accessibility.

That would perhaps justify modifying the wording of the criterion to
exclude this type of occurrence, where user agents are extremely likely to
decode any language change correctly themselves.
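
One possible reading of such an exception, as a rough Python sketch: explicit
language markup would only be needed for passages whose real language differs
from what a user agent would assume from the script. The ASSUMED table and
the runs_needing_markup function are hypothetical names for this
illustration, not proposed wording for the criterion.

```python
# Assumption for this sketch: what a user agent guesses from the script alone.
ASSUMED = {"Hebrew": "he", "Latin": "en"}

def runs_needing_markup(runs):
    """runs is a list of (script, actual_language, text) tuples.

    Return only the runs a user agent would guess wrong, i.e. the ones
    that would still need an explicit language label.
    """
    return [run for run in runs if ASSUMED.get(run[0]) != run[1]]

page = [
    ("Hebrew", "he", "\u05e9\u05dc\u05d5\u05dd"),  # Hebrew body text
    ("Latin", "en", "W3C"),                        # embedded English
    ("Latin", "id", "selamat pagi"),               # embedded Indonesian
]
print(runs_needing_markup(page))
```

Under these assumptions only the Indonesian passage would need a label; the
Hebrew and English runs would be left to the user agent.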

All the best
Lisa Seeman
 

-----Original Message-----
From: w3c-wai-gl-request@w3.org [mailto:w3c-wai-gl-request@w3.org] On
Behalf Of Charles McCathieNevile
Sent: Tuesday, December 23, 2003 5:53 PM
To: Richard Ishida
Cc: 'lisa seeman'; 'WAI-GL'; Martin J. Durst
Subject: English words in hebrew text RE: Report for ISOC IL FTF



In other words, the ISOC group in Israel are possibly doing one of two
things:

1. Asserting that there are a lot of English words which are recognised
as Hebrew, although requiring a different pronunciation. Fortunately,
they are easy to spot because they use a different alphabet, and the
tools recognise them.

This would seem to be a fine claim to make - although I would like to
see some connection to what the vocabulary they are using is...

2. Assuming that words included in Hebrew text that are written in Latin
script are automatically English. I trust this isn't their assertion.

cheers

Chaals

On Mon, 22 Dec 2003, Richard Ishida wrote:

>
>Hi Lisa,
>
>> From: w3c-wai-gl-request@w3.org [mailto:w3c-wai-gl-request@w3.org] On
>> Behalf Of lisa seeman
>> Sent: 22 December 2003 05:55
>
><snip>
>
>> passages or fragments of text occurring within the content that are 
>> written in a language other than the primary natural language of the 
>> content as a whole, are identifiable, either through the character 
>> encoding used or through direct including specification of the 
>> language of the passage or fragment. [X]
>
>Character encoding information helps you know the script, which may be 
>useful for font selection or some other rendering considerations, but 
>doesn't help you with selecting the right voice for pronunciation of 
>the text.  For example, ASCII text could just as easily be Indonesian 
>or Malaysian as English.  Text using 'Latin1' characters could 
>represent a very wide range of languages. So 'either through the 
>character encoding used' would be inappropriate, unfortunately.
>
>To help me better understand the issue, could you briefly characterise 
>for me the type of content that causes the problem?  Is it English? How
>much of it is there (as a very rough average)?  Is much of it acronyms?
>proper names? technical words? etc.
>
>Exploring solutions: can one assume that Israeli text to speech systems
>can deal pretty well with the embedded non-Hebrew stuff?  Does that 
>apply to the tts systems dealing with other languages?  If Hebrew 
>systems deal with English ok, maybe you'd only have to label stuff that
>was, say, Indonesian or Malay??
>
>RI
>

Charles McCathieNevile  http://www.w3.org/People/Charles  tel: +61 409 134 136
SWAD-E http://www.w3.org/2001/sw/Europe         fax(france): +33 4 92 38 78 22
 Post:   21 Mitchell street, FOOTSCRAY Vic 3011, Australia    or
 W3C, 2004 Route des Lucioles, 06902 Sophia Antipolis Cedex, France
Received on Wednesday, 24 December 2003 03:27:45 UTC