RE: proposed question: which languages are RTL?

Hi Tex,

See my inline comments...

============
Richard Ishida
W3C

tel: +44 1753 480 292
http://www.w3.org/International/
http://www.w3.org/People/Ishida/



> -----Original Message-----
> From: public-i18n-geo-request@w3.org 
> [mailto:public-i18n-geo-request@w3.org] On Behalf Of Tex Texin
> Sent: 10 July 2003 09:09
> To: GEO
> Subject: proposed question: which languages are RTL?
> 
> 
> 
> Here is a start on this question-
> 
> Which languages are right-to-left (RTL)?
> 
> Background
> 
> This is a common question, although incorrectly phrased. 
> Knowing which languages are right-to-left is important to web 


Given your opening comment I don't think you can say "which languages
are right-to-left" - perhaps something like "which languages are
associated with right-to-left text".


> designers and authors, because the so called right-to-left 
> languages are more complicated to work with and the 
> organization and directionality of the page layout are 
> affected. Therefore, knowing the writing direction can be 
> relevant to estimating the work involved to create web pages 
> in a new language.
> 
> Why is the question incorrectly phrased? There are 2 
> inaccuracies within this question. First, languages don't 
> have a writing direction, the script used to write them 
> determines the direction. For example, Yiddish is generally 
> written in the Hebrew script, which is right-to-left. But it 
> can also be written using the Latin script which is left-to-right.
> 
> The second inaccuracy concerns the use of the term 
> "right-to-left". Although the majority of the text will be 
> written right-to-left, numbers are still written left-to-right 
> (LTR). In addition, right-to-left text will often include 
> borrowed or foreign words written in their native 
> left-to-right script, and so the text is mixed 
> directionality. The proper term therefore is "bidirectional". 

I don't feel comfortable with the next bit...

> However, "right-to-left" is very commonly used, and as long 
> as it is understood that a script with a "right-to-left" 
> writing direction is in fact bidirectional, the terms 
> "right-to-left" and "bidirectional" can be used 
> interchangeably. 

I don't think we need to lay it on people heavily, but I think they're
better off learning to call it bidirectional rather than rtl.  Partly
so that they don't get corrected by script geeks at Unicode conferences,
but also because it reminds them constantly that enabling these scripts
is not just mirroring.

I think you should also introduce 'bidirectional's little brother,
'bidi' at this point, too.
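
To make the mixed-directionality point concrete, here is a minimal
sketch in Python, assuming only the standard unicodedata module.  The
function name bidi_classes and the sample string are mine, purely for
illustration.  It lists the Unicode bidirectional class of each
character, so you can see a right-to-left letter (class R), a European
digit (class EN) and a left-to-right letter (class L) sitting in the
same string:

```python
import unicodedata

def bidi_classes(text):
    """Return (character name, bidi class) pairs for the non-space characters."""
    return [(unicodedata.name(ch), unicodedata.bidirectional(ch))
            for ch in text if not ch.isspace()]

# Illustrative sample: a Hebrew letter, a European digit, a Latin letter.
sample = "\u05d0 1 a"
for name, cls in bidi_classes(sample):
    print(cls, name)
```

This only reports the character classes; it does not implement the
reordering done by the Unicode bidirectional algorithm.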
 

I think it might be better to be more specific about what people will
gain from following this link before they click on it. It may also be
better to add it to a links section.  Also, I looked at this stuff and
thought it wasn't tremendously helpful for beginners.  
> There is more information on the different 
> directionalities of scripts in: 
> http://www.unicode.org/faq/middleeast.html

You could also point to sites such as 
http://www-3.ibm.com/software/globalization/topics/bidi/index.jsp
http://www.microsoft.com/globaldev/handson/dev/Mideast.mspx
Etc.

> 
> Answer
> 
> Languages 
> generally do have a preferred script and
> writing direction. The following scripts are bidirectional, 
> and therefore languages written in these scripts are also
> bidirectional: Hebrew, Arabic, Syriac, Thaana

You can't say "languages ... have a preferred ... writing direction", as
you pointed out earlier.  How about "a preferred script with a given
writing direction"?
I'd suggest you drop the first para, since you repeat the information
immediately below.
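
As a rough illustration that direction is a property of the script's
characters rather than of the language, here is a small Python sketch,
again assuming only the standard unicodedata module.  The names SAMPLES,
is_rtl_char and needs_bidi are hypothetical, and the one sample letter
per script is my own choice.  It treats any character whose bidi class
is R or AL as right-to-left:

```python
import unicodedata

# One sample letter per script named in the draft (chosen for illustration).
SAMPLES = {
    "Hebrew": "\u05d0",  # HEBREW LETTER ALEF
    "Arabic": "\u0627",  # ARABIC LETTER ALEF
    "Syriac": "\u0710",  # SYRIAC LETTER ALAPH
    "Thaana": "\u0780",  # THAANA LETTER HAA
}

def is_rtl_char(ch):
    """True if the character's Unicode bidi class is R or AL."""
    return unicodedata.bidirectional(ch) in ("R", "AL")

def needs_bidi(text):
    """True if the text contains at least one right-to-left character."""
    return any(is_rtl_char(ch) for ch in text)

for script, letter in SAMPLES.items():
    print(script, needs_bidi(letter))
```

A check like this says nothing about which language the text is in,
which is rather the point: Yiddish in Latin transliteration would not
trigger it, while Yiddish in Hebrew script would.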

> 
> The following languages are generally written in bidirectional
> scripts:
> 
> Hebrew, Yiddish, Ladino,
> Arabic, Farsi/Persian, Syriac, Avesta, Kök Turki, Manchu, 
> Middle Persian, Mongolian, Sogdian, South Arabic, Uighur, 
> Maldivian, Urdu, Kazakh, Uzbek, Tajik, Malay, Swahili, Hausa, 
> Algerian Tribal, old Malay, Baluchi, Kashmiri, Sindhi, 
> Pashto, Landha, Dargwa, Moroccan Arabic, Adighe, Ingush, 
> Berber, Kurdish, Jawi/Javanese.

I think we need to check this list carefully.  
- Malay is currently written with the Latin script.  Formerly Malay was
  written with an Arabic-derived script called Jawi.
- I'd suggest separating out historical scripts.
- I think it would be interesting and possibly germane to provide
  information about the number of speakers of these languages (maybe a
  total) - it's an impressive number - you can get such information from
  the SIL Ethnologue.
- What is Moroccan Arabic?  Is it equivalent to Egyptian Arabic,
  Lebanese Arabic, etc.?

> 
> Note that this list, of necessity, is not complete. There are 
> too many languages in existence to identify them all here.
> 
> Note that languages written in Latin, Slavic, Cyrillic, 
> (Modern) Greek and Thai scripts are left-to-right.
> 

I'm inclined to add what remains below to a "By the way..." section.
It's a bit of an edge case.


> Ideographic languages are more flexible in their writing 
> direction. They are generally written left-to-right, or 
> vertically top-to-bottom (with the vertical lines perhaps 

Not perhaps, always.  Mongolian columns flow ltr, but this is not an
ideographic script.

> proceeding from right to left). However, they may also, 
> optionally, be written right-to-left. Chinese newspapers 

I'd say 'optionally' -> 'occasionally'.
Be cautious here.  It's quite rare these days, although it wasn't before
World War 2.

> sometimes combine all of these writing directions on a page. 
> Fortunately for web designers and authors, in this case, the 
> direction is up to the designer.

Not sure what this means.


Hope that helps,

RI



> 
> -- 
> -------------------------------------------------------------
> Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
> Xen Master                          http://www.i18nGuy.com
>                          
> XenCraft		            http://www.XenCraft.com
> Making e-Business Work Around the World
> -------------------------------------------------------------
> 

Received on Wednesday, 23 July 2003 14:30:08 UTC