Re: bidi discussion list (was: Bidi Markup vs Unicode control characters)

I would like to see your list of languages using RTL scripts.

The only scripts identified as RTL in Unicode are Arabic and Hebrew. (Then 
there is the strange case of Mongolian which is marked as LTR but I think 
should be treated as "RTL rotated to read top-down".)
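
For anyone who wants to check how individual characters are classified, here is a minimal sketch using Python's standard unicodedata module. The helper names (bidi_class, is_strong_rtl) and the sample characters are my own choices for illustration, and the result reflects whatever Unicode version the local Python build ships with, not necessarily the version discussed in this thread.

```python
import unicodedata


def bidi_class(ch: str) -> str:
    """Return the Unicode Bidi_Class short name for one character."""
    return unicodedata.bidirectional(ch)


def is_strong_rtl(ch: str) -> bool:
    """True if the character has a strong right-to-left class (R or AL)."""
    return bidi_class(ch) in ("R", "AL")


# Illustrative sample characters (chosen for this sketch only).
SAMPLES = {
    "HEBREW LETTER ALEF": "\u05d0",
    "ARABIC LETTER ALEF": "\u0627",
    "MONGOLIAN LETTER A": "\u1820",
    "LATIN SMALL LETTER A": "a",
}

if __name__ == "__main__":
    for name, ch in SAMPLES.items():
        print(f"{name}: class={bidi_class(ch)} strong_rtl={is_strong_rtl(ch)}")
```

The Mongolian letter reports a left-to-right class here, which is the "marked as LTR" behaviour mentioned above.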

As you and I both said, nearly all of the cases where you would need to mark 
directionality are either detected by the Unicode BiDi Algorithm or 
coincide with semantic markup. I therefore question the need to do 
anything further, especially if preferred directionality (the embed cases) 
can be tied to xml:lang (which I think will cause the override cases to 
disappear).
   Maybe the language/country/script codes should be fixed/completed rather 
than adding another "layer" of markup or a new xml:dir attribute. I think 
language and directionality are tightly coupled and need to be identifiable 
in general XML markup, so I question the value of marking them 
independently (it's hard enough to get people to mark language at all, let 
alone correctly). I would prefer to see a markup/attribute approach in XML 
rather than character-level Unicode markers, since it is too easy to get 
the markup and the markers out of sync. (Except for people on these working 
groups, most people tend to understand "incremental state" or "hierarchical 
markup", but not both. Mixing the two is asking for trouble.)
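
To make the "tie direction to xml:lang" idea concrete, here is a rough sketch of how a processor might derive a default direction from a language tag. Everything in it is an assumption for illustration: the function name default_direction, the two small lookup tables, and the premise that the tag may carry a four-letter ISO 15924 style script subtag. The tables are deliberately incomplete, which is exactly the "codes should be fixed/completed" problem.

```python
from typing import Optional

# Hypothetical, incomplete set of script codes treated as right-to-left.
RTL_SCRIPTS = {"Arab", "Hebr"}

# Hypothetical fallback: the script usually assumed when the tag names
# only a language and carries no explicit script subtag.
DEFAULT_SCRIPT = {
    "ar": "Arab",
    "fa": "Arab",
    "ur": "Arab",
    "he": "Hebr",
    "yi": "Hebr",
}


def script_of(lang_tag: str) -> Optional[str]:
    """Return the explicit or assumed script code for a language tag."""
    subtags = lang_tag.replace("_", "-").split("-")
    language = subtags[0].lower()
    for sub in subtags[1:]:
        # A four-letter alphabetic subtag is read as a script code.
        if len(sub) == 4 and sub.isalpha():
            return sub.title()
    return DEFAULT_SCRIPT.get(language)


def default_direction(lang_tag: str) -> str:
    """Return 'rtl' or 'ltr' as the preferred base direction for the tag."""
    return "rtl" if script_of(lang_tag) in RTL_SCRIPTS else "ltr"


if __name__ == "__main__":
    for tag in ("ar", "he-IL", "az-Arab", "az-Latn", "en-US"):
        print(tag, default_direction(tag))
```

A language with no entry in either table silently falls back to "ltr", which illustrates the objection quoted below about rarer languages that may not yet have a code.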


At 2005.08.15-16:54(+0900), Martin Duerst wrote:
>At 15:32 05/08/10, Ognyan Kulev wrote:
>
> >Isn't direction implied in xml:lang which is part of the core XML spec?
>
>No. This was considered for html, but the problem is that there are too
>many languages written with Arabic, Hebrew, and other RTL scripts, and
>some of them (the rarer ones) may not yet have a code, so it would be
>difficult to implement this in a browser. That's why it was rejected.
>
>Regards,    Martin.


---Steve Deach
    sdeach@adobe.com 

Received on Monday, 15 August 2005 14:47:00 UTC