- From: <bidi@prognathous.mail-central.com>
- Date: Wed, 17 Sep 2003 09:58:30 +0200
- To: "Mark Davis" <mark.davis@jtcsv.com>, www-international@w3.org
On Tue, 16 Sep 2003 15:14:59 -0700, "Mark Davis" <mark.davis@jtcsv.com> said:

> The bidi algorithm was designed in full knowledge that it would not be
> able to handle all ordering cases,

Right now the algorithm doesn't provide an acceptable solution for Hebrew users, as it breaks the rendering of most existing texts.

> because there is often not enough information in the text to provide
> for the right ordering,

Real-life implementations show that there is more than enough information to define a strict set of rules for handling HebrewLetter+HyphenMinus+Number sequences without producing any false positives. To the best of my knowledge, there are no cases in the Hebrew language where a negative number is preceded by a Hebrew letter without another HyphenMinus/Maqaf in between ("-20ýä"). Since there's no ambiguity here, it should be entirely possible to revise the algorithm so that it deals with such sequences.

> or there are inconsistencies between different usage patterns,

Which usage patterns exactly? I can't think of one that this revision would break.

> or the rules to do so would be too complex.

These rules are already defined and implemented by Microsoft and by other vendors, such as Mellel for OS X. Is it better to keep the UBA a little less complex, but inadequate for the proper rendering of most existing texts?

> For that reason, it supplies various mechanisms to override the normal
> ordering results.

Which don't help one bit when rendering existing texts.

> Corresponding mechanisms have been developed for HTML and internally in
> word processing modules.

HTML requires knowledge that most users don't have. Moreover, it doesn't help when dealing with plain text. As for word processing modules, what set of rules should they follow? Why not add to the UBA a single, standard set of rules that deals with such cases?

> Such overrides should be added to the text when being composed
> or edited.

As I said, this suggestion doesn't help with the rendering of existing texts.

> (Added just before rendering is not recommended, since the text would
> appear different than on systems that don't have this special override.

So, what solution does the UBA have to offer for dealing with HyphenMinus+Number sequences in existing texts?

> If the Maqaf is a necessary character for Hebrew, then you may wish to
> lobby those organizations supplying Hebrew keyboards to get it added.

I'm working on it, but there are currently several obstacles that complicate this campaign:

1. Badly rendered Maqaf glyphs in most common fonts (it's usually too high).
   http://exego.net/forums/showMessage.asp?i=9320&qs=

2. The Maqaf and some other punctuation marks are not included in the Israeli Keyboard Layout Standard (SI-1452). This may hopefully change, but it takes time to convince everyone on TC-2109 that adding these marks would be a worthwhile move.

3. It may not be easy to educate users to accept and use the correct Hebrew punctuation marks instead of foreign ones.

4. Data integrity issues have to be taken into consideration (e.g. searching Hebrew texts for Maqaf/Minus, Geresh/Apostrophe, and Gershaim/Quotes).

All of these points are important, and once they are solved the Maqaf could become a viable solution, but the fate of existing texts is just as important (and is the main subject of this thread).

Prog.
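
For readers who want to see how mechanically the HebrewLetter+HyphenMinus+Number sequences discussed above can be recognized, here is a minimal Python sketch. It is only an illustration: the function name and the pattern are hypothetical, it is not part of the UBA, and it does not reproduce the rules implemented by Microsoft or Mellel. It assumes that "Hebrew letter" means U+05D0..U+05EA and that "Number" means a run of ASCII digits, and it only locates candidate sequences in logical order without reordering anything.

```python
import re

# Assumption: "HebrewLetter" = alef..tav (U+05D0..U+05EA), "Number" = ASCII digits.
# The pattern is a Hebrew letter, then HYPHEN-MINUS (U+002D), then digits.
PREFIX_HYPHEN_NUMBER = re.compile(r"[\u05D0-\u05EA]-[0-9]+")


def find_prefix_hyphen_number(text):
    """Return (start, end) offsets, in logical order, of every
    HebrewLetter+HyphenMinus+Number sequence found in text."""
    return [m.span() for m in PREFIX_HYPHEN_NUMBER.finditer(text)]


if __name__ == "__main__":
    sample = "\u05D1-20"  # the letter bet, a hyphen-minus, then "20"
    print(find_prefix_hyphen_number(sample))
```

A sequence that uses the Maqaf (U+05BE) instead of HyphenMinus, or a number that follows a space or a Latin letter, is deliberately not matched, since the argument above concerns only the HyphenMinus case after a Hebrew letter.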
Received on Wednesday, 17 September 2003 03:58:36 UTC