Re: The fate of Hebrew texts with Hyphen-Minus instead of Maqaf

From: <bidi@prognathous.mail-central.com>
Date: Tue, 16 Sep 2003 19:05:21 +0200
To: "Mark Davis" <mark.davis@jtcsv.com>, www-international@w3.org
Message-Id: <20030916170521.4BE78651EC@smtp.us2.messagingengine.com>

On Tue, 16 Sep 2003 07:56:35 -0700, "Mark Davis" <mark.davis@jtcsv.com>
said:
> 1. The Unicode BIDI algorithm does specify the rendering order of the
>    sequence HebrewLetter+HyphenMinus+Number. If that rendering order is
>    not what is desired, then it also provides a way to override it.

As the UBA specifies it, the rendering order of such sequences is
practically *never* the desired one.

As for overriding it, I can see how that may be a valid solution when
typing new texts, but how can it be done when rendering existing texts?
Would you recommend inserting control characters into existing texts
before rendering them?
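
To make the question concrete, here is a minimal sketch of what "planting control characters" into an existing text could look like. It assumes that a RIGHT-TO-LEFT MARK (U+200F) placed right after the Hyphen-Minus is the mark a renderer would need, and that only Hyphen-Minus between a Hebrew letter (U+05D0..U+05EA) and a digit should be touched. The function name `plant_rlm` and both of those choices are illustrative assumptions, not something proposed in this thread.

```python
import re

RLM = "\u200F"  # RIGHT-TO-LEFT MARK (assumed choice of control character)

# Hyphen-Minus directly preceded by a Hebrew letter (alef..tav)
# and directly followed by a digit.
HEBREW_HYPHEN_NUMBER = re.compile(r"(?<=[\u05D0-\u05EA])-(?=\d)")


def plant_rlm(text: str) -> str:
    """Insert an RLM after each Hyphen-Minus that sits between a
    Hebrew letter and a digit; leave every other hyphen untouched."""
    return HEBREW_HYPHEN_NUMBER.sub("-" + RLM, text)


if __name__ == "__main__":
    sample = "\u05D4-20"  # HE, HYPHEN-MINUS, "20"
    print([f"U+{ord(ch):04X}" for ch in plant_rlm(sample)])
```

Note that this changes the stored text, so it raises the same data-integrity concerns as any other conversion of existing documents.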

> 2. Unicode BIDI algorithm is a rendering algorithm. It has nothing to
>    do with keyboards.

It is due to a keyboard limitation that most existing Hebrew texts are
not rendered by the UBA as intended by their authors. It's all quite
simple:

1. The Maqaf is not available in the Hebrew keyboard layout. Due to this
   limitation, users type Hyphen-Minus instead.
2. As a result, virtually all Hebrew texts include sequences of
   HebrewLetter+HyphenMinus+Number instead of HebrewLetter+Maqaf+Number.
3. These sequences are not rendered properly by applications that
   implement the UBA.
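
The difference between the two sequences comes down to the bidirectional class that Unicode assigns to each character. A small sketch using Python's standard `unicodedata` module can list those classes. The helper name `bidi_classes` is hypothetical, and the classes printed are those of the Unicode version bundled with the Python in use, which may not match the data applications used in 2003.

```python
import unicodedata


def bidi_classes(text: str):
    """Return (code point, character name, bidi class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))
        for ch in text
    ]


if __name__ == "__main__":
    with_hyphen_minus = "\u05D4-2"     # HebrewLetter + HyphenMinus + Number
    with_maqaf = "\u05D4\u05BE2"       # HebrewLetter + Maqaf + Number
    for label, seq in (("Hyphen-Minus", with_hyphen_minus), ("Maqaf", with_maqaf)):
        print(label)
        for row in bidi_classes(seq):
            print("  ", *row)
```

The Maqaf is a strong right-to-left character like the Hebrew letters around it, whereas the Hyphen-Minus is a weak character whose placement depends on its neighbours.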

Here's a typical example:

The English phrase "The 20th century" is translated into Hebrew as:
  "המאה ה־20"

Due to the lack of Maqaf, most people have to use a Hyphen-Minus:
  "המאה ה-20"

In applications that implement the UBA, you will find the following:
  "-20 "

Not only is this order wrong (and an eyesore), but it also breaks the
meaning of the sequence. Instead of a positive number, you get a
negative one.

I hope this is clearer.

Prog.

> __________________________________
> http://www.macchiato.com ► “Eppur si muove” ◄
>
> ----- Original Message -----
> From: <bidi@prognathous.mail-central.com>
> To: <www-international@w3.org>
> Sent: Monday, September 15, 2003 12:45
> Subject: RE: The fate of Hebrew texts with Hyphen-Minus instead of Maqaf
>
>
> >
> > I'd like to wrap this up.
> >
> > My understanding is that the Unicode BiDi Algorithm does not provide
> > a solution for rendering of *existing* Hebrew texts that include
> > sequences of HebrewLetter+HyphenMinus+Number, nor does it provide a
> > solution for entry of such sequences with current systems that do not
> > map the Hebrew Punctuation Maqaf to the keyboard.
> >
> > Any objections to the above conclusion?
> >
> > Prog.
> >
> > On Sun, 24 Aug 2003 19:18:04 +0200,
> > bidi@prognathous.mail-central.com said:
> > >
> > > On Sun, 24 Aug 2003 16:27:31 +0200, "Jony Rosenne"
> > > <rosennej@qsm.co.il> said:
> > > > For Hebrew, the Maqaf should be used.
> > >
> > > I fully agree that the Maqaf should be used. In fact, I created a
> > > customized Hebrew keymap that replaces the non-numpad
> > > Hyphen-Minus with the Maqaf, and this is what I use when writing
> > > Hebrew. But there are massive amounts of *existing* texts that use
> > > Hyphen-Minus instead (virtually all of them). What will be their
> > > fate? "Are they doomed forever to render wrongly under
> > > applications that use the Unicode BiDi algorithm?"
> > >
> > > > Handling the change and the conversion has not been seriously
> > > > tackled in any major environment.
> > >
> > > I'm working on it, but there are currently several obstacles that
> > > complicate this campaign:
> > > 1. Badly rendered Maqaf glyphs in most common fonts (it's usually
> > >    too high). http://exego.net/forums/showMessage.asp?i=9320&qs=
> > > 2. The Maqaf and some other punctuation marks are not included in
> > >    the Israeli Keyboard Layout Standard (SI-1452). This will
> > >    hopefully change, but it takes time to convince everyone on
> > >    TC-2109 that adding these marks would be a worthwhile move.
> > > 3. It may not be easy to educate users to accept and use the
> > >    correct Hebrew punctuation marks, instead of foreign ones.
> > > 4. Data integrity issues have to be taken into consideration (e.g.
> > >    searching Hebrew texts for Maqaf/Minus, Geresh/Apostrophe, and
> > >    Gershaim/Quotes)
> > >
> > > All of these points are important and once solved, would mean that
> > > the Maqaf could be a viable solution, but the fate of existing
> > > texts is just as important (and is the main subject of this
> > > thread).
> > >
> > > Any suggestions?
> > >
> > > Prog.
> > >
> > >
> > > >
> > > > Jony
> > > >
> > > > > -----Original Message-----
> > > > > From: www-international-request@w3.org
> > > > > [mailto:www-international-request@w3.org] On Behalf Of
> > > > > bidi@prognathous.mail-central.com
> > > > > Sent: Wednesday, August 20, 2003 12:23 AM
> > > > > To: www-international@w3.org
> > > > > Subject: The fate of Hebrew texts with Hyphen-Minus instead of
> > > > > Maqaf
> > > > >
> > > > >
> > > > >
> > > > > For the sake of the argument, let's assume that Hebrew
> > > > > Punctuation Maqaf is now part of the official keyboard layout;
> > > > > that it is implemented well (both in fonts and keymap) in all
> > > > > major operating systems; and that users of Hebrew accept the
> > > > > > new addition and start to use it from then on. What will be the
> > > > > > fate of all Hebrew texts that used Hyphen-Minus instead? Are
> > > > > > they doomed forever to render wrongly under applications that
> > > > > > use the Unicode BiDi algorithm? By "wrongly", I mean strictly
> > > > > > relative to the way the original authors intended them to
> > > > > > render.
> > > > >
> > > > > Further discussion about this problem can be found here:
> > > > > > http://bugzilla.mozilla.org/show_bug.cgi?id=73251#c32
> > > > >
> > > > > Prog.
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> >
> >
>
Received on Tuesday, 16 September 2003 13:05:33 GMT
