Re: Need to clarify the effects of bidi paragraph breaks

From: timeless <timeless@gmail.com>
Date: Sun, 19 Dec 2010 14:31:28 +0200
Message-ID: <AANLkTing_9_Px8C9pf-kAmFiLef1uCzT0ngjwoaMrmL7@mail.gmail.com>
To: Alan Gresley <alan@css-class.com>
Cc: Ambrose LI <ambrose.li@gmail.com>, "Aharon (Vladimir) Lanin" <aharon@google.com>, W3C style mailing list <www-style@w3.org>, fantasai <fantasai.lists@inkedblade.net>, "public-i18n-bidi@w3.org" <public-i18n-bidi@w3.org>
On Thu, Dec 16, 2010 at 12:57 PM, Alan Gresley <alan@css-class.com> wrote:

>>> TO BE<br>
>>>  OR NOT TO BE?

>> <div dir=ltr>
>> <span dir=rtl>
>> להיות
>> <br>
>>  או לא להיות?
>> </span>
>> <div>-- hamlet, in rtl translation.</div>

> Ah ha, now that's a clue. This also stresses my point regarding having words
> of various scripts that can be meaningful for someone who doesn't understand
> them. I presume this,
>
>   להיות
>
> is Hebrew. A side note: as I pasted it above, the end of the line is on the
> left and the start (Home on the keyboard) is on the right. This is new for me.
>
> Now what does this word mean?

It's a literal, word-for-word translation, and translate.google.com handles it:

http://translate.google.com/#iw|en|להיות%20או%20לא%20להיות%20%3F%0A

ל-היות = to-be
או  = or
לא = not

I've included a hyphen because <to> (-ל) is a prefix in Hebrew, so it
can be parsed independently of the rest of the word.

> What I would like to see is something that is
> recognizable to the eye but has meaning. I know this 漢字 which is written in
> Kanji and I believe is the characters for the word Kanji. My Mongolian
> script (done with Unicode entities &#x1828;) ᠨᠶᠪᠧᠺᠴᡗ is a run of Mongolian
> letters. I wouldn't know if it spells anything.

For well over a decade it has been traditional to use ALL CAPS to mean
<something written in an RTL language>.
IIRC, in markup notation the ALL CAPS bits are serialized in logical
order, and in presentation they are shown in visual order (i.e. what
you'd see on screen, so you can compare it with how it was written in
the markup). Words in translation of course differ in length, but the
value is in being able to recognize when a word has been reversed or
when a pair of words has been swapped, not in the actual letters, which
you as a non-speaker are not expected to understand. English is the
lingua franca here (not Italian, thanks).
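
A toy illustration of the logical vs. visual distinction, as a minimal
Python sketch and not a real bidi implementation. It assumes an LTR
paragraph, treats each run of ALL CAPS words as one RTL run, and ignores
digits, punctuation, and nested embeddings. The name visual_order is made
up for this example.

```python
import re

# One RTL run: ALL CAPS words, including the spaces that sit between two
# ALL CAPS words. Spaces next to lowercase (LTR) text stay outside the run.
RTL_RUN = re.compile(r"[A-Z]+(?: +[A-Z]+)*")


def visual_order(logical: str) -> str:
    """Return a rough left-to-right display order for a logical-order string.

    Assumption: the paragraph direction is LTR and ALL CAPS stands in for
    right-to-left text. Each RTL run is simply reversed in place.
    """
    return RTL_RUN.sub(lambda m: m.group(0)[::-1], logical)


if __name__ == "__main__":
    logical = "hamlet: TO BE OR NOT TO BE"
    print(logical)                # what you would type in the markup
    print(visual_order(logical))  # hamlet: EB OT TON RO EB OT
```

Reading the second line right to left within the caps run gives back
TO BE OR NOT TO BE, which is the point: you can see that the letters
and the words have been reordered without knowing any Hebrew.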

> There must be some way that we can communicate and all understand. I know
> Unicode for less-than and greater-than but I do not know atm what I need to
> use that allows the start of the text flow on the right.

The use of ALL CAPS is designed more or less to address this. Sadly, I
can't find a handy reference for it.

http://www.i18nguy.com/MiddleEastUI.html and its linked pages are probably
worth a read, but don't expect to get a good understanding of ALL CAPS
from them.
Received on Sunday, 19 December 2010 12:31:58 GMT
