Re: Styling of embedded right-to-left text in source code visualization from Najib Tounsi on 2007-09-11 (public-i18n-its@w3.org from July to September 2007)

From: Najib Tounsi <ntounsi@emi.ac.ma>
Date: Tue, 11 Sep 2007 14:21:26 +0000
To: Martin Duerst <duerst@it.aoyama.ac.jp>
CC: Felix Sasaki <fsasaki@w3.org>, Richard Ishida <ishida@w3.org>, public-i18n-its@w3.org
Message-ID: <46E6A466.7070009@emi.ac.ma>
Hi Martin,

Martin Duerst wrote:
> Hello Felix, Najib, others,
>
> For some work on this issue, please also see my IUC28 paper at
> http://www.sw.it.aoyama.ac.jp/2005/pub/IUC28-bidi/IUC28.html
> and the simulation page at
> http://www.sw.it.aoyama.ac.jp/cgi-bin/bidi-source-test
> and some additional info at
> http://www.sw.it.aoyama.ac.jp/2005/pub/IUC28-bidi/.
>
>   
Many thanks for these  links.

I tested your simulation bidi-source-test, it is excellent.
 
But for
<phone> 12 34 56</phone>
it gives
<phone> 12 34 56</phone>
which is OK.
and for (uppercase is arabic)
<phone>MOBILE: 12 34 56</phone>
it gives
<phone> 56 34 12 :ELIBOM</phone>

Why phone digits are listed in RTL?

> At 04:17 07/09/08, Najib Tounsi wrote:
>   
>> Hi Felix,
>>
>> Felix Sasaki wrote:
>>     
>>> Hi Najib and Richard,
>>>
>>> at least with Najib we discussed styling of "right-to-left" text in source code visualization a while ago, 
>>>       
>> Yes,
>> http://lists.w3.org/Archives/Public/public-i18n-its/2007AprJun/0036.html
>> http://lists.w3.org/Archives/Public/public-i18n-its/2007AprJun/0038.html
>>
>> I've noted that rendering bidi text in source code is editor (or tool) dependent and have suggested that a user should set his/her preference: Override Yes or Not the bidi algorithm, so that, if Yes, punctuations in the markup  and  normal text can't interfere and give unexpected rendering.
>>     
>
> My guess is that almost always, in the above sense, the user would
> choose "override yes". Just blindly applying the Unicode bidi algorithm
> to something it's not designed to handle virtually always results in
> chaos that shows as garbage.
>
> But that's not the main problem. Even when we agree that the Unicode
> Bidi algorithm as such isn't suited for source display (be that HTML/
> XML or some programming language source or something else), there
> are many ways to do a better job, and different users may prefer
> different ways depending on their background and on the documents
> at hand.
>
>   
Sure, there are better ways to do things and it depends on users/documents.
It is also true that working directly in source code is exceptional, and 
could be done only by advanced users for some "fine tuning".

>   
>> Richard had already discussed this problem,
>> http://www.w3.org/International/geo/html-tech/tech-bidi.html#d2e277
>> but there is no satisfactory solution yet.
>>
>> Editing source code is not a usuall activity so, between the next three lines
>> <p title="ATTRIBUTE">CONTENT</p> (normal styling)
>> <p title="TNETNOC<"ETUBIRTTA</p> (some browser styling)
>> <p title="ETUBIRTTA">TNETNOC</p> (memory order)
>> I prefer the third which is much sure for inserting a space for example.
>>     
>
> Assuming the usual "upper case is RTL" convention, I fully
> understand why you prefer the third, but I don't understand
> why you label it as "memory order". The first is in memory
> order, isn't it? 
Oups.
Yes it is. I've (badly) transcripted my arabic example to latin 
uppercase. Please see attached image.

> The third is what we would like to see, but
> what as far as I understand, no editor currently does.
>   
:-(

[...]

>> The question remains: For source text, is correct bidi-rendering desirable for a given user for a given need?
>>     
>
> What do you mean by "correct bidi rendering"? 
Sorry, "correct" is not the right word.
What I wanted to say is that with **mixed LTR+RTL content, there is two 
levels of source code:
- the one with applying bidi algorihm, possibly with something else. 
(Most of actual editors)
- the one without applying bidi algorithm. (The exact source file)
and if editors could permit to switch from one level to the other, it 
would be perfect.

Regards, Najib

PS: (For my part, I always wondered why don't other languages use 
scripts written right-to-left?  :-) )


> If you mean
> "fully apply the Unicode bidi algorithm and nothing else",
> then I'd clearly say NO. If you mean "something better
> than just the Unicode bidi algorith", my answer would be
> clearly YES, but depending on the material and on personal
> preferences, there may be several ways.
>
> Although there are several ways (an Arabic user who only
> occasionally views/reads Latin would want to view an
> XML document with mostly Arabic content and mostly Arabic
> element/attribute names in overall RTL mode,...), for the
> example above, what has been done e.g. at
> http://www.w3.org/International/its/techniques/its-techniques.html#AuthDir
> seems perfectly reasonable to me, although it requires a
> highly sophisticated editor that not only parses tags,
> but the whole element structure/nesting, and understands
> its:dir. You can see how that would work at
> http://www.sw.it.aoyama.ac.jp/cgi-bin/bidi-source-test
> if you change "its:dir" to "dir" and select markup language
> xhtml (rather than xml).
>
> Regards,     Martin.
>
>
Attachments

image/jpeg attachment: bidiSourceCode.jpg
Received on Tuesday, 11 September 2007 14:21:53 UTC