
RE: Additional Requirements for Bidi in HTML - Sections 3.4 and 3.5

From: CE Whitehead <cewcathar@hotmail.com>
Date: Thu, 25 Mar 2010 20:45:15 -0400
Message-ID: <SNT142-w253D2AED3934F92B4C6119B3230@phx.gbl>
To: <public-i18n-bidi@w3.org>


Hi.

For the record, my browser (IE8) does assume an ltr context for mixed text,
so an explicit dir attribute is often needed in a mixed context.

 

I do have questions about $VAR (where the three letters VAR stand for rtl characters).

I suppose that, since programming languages are ltr,
having $VAR display as $RAV (with the $ that marks it as a variable to the left of the rtl text),
while VAR$ displays as RAV$ (with the $ to the right of the rtl text), is intentional
and needed in the context of ltr programming code.
(Am I right?)


So I don't think the basic way of handling these should be changed.
(The source code should be processed in typing order, right?)

I know this is getting off-topic,
and I won't be writing code with Arabic variable names in any case, as I personally have no need to at this point;
also, this problem can be avoided by writing Romanized Arabic text, which is all ltr.

 

However, if you continue
in an rtl context, writing

([1],[2],[3] indicate the order in which the characters were typed; these numbers were not actually typed)

$[1]RTL[1] $[2]RTL[2] $[3]RTL[3]

you get

$[1] LTR[3]$[3] LTR[2]$[2] LTR[1]
I think.
For the following I typed $sxr !sxr .sxr
(where s is a pharyngealized s; I'm sorry, I have no Arabic keypad, so I had no choice of things to type;
all I could do was paste):
$صخر !صخر .صخر
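
To see why the $, ! and . behave as they do, it may help to look at the bidi classes of the characters involved. The short Python sketch below only queries the Unicode character database through the standard unicodedata module. The word used is an assumption on my part (the three Arabic letters sad, kha, ra, matching the "sxr" transliteration), and SAMPLE and bidi_classes are names made up for this illustration.

```python
import unicodedata

# Assumption: the pasted word was the Arabic letters sad, kha, ra ("sxr").
WORD = "\u0635\u062e\u0631"
SAMPLE = f"${WORD} !{WORD} .{WORD}"


def bidi_classes(text):
    """Return (code point, bidi class) pairs for each character, in typing order."""
    return [(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch)) for ch in text]


for code_point, bidi_class in bidi_classes(SAMPLE):
    print(code_point, bidi_class)
```

The $ is reported as ET (European number terminator), the ! as ON (other neutral), the . as CS (common separator), the space as WS, and the Arabic letters as AL. None of the punctuation is strongly directional, so its placement depends on what surrounds it.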

 

Alas, I cannot override this with a dir=rtl attribute around the whole thing (not with an IE8 browser anyway).
(Inspired by Najib's many mixed-directionality examples, I decided to see for myself what happens with the bidi algorithm for mixed text.)

 

I realize that European number terminators, in the context of anything but European numbers, are resolved to neutral and then to the directionality of the adjoining text --
(http://www.unicode.org/reports/tr9/proposed.html

3.3.3, 3.3.4

"Neutral types are now resolved one level run at a time. At level run boundaries where the type of the character on the other side of the boundary is required, the type assigned to sor or eor is used.
The next phase resolves the direction of the neutrals. The results of this phase are that all neutrals become either R or L. Generally, neutrals take on the direction of the surrounding text. In case of a conflict, they take on the embedding direction.")

-- which clearly must be ltr for the characters typed,
even when the page starts with rtl text in another paragraph -- I guess because there are intervening ltr characters.
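
As a rough illustration of the neutral-resolution text quoted above, here is a minimal Python sketch. It is a toy model and not the full algorithm in the report. It assumes a single paragraph with no explicit embeddings or overrides and no numbers, it treats $ and the space as already resolved to neutral (N), and it handles each rtl word as one atomic token, so the letters inside a word are not reversed. The names resolve_neutrals and visual_order are made up for this example.

```python
def resolve_neutrals(classes, base):
    """Give each run of 'N' the direction of its neighbours, or the base direction on conflict.

    classes: list of 'L', 'R' or 'N' in typing order; base: 'L' or 'R'.
    The paragraph edges (sor/eor) are taken to have the base direction.
    """
    resolved = list(classes)
    n = len(resolved)
    i = 0
    while i < n:
        if resolved[i] != "N":
            i += 1
            continue
        j = i
        while j < n and resolved[j] == "N":
            j += 1
        before = resolved[i - 1] if i > 0 else base
        after = resolved[j] if j < n else base
        direction = before if before == after else base
        for k in range(i, j):
            resolved[k] = direction
        i = j
    return resolved


def visual_order(tokens, classes, base):
    """Return the tokens in left-to-right display order under this toy model."""
    resolved = resolve_neutrals(classes, base)
    opposite = "R" if base == "L" else "L"
    out = []
    i = 0
    while i < len(tokens):
        j = i
        while j < len(tokens) and resolved[j] == resolved[i]:
            j += 1
        run = tokens[i:j]
        # Runs against the base direction are reversed once here.
        out.extend(reversed(run) if resolved[i] == opposite else run)
        i = j
    # In an rtl paragraph the whole line is reversed as a final step.
    return out if base == "L" else out[::-1]


tokens = ["$", "RTL1", " ", "$", "RTL2", " ", "$", "RTL3"]
classes = ["N", "R", "N", "N", "R", "N", "N", "R"]
print("".join(visual_order(tokens, classes, "L")))  # $RTL3$ RTL2$ RTL1
```

In an ltr paragraph the first $ sits between the paragraph start (L) and an rtl word, so it falls back to the embedding direction and stays on the left. The later $ signs sit between two rtl words, become R, and end up to the right of the word they were typed before. This says nothing about what IE8 does internally; it only shows what the quoted rules predict under the stated assumptions.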

 

But as I said, even bracketing this with a span dir=rtl did not fix it.

 

So I do strongly support having a bidi isolate for cases such as this.

 

Best,

C. E. Whitehead
cewcathar@hotmail.com

 		 	   		  
Received on Friday, 26 March 2010 00:45:49 GMT
