RE: New First Public Working Draft: Additional Requirements for Bidi in HTML from CE Whitehead on 2010-03-12 (public-i18n-bidi@w3.org from January to March 2010)

From: CE Whitehead <cewcathar@hotmail.com>
Date: Fri, 12 Mar 2010 15:16:33 -0500
To: <public-i18n-bidi@w3.org>
Message-ID: <SNT142-w50F5EAD379BC164A0F40DEB3310@phx.gbl>
 

Hi.  Regarding my previous message:

From: CE Whitehead <cewcathar@hotmail.com> 
Date: Thu, 11 Mar 2010 20:49:51 -0500
Message-ID: <SNT142-w109C33CB7056082F09437EB3310@phx.gbl> 
To: <public-i18n-bidi@w3.org> 

Sorry, I finally realized that the problem with my page came from my following the directive to set the overall direction at the top--

IE worked out the bidi beautifully for the English when the overall directionality was not specified; the page's rtl directionality had been inherited, and that was why the English displayed so strangely.
I don't know why I thought there was a bug.
It is just that I set the directionality automatically and then forgot about its influence.

I was perplexed for so long because I had set the language tags to en on everything;
sorry to trouble the list with that.
IE performs fine as far as directionality goes, so far as I can tell; my mistake.
(Also, sorry: I guess R. I. is just one of the authors of this, not the only one.)


Regarding now my comments on section 3.3:
Text separated by a block embedded in it is not necessarily meant to be treated, bidi-wise, as separate runs of text.
I embed lists, paragraphs, etc. inside of divs, and I may or may not want the text that comes before and after to be separated.  (Sometimes I do; but with lists particularly I may not.  So if there are neutral characters following an embedded block, their directionality should be inferred from the directionality of the text preceding the block, when both the preceding text and the following text have mixed strong rtl and ltr characters and there is no significant difference in the number of rtl versus ltr characters in the second run.  Would this work?)

1.2
For example, "10 main st." is displayed in RTL as
.main st 10
{COMMENT: to make this clearer}
you should perhaps add:
=>
"instead of the intended 'main st. 10'"
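
A small Python sketch may help show why the period moves. It lists the Unicode bidi class of each character in the example, using only the standard unicodedata module; the helper name bidi_classes is my own and does not come from the draft. The trailing period is not a strong character, so in an RTL context it takes the surrounding (RTL) direction and ends up at the far left.

```python
import unicodedata

def bidi_classes(text):
    """Return (character, Unicode bidi class) pairs for each character."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

address = "10 main st."
for ch, cls in bidi_classes(address):
    # Digits are EN (European number), letters are L (strong LTR),
    # spaces are WS, and the period is CS (a weak separator, not strong).
    print(repr(ch), cls)
```

Only the letters are strong LTR characters here; the digits, spaces, and final period all depend on their context for their resolved direction.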
* * *
2.1

The Problem par 9 last sentence
"This is, in fact, what WebKit currently does (although it is now being treated as a bug). "
{ COMMENT +1}
* * *
2.1

From: Najib Tounsi <ntounsi@gmail.com> 
Date: Fri, 05 Mar 2010 16:56:29 +0000
To: public-i18n-bidi@w3.org 
I agree that LRM and RLM need definitions.
> §2.1
> "The Problem
> Most documents contain a large number of self-contained entities whose 
> content must not influence the directional rendering of what precedes or 
> follows them."
> I would see "must not BE influenceD BY the directional rendering of what 
> precedes or follows them.", since we expect "such an entity to be 
> displayed visually between what precedes it and what follows it", and so 
> not to be influenced by them. No?
{ COMMENT on LANGUAGE/PROOFREADING:  Najib's comment is right, but in some cases a description or summary in a particular language should also not influence the directional rendering of surrounding content; so 'must not influence' is sometimes correct, but, yes, generally 'must not be influenced by' is the main thing. }
* * *
2.1
par 10
"To avoid this problem, IE apparently re-opens the directional embedding levels specified on ancestor elements via mark-up (dir attribute, <bdo> element) or CSS up to the closest ancestor block element after closing them at a <br> paragraph break. On the other hand, it does not reopen the directional embedding levels stemming from surrounding LRE/RLE/LRO/RLO and PDF characters. "
{COMMENT on CONTENT:  I like IE's solution.
On the other hand, I have got to think about the proposed solution.  I do like having an element with markup instead of rlm and lrm characters, however--
markup is going to help facilitate searching and text-matching; you all are right in that.}
* * *
2.2 Support auto-direction
Fwd: Re: FPWD of Additional Requirements for Bidi in HTML
From: Tab Atkins Jr. <jackalmage@gmail.com> 
> Section 2.2: I highly doubt exposing the estimation algorithm to the
> author is useful here.  I am 100% certain that virtually no author
> will understand the significant differences between them, and be able
> to provide an informed decision on the appropriate value.
> Auto-detection is good (especially when combined with @bdi to limit
> the damage), but all you should have to do is say 'auto'.
{COMMENTS on CONTENT:  I disagree with Mr. Atkins here; I prefer dir-auto-estimated_by_wordcount or something similar.
I think web users are savvy and need to be made aware of underlying processes whenever possible
(this was the first time I saw the estimation algorithm, but it made sense; and people who type alternately in ltr and rtl contexts no doubt wonder about the overall directionality, about how an app decides which way to move the cursor, and so on).
I think we increase the digital divide if we do not name the way the directionality was calculated for a dir-auto specification.}
* * *
2.3 last par
Then, if the user typed in the LTR value "hello", the submission URL would be "foo?mytest=hello&mytest_dir=ltr".
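
A rough Python sketch of how such a submission URL could be assembled, using urllib.parse.urlencode from the standard library. The function submission_url and its parameters are hypothetical names for illustration only. The optional algorithm argument is not part of the draft; it shows one way the estimation method could be carried along with the direction.

```python
from urllib.parse import urlencode

def submission_url(action, name, value, direction, algorithm=None):
    """Build a GET submission URL carrying a field value and its direction.

    `algorithm` is a hypothetical extra, not in the draft: when given, it is
    prefixed to the direction so the receiver can see how it was estimated.
    """
    dir_value = direction if algorithm is None else f"auto-{algorithm}:{direction}"
    query = urlencode({name: value, f"{name}_dir": dir_value})
    return f"{action}?{query}"

print(submission_url("foo", "mytest", "hello", "ltr"))
print(submission_url("foo", "mytest", "hello", "ltr", algorithm="first-strong"))
```

Note that urlencode percent-encodes the colon, so any combined value of this kind would arrive as "auto-first-strong%3Altr" and need decoding on the server side.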

{COMMENT:  I still want to see how the direction was auto-calculated} 
=> ?
 "foo?mytest=hello&mytest_dir=auto-?NAMEOFALGORITHM: ltr". 
* * *
Open Issues
"What dir attribute values should be used to specify the word-count and first-string estimation algorithms? One possibility would be simply "word-count" and "first-strong". Or should they both start with the word "auto", i.e "auto-word-count" and "auto-first-strong". "
{COMMENT on CONTENT: auto-word-count and auto-first-strong for sure, just as we have auto color, etc.  Thanks.}


"Is it really truly essential to support both the word-count and the first-strong algorithm? Using just one algorithm would reduce confusion; there would be just one new dir attribute value, the easy-to-understand "auto". "


{COMMENT on CONTENT: I think in some applications, in some situations, first-strong will prove more useful and in others word-count more useful.  In addition, you have no data, absolutely none, saying when or how often one or the other is more useful; such data has to be gathered, and I see no reason not to support both
methods while it is gathered.  I suspect that one algorithm will always work in some cases and the other in other cases. }
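
To make the difference between the two estimates concrete, here is a minimal Python sketch. It is my own simplification, not the draft's definition: first_strong returns the direction of the first strongly directional character, and word_count takes a majority vote in which each whitespace-separated word votes by its own first strong character. The function names, the tie-breaking default, and the voting rule are all assumptions.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}   # strong right-to-left bidi classes
LTR_CLASSES = {"L"}         # strong left-to-right bidi class

def first_strong(text):
    """Direction of the first strongly directional character, or None."""
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in RTL_CLASSES:
            return "rtl"
        if cls in LTR_CLASSES:
            return "ltr"
    return None

def word_count(text, default="ltr"):
    """Majority vote over words, each voting by its first strong character.

    Assumption: ties and all-neutral text fall back to `default`.
    """
    votes = {"ltr": 0, "rtl": 0}
    for word in text.split():
        direction = first_strong(word)
        if direction is not None:
            votes[direction] += 1
    if votes["rtl"] > votes["ltr"]:
        return "rtl"
    if votes["ltr"] > votes["rtl"]:
        return "ltr"
    return default

# "W3C" followed by two Hebrew words (written with escapes to keep the source LTR).
sample = "W3C \u05e9\u05dc\u05d5\u05dd \u05e2\u05d5\u05dc\u05dd"
print(first_strong(sample), word_count(sample))
```

For this sample the two estimates disagree: the first strong character is Latin, but two of the three words are Hebrew. Cases like this are where I would want to know which method was used.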

* * *
2.2
"Path or URL that includes consecutive RTL folder or file names (one would expect the path components to proceed in a uniform direction) "
{COMMENT on CONTENT: I am not sure, but personally:  if the whole URL is turned around with the http to the right of the window,
then I support having the files ordered right-to-left;
otherwise I would expect to see the files themselves displayed in order left-to-right but the rtl names in them right-to-left--
but I have no real experience looking at IDNs.  This is just a personal preference.}
* * *
 
3.3:
"Include among them the paragraph element, <p>. It seems reasonable to expect the insertion of a paragraph to break the text before it and the text after it into two UBA paragraphs.
" . . .
"Proposed solution
"Elements with block display should be specified as introducing a UBA paragraph break between the text preceding and following them."
{ COMMENT on CONTENT:  I think that anything surrounded by markup is separate by default, and I agree that bdi="yes" is the best default solution for any inline element.  However, since paragraphs can be embedded in other paragraphs and divs, I do not agree that the text before is by default a separate run from the text after, not unless there is some evidence that it is (for example, strong directional characters present in one run that are not present in the other).  If there are neutral characters in the text following an embedded block, these should first take their directionality from any explicitly declared directionality, second from the surrounding characters following the embedded block, and third from the text preceding the embedded block.}
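
The three-step fallback I have in mind could be sketched in Python roughly as follows. This is only an illustration under my own assumptions: the names strong_direction and neutral_direction are hypothetical, and each run of text is represented simply by the direction of its first strong character, which is a simplification of what a real implementation would do.

```python
import unicodedata

def strong_direction(text):
    """Direction of the first strong character in `text`, or None."""
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in ("R", "AL"):
            return "rtl"
        if cls == "L":
            return "ltr"
    return None

def neutral_direction(declared, following, preceding):
    """Pick a direction for neutral characters that follow an embedded block.

    Priority: (1) an explicitly declared direction, (2) strong characters in
    the text following the block, (3) strong characters in the text before it.
    """
    candidates = (declared, strong_direction(following), strong_direction(preceding))
    for candidate in candidates:
        if candidate is not None:
            return candidate
    return None  # nothing to go on; the caller would apply its own default

# No declared direction, only punctuation after the block, Hebrew before it.
print(neutral_direction(None, "...", "\u05e9\u05dc\u05d5\u05dd"))
```

In this example the text after the block has no strong characters of its own, so the neutrals fall back to the direction of the text before the block.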
* * *
3.3
"Proposed solution"
par 2
" in accordince with "
{COMMENT on PROOFREADING:  spelling}
=> "in accordance with"
* * *
3.3
"Proposed solution"
par 3
"The text inside an elements "
{COMMENT on PROOFREADING:  agreement error; 'an' takes a singular noun}
=> "an element"


* * *
"The text inside an elements with inline-block display should constitute a UBA paragraph, but probably should not introduce a UBA paragraph break. Instead, it can default to bdi="yes". This and the other types of display should be given more thorough investigation. "
{COMMENT on CONTENT:  yes; this is the best solution for this, and it is the best default for embedded inline elements too if there is an rtl attribute set.  For embedded block elements I still think the directionality of the text that follows can be inferred from the directionality of the text that precedes it if there is need.  But if there are neutral characters in the text following an embedded block, these should first take their directionality from any explicitly declared directionality, second from the surrounding characters following the embedded block, and third from the text preceding the embedded block.}
* * *
Best,
--C. E. Whitehead
cewcathar@hotmail.com
Received on Friday, 12 March 2010 20:17:06 UTC