
Rework of Bidi inline article

From: Richard Ishida <ishida@w3.org>
Date: Tue, 15 Nov 2011 15:01:52 +0000
Message-ID: <4EC27EE0.80407@w3.org>
To: "Aharon (Vladimir) Lanin" <aharon@google.com>, Matitiahu Allouche <matial@il.ibm.com>
CC: "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Hi Aharon, Mati,

I have just done a first draft pass over the document 
http://www.w3.org/International/tutorials/new-bidi-xhtml/Overview-inline.en.php 
(in particular from here down: 
http://www.w3.org/International/tutorials/new-bidi-xhtml/Overview-inline.en.php#where).

Bearing in mind that this is a first pass, would you mind scanning it 
and letting me know whether you think I'm on the right track?  Hopefully 
it responds to the structure-related comments below. (I'm still planning 
to revisit some of the more detailed comments.)

btw, I haven't yet decided what to do with the section entitled "More 
examples".

Thanks!
RI


PS: Any thoughts on this: 
https://plus.google.com/103190014606131822578/posts/MgiwWfu3Rrt ?



On 28/10/2011 21:09, Richard Ishida wrote:
> I have begun a substantial reorganization and rewrite of the following
> section:
>
> http://www.w3.org/International/tutorials/new-bidi-xhtml/Overview-inline.en.php#where
>
>
>
> RI
>
>
>
> On 25/10/2011 17:58, Richard Ishida wrote:
>> Hi Aharon, and thanks for your comments. I was hoping to discuss with
>> you at the Unicode conf, but that wasn't to be, so here is a quick dash
>> at my thoughts (since I have to go out soon).
>>
>> I actually agree with pretty much everything you say, but the concern I
>> had was to do with Martin's previous post about the fact that these
>> things are not yet supported widely, and how to manage expectations in
>> that regard.
>>
>> Even where implementation is there (e.g. for dir=auto on Chrome, although
>> not <bdi> as far as I can tell!), it will be some time before the new
>> constructs can be relied upon on their own, due to legacy browser usage
>> (esp. IE8).
>>
>> My original thought was to 'cordon off' the new stuff into its own
>> section with a big disclaimer, so that it is clear that this stuff
>> doesn't work quite yet, and then merge it in to the mainstream gradually
>> as support increases.
>>
>> However, I think you might be right that we should integrate from the
>> start. The challenge will be to do so in a way that makes it clear to
>> the reader what currently works and what doesn't.
>>
>> That said, I'm still a little worried about the legacy aspect of this.
>>
>> I've seen a few places in my own pages where I'm inclined to add
>> dir=auto or bdi right now, but I know that I will still need to use
>> rlm/lrm as well for at least a couple of years to cater for the IE8
>> corporate legacy.
>>
>> Using both will be messy, for explanation as well as for content
>> authoring.
>>
>> I'm wondering whether a way around this is to use CSS. For example, in a
>> LTR page or context, the CSS rule
>>
>> bdi:before { content: '\200E '; }
>>
>> will cause
>>
>> <p>The names of these states in Arabic are <bdi>مصر</bdi>,
>> <bdi>البحرين</bdi> and <bdi>الكويت</bdi> respectively.</p>
>>
>> to display as expected, even if bdi is not supported.
>>
>> I suspect we may need to distinguish between cases, such as input
>> fields, where the rlm/lrm is not appropriate (because it doesn't help),
>> and situations like the example above, where it can help (either for bdi
>> or dir=auto).
>>
>> Actually, the CSS should probably be genericised to say something like,
>> if the direction of the parent element is RTL use rlm, and vice versa,
>> but I think that that capability too is only now being introduced.
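
A rough sketch of that generalisation, done when the markup is generated rather than in CSS. This is an assumption for illustration, not something proposed in this thread: choose the mark from the direction of the parent context and emit it in front of each <bdi>, as the bdi:before rule does for an LTR context. The names LRM, RLM, mark_for and isolate are hypothetical.

```python
import html

LRM = "\u200e"  # LEFT-TO-RIGHT MARK
RLM = "\u200f"  # RIGHT-TO-LEFT MARK


def mark_for(parent_dir):
    """Return the directional mark that matches the parent's direction."""
    if parent_dir not in ("ltr", "rtl"):
        raise ValueError("parent_dir must be 'ltr' or 'rtl'")
    return RLM if parent_dir == "rtl" else LRM


def isolate(phrase, parent_dir="ltr"):
    """Wrap a phrase in <bdi>, preceded by a fallback mark.

    The mark is there for browsers that ignore <bdi>; it takes the
    direction of the surrounding text, not of the phrase itself.
    """
    return f"{mark_for(parent_dir)}<bdi>{html.escape(phrase)}</bdi>"
```

In an LTR sentence, isolate("مصر") produces an LRM followed by <bdi>مصر</bdi>; in an RTL sentence, isolate("W3C", "rtl") produces an RLM followed by <bdi>W3C</bdi>. As noted above, this kind of fallback would not help for input fields.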
>>
>> What do you think?
>>
>> RI
>>
>>
>>
>> On 14/10/2011 13:04, Aharon (Vladimir) Lanin wrote:
>>> I think that the bdi element and the idea of isolation should appear
>>> much earlier in the article, long before unknown direction. Basically,
>>> when you introduce <span dir=...> in "A simple solution" (after "Nesting
>>> base direction"), you should also mention that HTML5 defines a new
>>> element, <bdi>, that should be preferred over <span> for this purpose,
>>> once browsers start to support it, because it also isolates the nested
>>> phrase from its surroundings, thus preventing it influencing their
>>> display. You can say that there are examples coming up.
>>>
>>> "Adjacent, same-direction directional runs that are incorrectly ordered"
>>> is an excellent example for the use of <bdi>. I think you should take
>>> out the sentence "Putting markup around the comma is a bit like cracking
>>> an egg with a hammer in this case." I think that mark-up generally is
>>> the preferred solution, when it states something that makes sense. As I
>>> will explain below, enclosing the comma in a <span dir=ltr> makes no
>>> sense, and should not even be mentioned, since it will not work. On the
>>> other hand, enclosing each of the RTL items in the list (but not the
>>> commas or spaces between them) in a <bdi dir=rtl> makes perfect sense,
>>> i.e.:
>>>
>>> The names of these states in Arabic are <bdi dir="rtl">مصر</bdi>, <bdi
>>> dir="rtl">البحرين</bdi> and <bdi dir="rtl">الكويت</bdi> respectively.
>>>
>>> You can say that in this example, the dir="rtl"s actually don't change
>>> anything, and in fact that just the first <bdi> is sufficient to fix the
>>> problem, but there is nothing wrong with marking every embedded
>>> opposite-direction phrase in a <bdi> - it won't hurt, and will often
>>> prevent problems.
>>>
>>> As I said before, putting a <span dir=ltr> around the comma does not
>>> make sense, and should not be mentioned at all. Why specifically the
>>> comma, and not, say the space next to it? Furthermore, a <span dir=...>
>>> is an /embedding/ - which is not really true for the comma: it's a part
>>> of the enclosing LTR sentence, not a piece of LTR embedded within - i.e.
>>> a part of - some RTL. In fact, putting the <span dir=ltr> around the
>>> comma puts the comma in the wrong place when there is no space between
>>> it and the RTL text preceding it.
>>>
>>> In "More examples", the Hebrew "W3C ... ERCIM" examples should really
>>> start with "ה-" immediately before the "W3C", i.e. the desired output
>>> should be:
>>> ה-W3C‏ (World Wide Web Consortium) מעביר את שירותי הארחה באירופה ל -
>>> ERCIM.
>>>
>>> This too is actually a great place to use <bdi>:
>>>
>>> ה-<bdi dir="ltr">W3C</bdi> (<bdi dir="ltr">World Wide Web
>>> Consortium</bdi>) מעביר את שירותי הארחה באירופה ל-<bdi
>>> dir="ltr">ERCIM</bdi>.
>>>
>>> Once again, you don't actually need the dir="ltr" on any of these, and
>>> just the first or second <bdi> will be sufficient alone to fix the
>>> problem, but in principle the safe way to write this sentence is as
>>> above.
>>>
>>> I think that the <bdi> solution - once it is available in browsers - is
>>> preferable to using &rlm;, because it makes intuitive sense. You simply
>>> mark the embedded opposite-direction phrases, each one on its own. Until
>>> someone actually understands the Unicode Bidirectional Algorithm (UBA) -
>>> which very few people do - using LRM and RLM seems like voodoo. Few
>>> people know when they should use LRM and when they should use RLM, and
>>> where exactly they should put it.
>>>
>>> IMO, the same applies to all the other examples in this section. The
>>> best way to deal with them, when it becomes available, is <bdi dir=ltr>
>>> (or just <bdi>, because of dir=auto, but we don't have to mention that
>>> yet), not an LRM, and not <span dir=ltr>.
>>>
>>> In "Handling unknown text", if you are looking for a real RTL book title
>>> that contains some LTR word(s), but does not begin with them (so that
>>> dir=auto will work well with it), there is
>>> http://books.google.com/books?id=05syOwAACAAJ:
>>>
>>>
>>> מבוא לתכנות בסביבת אינטרנט - מבוא ו- HTML
>>>
>>> Please note that the Google Books page has a bug: the title as displayed
>>> at the top of the page is always in the direction of the UI. However,
>>> the title displayed near the bottom of the page, after "Title:", is
>>> displayed using the word-count direction estimation algorithm. It gets
>>> this book title right.
>>>
>>> Furthermore, please note that when I used Google Books' Advanced Search
>>> to look for Hebrew-language books containing one of the words HTML, CSS,
>>> and JavaScript, the majority of the book titles I found /began/ with the
>>> LTR word, so dir=auto's first-strong algorithm does not work well on
>>> them. I had tried to push through word-count for dir=auto, but failed to
>>> convince people. Examples:
>>>
>>> http://books.google.com/books?id=IU83OgAACAAJ
>>> http://books.google.com/books?id=_qAlOgAACAAJ
>>> http://books.google.com/books?id=_-gSKQEACAAJ
>>>
>>> For this reason, I think it is worthwhile to tone down the statement
>>> that "There are some rare corner cases where this may not give the
>>> desired outcome, but in the majority of cases it should produce the
>>> expected result." I would take out the words "some rare", and you could
>>> also add on "particularly when the embedded text does not mix LTR and
>>> RTL words and the problem is limited to things like trailing
>>> punctuation, leading numbers, and phone numbers."
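
To make the difference between the two estimation approaches concrete, here is a minimal sketch in Python. It is a simplification under stated assumptions: first_strong looks only at plain text (the real dir=auto also has to deal with markup), and word_count is a plain majority vote over whitespace-separated words, not the algorithm Google Books actually uses. The function names are hypothetical, and the second title in the usage note is made up from words in the real one.

```python
import unicodedata


def char_direction(ch):
    """Return 'ltr' or 'rtl' for a strong character, None otherwise."""
    bidi = unicodedata.bidirectional(ch)
    if bidi == "L":
        return "ltr"
    if bidi in ("R", "AL"):
        return "rtl"
    return None  # digits, punctuation, spaces and other non-strong types


def first_strong(text, default="ltr"):
    """Direction of the first strongly directional character in text."""
    for ch in text:
        direction = char_direction(ch)
        if direction is not None:
            return direction
    return default


def word_count(text, default="ltr"):
    """Majority direction over words; each word votes by its first strong char."""
    votes = {"ltr": 0, "rtl": 0}
    for word in text.split():
        direction = first_strong(word, default=None)
        if direction is not None:
            votes[direction] += 1
    if votes["rtl"] > votes["ltr"]:
        return "rtl"
    if votes["ltr"] > votes["rtl"]:
        return "ltr"
    return default
```

For the title quoted above, "מבוא לתכנות בסביבת אינטרנט - מבוא ו- HTML", both functions return "rtl". For a made-up title that starts with the LTR word, such as "HTML - מבוא לתכנות", first_strong returns "ltr" while word_count returns "rtl", which is the mismatch described for titles that begin with HTML, CSS or JavaScript.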
>>>
>>> On Thu, Oct 13, 2011 at 8:09 PM, Richard Ishida
>>> <ishida@w3.org> wrote:
>>>
>>> On 19/09/2011 16:04, [Mati] wrote:
>>>
>>> http://www.w3.org/International/tutorials/new-bidi-xhtml/qa-html-dir.php
>>>
>>> 11) In section "Using dir="auto" with the input element", the first
>>> Hebrew word of the example is not known to me and is probably a
>>> typo. I can't even guess what the intended word was.
>>>
>>>
>>> On 20/09/2011 09:38, [Mati] wrote:
>>>
>>> http://www.w3.org/International/tutorials/new-bidi-xhtml/Overview-inline.en.php
>>>
>>> DON'T show email on public list.
>>>
>>> Name: Matitiahu Allouche
>>> Email: matial@il.ibm.com
>>>
>>>
>>> Comments:
>>> This is the continuation of comments that I sent in a previous
>>> submission.
>>>
>>> 18) In section "Second use case", the first Hebrew word of the
>>> book title differs between its mention in the body of the text
>>> and its mention in the message. The form in the message is the
>>> correct one.
>>>
>>>
>>>
>>> I think I was trying to use the title of the article at
>>> http://www.w3.org/International/questions/qa-css-charset.he.php
>>> (though why that's different, I'm not sure). But at the time I only
>>> grabbed that quickly because I was in a hurry.
>>>
>>> Would you or Aharon be able to provide me with a real book title
>>> that has similar properties (i.e. ending with CSS or some such)?
>>> (Maybe one of these?
>>> http://www.google.com/search?q=CSS3&btnG=Search+Books&tbm=bks&tbo=1)
>>>
>>> Cheers,
>>>
>>> RI
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Richard Ishida
>>> Internationalization Activity Lead
>>> W3C (World Wide Web Consortium)
>>>
>>> http://www.w3.org/International/
>>> http://rishida.net/
>>>
>>>
>>
>

-- 
Richard Ishida
Internationalization Activity Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/
Received on Tuesday, 15 November 2011 15:02:39 GMT
