Re: WebVTT bidi: can we have &lrm; and &rlm; escapes?

From: Aharon (Vladimir) Lanin <aharon@google.com>
Date: Wed, 4 Jan 2012 19:01:46 +0200
Message-ID: <CA+FsOYZ_yuayBOPmrg=diNwyzdeD1QFHP9SY5KUj-+s51rBQug@mail.gmail.com>
To: "Phillips, Addison" <addison@lab126.com>
Cc: Kent Karlsson <kent.karlsson14@telia.com>, "public-texttracks@w3.org" <public-texttracks@w3.org>, "public-i18n-bidi@w3.org" <public-i18n-bidi@w3.org>
ping?

On Thu, Dec 8, 2011 at 9:05 PM, Aharon (Vladimir) Lanin
<aharon@google.com> wrote:

> Yes. I was not suggesting LRE, RLE, and PDF because these are not really
> human-usable. Even LRM and RLM push the envelope.
>
>
> On Thu, Dec 8, 2011 at 7:21 PM, Phillips, Addison <addison@lab126.com> wrote:
>
>> Blar! I read the characters incorrectly. The marks are not sequence
>> forming, and Kent is correct.
>>
>> Addison
>>
>> Addison Phillips
>> Globalization Architect (Lab126)
>> Chair (W3C I18N WG)
>>
>> Internationalization is not a feature.
>> It is an architecture.
>>
>> *From:* Kent Karlsson [mailto:kent.karlsson14@telia.com]
>> *Sent:* Thursday, December 08, 2011 8:58 AM
>> *To:* Phillips, Addison; Aharon (Vladimir) Lanin;
>> public-texttracks@w3.org; public-i18n-bidi@w3.org
>> *Subject:* Re: WebVTT bidi: can we have &lrm; and &rlm; escapes?
>>
>> ** **
>>
>> These marks, no, they are not terminated by anything. They are
>> freestanding.
>>
>> LRE, LRO, RLE, and RLO are terminated (by PDF), since they do start a
>> "span", but the marks don't.
>>
>>     /Kent K
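
A small sketch of the distinction Kent draws, assuming only the code points themselves: LRE, RLE, LRO, and RLO open a span that PDF (U+202C) closes, while LRM and RLM open nothing. The function name is hypothetical, and this is a lint-style check rather than an implementation of the Unicode Bidirectional Algorithm.

```python
# Characters that start an embedding or override "span".
OPENERS = {"\u202a": "LRE", "\u202b": "RLE", "\u202d": "LRO", "\u202e": "RLO"}
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING, closes the most recent opener


def unterminated_embeddings(text):
    """Return the names of openers in text that have no matching PDF."""
    stack = []
    for ch in text:
        if ch in OPENERS:
            stack.append(OPENERS[ch])
        elif ch == PDF and stack:
            stack.pop()
    return stack


# An RLE without a PDF is left open; an RLM never opens anything.
print(unterminated_embeddings("\u202babc"))        # ['RLE']
print(unterminated_embeddings("\u202babc\u202c"))  # []
print(unterminated_embeddings("\u200fabc"))        # []
```
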
>>
>>
>> On 2011-12-08 17:48, "Phillips, Addison" <addison@lab126.com> wrote:
>>
>> You need a third character: U+202C (PDF). Sequences starting with RLM or
>> LRM are terminated using this character. See:
>> http://www.w3.org/International/questions/qa-bidi-controls.en.php
>>
>> Addison
>>
>> Addison Phillips
>> Globalization Architect (Lab126)
>> Chair (W3C I18N WG)
>>
>> Internationalization is not a feature.
>> It is an architecture.
>>
>>
>>
>>
>> *From:* Aharon (Vladimir) Lanin [mailto:aharon@google.com]
>> *Sent:* Thursday, December 08, 2011 12:43 AM
>> *To:* public-texttracks@w3.org; public-i18n-bidi@w3.org
>> *Subject:* WebVTT bidi: can we have &lrm; and &rlm; escapes?
>>
>>
>> The WebVTT spec currently allows just three escapes: &lt;, &gt;, and
>> &amp;. Authors are expected to enter any other characters directly by
>> whatever other means they have at their disposal.
>>
>>
>>
>> I would like to suggest that an exception is needed for two more
>> characters, LRM and RLM. These are invisible characters with strong
>> directionality, LTR for one and RTL for the other. These are used in bidi
>> text in two ways:
>>
>>
>>
>> - At the start of a paragraph, one of these can be used to indicate the
>> paragraph's overall directionality in contexts where the directionality is
>> determined by the paragraph's first character with strong direction. This
>> is the default method of determining paragraph direction specified by the
>> Unicode Bidirectional Algorithm - and the *only* method allowed by the
>> current WebVTT spec. It is important to note that RTL languages fairly
>> often use "words" spelled in LTR characters, e.g. acronyms like GPS and
>> HTML (and WebVTT), as well as brand names. Occasionally, these occur as the
>> first word in a sentence or even a paragraph, and when this is the case,
>> the overall directionality of the paragraph is set incorrectly, unless one
>> puts an RLM at the beginning of the paragraph.
>>
>>
>>
>> - In bidi text, these characters provide some means of control over the
>> visual ordering of the characters. For example, to get "Mamma Mia!" to come
>> out that way - and not as "!Mamma Mia" - in RTL text, one can put an LRM
>> after the exclamation mark. In HTML, there are other means of such control,
>> such as wrapping opposite-direction phrases in <span dir=...> or in a <bdi>
>> element. But such means are absent in WebVTT.
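
A minimal sketch of the first-strong rule from the first bullet, assuming Python's standard unicodedata module as the source of bidi classes. The function name and the sample cue text are illustrative only and do not come from the WebVTT spec, and the sketch ignores the finer points of the Unicode Bidirectional Algorithm.

```python
import unicodedata

RLM = "\u200f"  # RIGHT-TO-LEFT MARK, bidi class R


def first_strong_direction(text):
    """Return 'ltr' or 'rtl' from the first strong character, else None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return None


# A Hebrew cue that happens to start with a Latin acronym (made-up sample).
cue = "GPS \u05d6\u05de\u05d9\u05df"
print(first_strong_direction(cue))        # ltr: the acronym wins
print(first_strong_direction(RLM + cue))  # rtl: the leading RLM wins
```

The second bullet works the same way at the character level: the "Mamma Mia!" fix is the string "Mamma Mia!" followed by "\u200e" (LRM), so that the exclamation mark sits between two strong LTR characters.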
>>
>>
>>
>> There are several reasons that I think an exception should be made for
>> these characters and escapes provided for them in WebVTT:
>>
>>
>>
>> 1. As mentioned above, WebVTT does not provide any means for controlling
>> paragraph directionality or inline directionality explicitly. Thus, the
>> author has no means but LRM and RLM for such control in a WebVTT file.
>>
>>
>>
>> 2. LRM and RLM are invisible. Entering invisible characters and editing
>> text that already contains them is confusing.
>>
>>
>>
>> 3. The existing standard Hebrew and Arabic keyboards do not provide a
>> means of generating an actual LRM or RLM. Although the Windows native
>> TextBox control provides a context menu that allows inserting various
>> special characters including LRM and RLM, and Microsoft Notepad uses
>> TextBox and thus provides the same context menu, most reasonable text
>> editors available on Windows (e.g. Notepad++) are not based on TextBox and
>> do not provide a means for generating LRM and RLM. The same, as far as I
>> know, is true for Linux (e.g. gedit).
>>
>>
>>
>> Aharon
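
For illustration, the proposal in the original message amounts to two more entries in the cue-text escape table. The sketch below assumes a hypothetical helper, unescape_cue_text, and is not the parsing algorithm from the WebVTT spec; it only shows the five escapes being decoded in a single pass, so that "&amp;lt;" yields the text "&lt;" and not "<".

```python
import re

# The three escapes the spec allows today, plus the two proposed ones.
ESCAPES = {
    "&lt;": "<",
    "&gt;": ">",
    "&amp;": "&",
    "&lrm;": "\u200e",  # proposed: LEFT-TO-RIGHT MARK
    "&rlm;": "\u200f",  # proposed: RIGHT-TO-LEFT MARK
}
_ESCAPE_RE = re.compile("|".join(re.escape(name) for name in ESCAPES))


def unescape_cue_text(text):
    """Replace each known escape once, scanning left to right."""
    return _ESCAPE_RE.sub(lambda match: ESCAPES[match.group(0)], text)


decoded = unescape_cue_text("&rlm;GPS &amp; HTML")
print(decoded == "\u200fGPS & HTML")  # True
```

With this, an author can type the visible text "&rlm;" in any editor and still get an RLM in the rendered cue.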
>>
>
>
Received on Wednesday, 4 January 2012 17:02:38 GMT
