- From: Aharon (Vladimir) Lanin <aharon@google.com>
- Date: Thu, 8 Dec 2011 21:05:03 +0200
- To: "Phillips, Addison" <addison@lab126.com>
- Cc: Kent Karlsson <kent.karlsson14@telia.com>, "public-texttracks@w3.org" <public-texttracks@w3.org>, "public-i18n-bidi@w3.org" <public-i18n-bidi@w3.org>
- Message-ID: <CA+FsOYaKGO0xJTzr4YrM7h443iiz3KG8sNmgUbzOezrHa7Z_DQ@mail.gmail.com>
Yes. I was not suggesting LRE, RLE, and PDF because these are not really
human-usable. Even LRM and RLM push the envelope.

On Thu, Dec 8, 2011 at 7:21 PM, Phillips, Addison <addison@lab126.com> wrote:

> Blar! I read the characters incorrectly. The marks are not sequence
> forming, and Kent is correct.
>
> Addison
>
> Addison Phillips
> Globalization Architect (Lab126)
> Chair (W3C I18N WG)
>
> Internationalization is not a feature.
> It is an architecture.
>
> *From:* Kent Karlsson [mailto:kent.karlsson14@telia.com]
> *Sent:* Thursday, December 08, 2011 8:58 AM
> *To:* Phillips, Addison; Aharon (Vladimir) Lanin;
> public-texttracks@w3.org; public-i18n-bidi@w3.org
> *Subject:* Re: WebVTT bidi: can we have &lrm; and &rlm; escapes?
>
> These marks, no, they are not terminated by anything. They are
> freestanding.
>
> LRE, LRO, RLE, and RLO are terminated (by PDF), since they do start a
> "span", but the marks don't.
>
> /Kent K
>
> On 2011-12-08 17:48, "Phillips, Addison" <addison@lab126.com> wrote:
>
> You need a third character: U+202C (PDF). Sequences starting with RLM or
> LRM are terminated using this character. See:
> http://www.w3.org/International/questions/qa-bidi-controls.en.php
>
> Addison
>
> *From:* Aharon (Vladimir) Lanin [mailto:aharon@google.com]
> *Sent:* Thursday, December 08, 2011 12:43 AM
> *To:* public-texttracks@w3.org; public-i18n-bidi@w3.org
> *Subject:* WebVTT bidi: can we have &lrm; and &rlm; escapes?
>
> The WebVTT spec currently allows just three escapes: &lt;, &gt;, and
> &amp;. Authors are expected to enter any other characters directly by
> whatever other means they have at their disposal.
>
> I would like to suggest that an exception is needed for two more
> characters, LRM and RLM. These are invisible characters with strong
> directionality, LTR for one and RTL for the other. These are used in bidi
> text in two ways:
>
> - At the start of a paragraph, one of these can be used to indicate the
> paragraph's overall directionality in contexts where the directionality is
> determined by the paragraph's first character with strong direction. This
> is the default method of determining paragraph direction specified by the
> Unicode Bidirectional Algorithm - and the *only* method allowed by the
> current WebVTT spec. It is important to note that RTL languages fairly
> often use "words" spelled in LTR characters, e.g. acronyms like GPS and
> HTML (and WebVTT), as well as brand names. Occasionally, these occur as the
> first word in a sentence or even a paragraph, and when this is the case,
> the overall directionality of the paragraph is set incorrectly, unless one
> puts an RLM at the beginning of the paragraph.
>
> - In bidi text, these characters provide some means of control over the
> visual ordering of the characters. For example, to get "Mamma Mia!" to come
> out that way - and not as "!Mamma Mia" - in RTL text, one can put an LRM
> after the exclamation mark. In HTML, there are other means of such control,
> such as wrapping opposite-direction phrases in <span dir=...> or in a <bdi>
> element. But such means are absent in WebVTT.
>
> There are several reasons that I think an exception should be made for
> these characters and escapes provided for them in WebVTT:
>
> 1. As mentioned above, WebVTT does not provide any means for controlling
> paragraph directionality or inline directionality explicitly. Thus, the
> author has no means but LRM and RLM for such control in a WebVTT file.
>
> 2. LRM and RLM are invisible. Entering invisible characters and editing
> text that already contains them is confusing.
>
> 3. The existing standard Hebrew and Arabic keyboards do not provide a
> means of generating an actual LRM or RLM. Although the Windows native
> TextBox control provides a context menu that allows inserting various
> special characters including LRM and RLM, and Microsoft Notepad uses
> TextBox and thus provides the same context menu, most reasonable text
> editors available on Windows (e.g. Notepad++) are not based on TextBox and
> do not provide a means for generating LRM and RLM. The same, as far as I
> know, is true for Linux (e.g. gedit).
>
> Aharon
Received on Thursday, 8 December 2011 19:06:03 UTC
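
The first-strong rule discussed in the original message can be illustrated with a small sketch. This is not WebVTT or browser code, only an approximation under stated assumptions: `first_strong_direction` and `decode_proposed_escapes` are hypothetical helper names, the escape table contains just the two escapes proposed in the thread, and the direction check ignores explicit embeddings, overrides, and isolates.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK: invisible, strong LTR
RLM = "\u200f"  # RIGHT-TO-LEFT MARK: invisible, strong RTL

# Hypothetical table holding only the two escapes proposed in the thread.
PROPOSED_ESCAPES = {"&lrm;": LRM, "&rlm;": RLM}


def decode_proposed_escapes(text):
    """Replace the proposed &lrm; and &rlm; escapes with the actual marks."""
    for escape, char in PROPOSED_ESCAPES.items():
        text = text.replace(escape, char)
    return text


def first_strong_direction(paragraph):
    """Return 'ltr' or 'rtl' based on the first strongly directional character.

    Simplified sketch: explicit embeddings, overrides, and isolates are
    not handled.
    """
    for ch in paragraph:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return "ltr"  # no strong character found: fall back to LTR


# A Hebrew cue that happens to start with a Latin-script acronym.
cue = "GPS \u05d4\u05d5\u05d0 \u05db\u05dc\u05d9 \u05e0\u05d9\u05d5\u05d5\u05d8"

print(first_strong_direction(cue))  # ltr (wrong for a Hebrew sentence)
print(first_strong_direction(decode_proposed_escapes("&rlm;" + cue)))  # rtl
```

The same prefix could of course be typed as a literal U+200F; the escape only makes the invisible character visible and editable in the source file, which is the point of reasons 2 and 3 in the message.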