- From: Kent Karlsson <kent.karlsson14@telia.com>
- Date: Thu, 08 Dec 2011 17:58:29 +0100
- To: "Phillips, Addison" <addison@lab126.com>, "Aharon (Vladimir) Lanin" <aharon@google.com>, "public-texttracks@w3.org" <public-texttracks@w3.org>, "public-i18n-bidi@w3.org" <public-i18n-bidi@w3.org>
- Message-ID: <CB06AB45.1C40B%kent.karlsson14@telia.com>
These marks, no, they are not terminated by anything. They are freestanding.
LRE, LRO, RLE, and RLO are terminated (by PDF), since they do start a "span",
but the marks don't.

/Kent K

On 2011-12-08 17:48, "Phillips, Addison" <addison@lab126.com> wrote:

> You need a third character: U+202C (PDF). Sequences starting with RLM or LRM
> are terminated using this character. See:
> http://www.w3.org/International/questions/qa-bidi-controls.en.php
>
> Addison
>
> Addison Phillips
> Globalization Architect (Lab126)
> Chair (W3C I18N WG)
>
> Internationalization is not a feature.
> It is an architecture.
>
> From: Aharon (Vladimir) Lanin [mailto:aharon@google.com]
> Sent: Thursday, December 08, 2011 12:43 AM
> To: public-texttracks@w3.org; public-i18n-bidi@w3.org
> Subject: WebVTT bidi: can we have &lrm; and &rlm; escapes?
>
> The WebVTT spec currently allows just three escapes: &lt;, &gt;, and &amp;.
> Authors are expected to enter any other characters directly by whatever other
> means they have at their disposal.
>
> I would like to suggest that an exception is needed for two more characters,
> LRM and RLM. These are invisible characters with strong directionality, LTR
> for one and RTL for the other. They are used in bidi text in two ways:
>
> - At the start of a paragraph, one of these can be used to indicate the
> paragraph's overall directionality in contexts where the directionality is
> determined by the paragraph's first character with strong direction. This is
> the default method of determining paragraph direction specified by the Unicode
> Bidirectional Algorithm - and the *only* method allowed by the current WebVTT
> spec. It is important to note that RTL languages fairly often use "words"
> spelled in LTR characters, e.g. acronyms like GPS and HTML (and WebVTT), as
> well as brand names. Occasionally, these occur as the first word in a sentence
> or even a paragraph, and when this is the case, the overall directionality of
> the paragraph is set incorrectly, unless one puts an RLM at the beginning of
> the paragraph.
>
> - In bidi text, these characters provide some means of control over the visual
> ordering of the characters. For example, to get "Mamma Mia!" to come out that
> way - and not as "!Mamma Mia" - in RTL text, one can put an LRM after the
> exclamation mark. In HTML, there are other means of such control, such as
> wrapping opposite-direction phrases in <span dir=...> or in a <bdi> element.
> But such means are absent in WebVTT.
>
> There are several reasons that I think an exception should be made for these
> characters and escapes provided for them in WebVTT:
>
> 1. As mentioned above, WebVTT does not provide any means for controlling
> paragraph directionality or inline directionality explicitly. Thus, the author
> has no means but LRM and RLM for such control in a WebVTT file.
>
> 2. LRM and RLM are invisible. Entering invisible characters and editing text
> that already contains them is confusing.
>
> 3. The existing standard Hebrew and Arabic keyboards do not provide a means of
> generating an actual LRM or RLM. Although the Windows native TextBox control
> provides a context menu that allows inserting various special characters
> including LRM and RLM, and Microsoft Notepad uses TextBox and thus provides
> the same context menu, most reasonable text editors available on Windows (e.g.
> Notepad++) are not based on TextBox and do not provide a means for generating
> LRM and RLM. The same, as far as I know, is true for Linux (e.g. gedit).
>
> Aharon
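
For illustration only, here is a minimal Python sketch of the first-strong
paragraph rule described in the quoted message, and of how a leading RLM
changes the result. It is a simplification (it ignores embeddings and other
explicit formatting characters), and the helper names
first_strong_direction and expand_proposed_escapes are hypothetical, not
part of any WebVTT implementation. The escape names are the ones proposed in
the subject line.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK, bidi class L
RLM = "\u200f"  # RIGHT-TO-LEFT MARK, bidi class R


def expand_proposed_escapes(text):
    """Replace the proposed &lrm; and &rlm; escapes with the real characters."""
    return text.replace("&lrm;", LRM).replace("&rlm;", RLM)


def first_strong_direction(text):
    """Return 'ltr' or 'rtl' from the first strong character, else None.

    Simplified: only looks at bidi classes L, R and AL, and does not skip
    over embedded or isolated runs.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return None


# A Hebrew sentence that happens to start with a Latin acronym.
hebrew_word = "\u05e9\u05dc\u05d5\u05dd"
cue = "GPS " + hebrew_word

print(first_strong_direction(cue))                                   # ltr
print(first_strong_direction(expand_proposed_escapes("&rlm;" + cue)))  # rtl
```

Note that neither mark needs a closing character here: each is a single
freestanding character, unlike LRE, RLE, LRO and RLO, which open a span that
PDF closes.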
Received on Thursday, 8 December 2011 16:59:16 UTC