- From: Aharon (Vladimir) Lanin <aharon@google.com>
- Date: Wed, 7 Dec 2011 15:20:34 +0200
- To: public-texttracks@w3.org
- Message-ID: <CA+FsOYZ7+ucX+2FVW7snk3AiP4pe4LW5Zep5hTPTHBuzrzfHRg@mail.gmail.com>
The WebVTT cue text rendering rules currently require applying "the Unicode Bidirectional Algorithm's Paragraph Level steps" to a cue's text in order to "determine the paragraph embedding level of a cue", which is then used to determine the cue's direction (LTR or RTL), which in turn is used as the basis for the cue's alignment ("start" or "end").

These "Paragraph Level steps" (http://unicode.org/reports/tr9/#The_Paragraph_Level) start with a requirement to "split the text into separate paragraphs", where the "paragraphs are divided by the Paragraph Separator or appropriate Newline Function". This refers to the Unicode characters PS (U+2029), LF, CR, NEL (U+0085), and a few others.

Now, a cue's text is explicitly permitted to contain LF characters, which, in Unicode Bidirectional Algorithm terms, separate paragraphs; I imagine that other paragraph-separating characters are also allowed. So it would seem that a cue can contain several (bidi) paragraphs.

However, the WebVTT spec apparently does not want that to be the case, and goes on to say that the cue text "represents a single paragraph". It is this restriction that currently allows it to talk about "the paragraph embedding level of a cue" (emphasis on "the"), and thus the direction of a cue.

I find this specification problematic in several ways:

1. The Unicode Bidirectional Algorithm states that "Paragraphs may also be determined by higher-level protocols: for example, the text in two different cells of a table will be in different paragraphs." IMO, this allows a higher-level protocol - like WebVTT - to introduce paragraph boundaries besides those determined by the paragraph-separating characters. I am not at all sure that it allows a higher-level protocol to ignore paragraph boundaries already present in the cue text, which is implicit in WebVTT's insistence that a cue's text "represents a single paragraph". WebVTT can, of course, get rid of the paragraph-separating LFs (etc.)
by replacing them with other characters (e.g. LS, U+2028) before handing the text over to the algorithm, but the WebVTT spec does not say to do so.

2. The WebVTT spec is unclear on whether the direction determined for the cue is only to be used as a basis for alignment, or whether the cue text is actually to be *rendered* as a single paragraph in that direction. Please note that paragraph boundaries, and thus paragraph direction, are crucial to the correct display of bidirectional text. For example, let us consider the following text, where we represent RTL characters with uppercase Latin letters:

    THE FOOD WAS GOOD. HERE IS THE ADDRESS:
    50 main st.

I am assuming an LF between the two lines. The correct visual ordering for this text, as defined by "the Unicode Bidirectional Algorithm's Paragraph Level steps", is then (ignoring the issue of alignment):

    :SSERDDA EHT SI EREH .DOOG SAW DOOF EHT
    50 main st.

Note that the first line's colon is displayed on the left end. That's because the first line is an RTL paragraph (as determined by the "paragraph level steps", because it starts with an RTL character). Also note that the second line is an LTR paragraph. If the two lines were to be lumped into a single RTL paragraph, e.g. by replacing the LF with an LS, the result would be rather different:

    :SSERDDA EHT SI EREH .DOOG SAW DOOF EHT
    .main st 50

Is this what the WebVTT spec currently requires? Or does it just want the correct display above, except with both lines aligned the same way?

3. I believe that there are use cases that require allowing a cue to contain more than one (bidi) paragraph. For example, there at least used to be a widespread practice in Israel for Hebrew-language films to come with subtitles that gave the dialogue in both the original Hebrew and in English translation, simultaneously on separate lines.

For these reasons, I would suggest doing away with the concept of cue direction.
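
To make the distinction in point 2 concrete, here is a minimal Python sketch of how the base direction comes out when a cue is split into bidi paragraphs versus when it is treated as one paragraph. It is an illustration only, not text from the WebVTT spec or from UAX #9. The function names `split_bidi_paragraphs` and `first_strong_direction` are hypothetical, the Hebrew sample stands in for the uppercase placeholder text above, and the sketch simplifies the "Paragraph Level steps": it splits at every character of bidi class B (so a CR LF pair would yield an extra empty paragraph) and it ignores directional isolates.

```python
import unicodedata

def split_bidi_paragraphs(text):
    """Split text at every character whose bidi class is B (paragraph separator)."""
    paragraphs, current = [], []
    for ch in text:
        if unicodedata.bidirectional(ch) == "B":  # LF, CR, NEL, PS, ...
            paragraphs.append("".join(current))
            current = []
        else:
            current.append(ch)
    paragraphs.append("".join(current))
    return paragraphs

def first_strong_direction(paragraph, default="ltr"):
    """Simplified paragraph-level rule: the first L, R, or AL character decides."""
    for ch in paragraph:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return default  # no strong character found (assumed fallback)

# Hypothetical cue text: a Hebrew line ending in a colon, an LF, then an English line.
cue_text = "\u05d4\u05d0\u05d5\u05db\u05dc \u05d4\u05d9\u05d4 \u05d8\u05d5\u05d1:\n50 main st."

# Each bidi paragraph determines its own direction.
per_paragraph = [first_strong_direction(p) for p in split_bidi_paragraphs(cue_text)]

# Replacing LF with LS (U+2028) lumps both lines into a single paragraph.
single_paragraph = first_strong_direction(cue_text.replace("\n", "\u2028"))

print(per_paragraph)     # ['rtl', 'ltr']
print(single_paragraph)  # rtl
```

With the LF kept, the second line gets its own LTR direction; with the LF replaced by LS, the whole cue is RTL, which is what produces the ".main st 50" ordering shown above.
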
A cue should be allowed to contain many (bidi) paragraphs, with each paragraph determining its own direction.

So, what do we do with alignment? Well, we could simply allow "start" and "end" to align each paragraph independently. If that is problematic (and I am not sure that this is actually the case), we could redefine "start" and "end" to mean the start and end side of the first non-empty paragraph. And if we wanted the application to decide which way to do it, we could define additional alignment values, e.g. "first-start" and "first-end" in addition to "start" and "end".

Aharon
Received on Wednesday, 7 December 2011 13:21:24 UTC