Re: WebVTT bidi: should a cue be allowed to contain more than one paragraph? from Aharon (Vladimir) Lanin on 2011-12-08 (public-i18n-bidi@w3.org from October to December 2011)

From: Aharon (Vladimir) Lanin <aharon@google.com>
Date: Thu, 8 Dec 2011 09:10:27 +0200
To: Simon Pieters <simonp@opera.com>
Cc: public-texttracks@w3.org, public-i18n-bidi@w3.org
Message-ID: <CA+FsOYYSFQmnFPa4_O7MTR_DO3SosTd67TkB+JHbM4GS6b-x2g@mail.gmail.com>
Even if my particular use-case (simultaneous Hebrew and English subtitles)
is discounted, one still has to ask the question of why WebVTT explicitly
allows LFs in a cue, while at the same time explicitly barring the LF from
doing its normal job of separating the text into bidi paragraphs. Is
there anything preventing a cue from having English text before an LF and
French afterwards? Why then effectively prohibit switching from English to
Hebrew (or vice-versa)?
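
To make the distinction concrete, here is a minimal Python sketch of the two readings. It is an illustration only, not what the WebVTT spec or any implementation does. The function names split_bidi_paragraphs and paragraph_level are hypothetical, the level rule is a simplified form of UAX #9 rules P2 and P3 (first strong character only, ignoring isolates and embeddings), and a CR LF pair is not collapsed into one separator as the full algorithm requires.

```python
import unicodedata


def split_bidi_paragraphs(text):
    """Split text at every character whose Bidi_Class is B.

    Class B covers LF, CR, NEL (U+0085) and PS (U+2029), among others.
    LS (U+2028) has class WS, so it does not end a paragraph.
    """
    paragraphs, current = [], []
    for ch in text:
        if unicodedata.bidirectional(ch) == "B":
            paragraphs.append("".join(current))
            current = []
        else:
            current.append(ch)
    paragraphs.append("".join(current))
    return paragraphs


def paragraph_level(paragraph):
    """Simplified P2/P3: 1 (RTL) if the first strong character is R or AL,
    0 (LTR) if it is L or if there is no strong character at all."""
    for ch in paragraph:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):
            return 1
        if bidi_class == "L":
            return 0
    return 0


# A cue with a Hebrew line, an LF, and an English line.
cue = "\u05e9\u05dc\u05d5\u05dd\nHello"

# Reading A: the LF separates paragraphs, each with its own level.
per_paragraph_levels = [paragraph_level(p) for p in split_bidi_paragraphs(cue)]

# Reading B: the cue is forced into one paragraph, here by swapping LF for LS.
single_paragraph = cue.replace("\n", "\u2028")
single_level = paragraph_level(single_paragraph)
```

Under reading A the Hebrew line is an RTL paragraph and the English line an LTR one; under reading B the English line inherits the RTL level of the Hebrew line that precedes it.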

And even if we come to the conclusion that this is justified, we still have
the issues of ambiguity / inconsistency in the current spec, as I described
in points 1 and 2 in the original message.

Aharon

On Wed, Dec 7, 2011 at 4:22 PM, Simon Pieters <simonp@opera.com> wrote:

> On Wed, 07 Dec 2011 14:20:34 +0100, Aharon (Vladimir) Lanin <
> aharon@google.com> wrote:
>
>> The WebVTT cue text rendering rules currently require applying "the Unicode
>> Bidirectional Algorithm's Paragraph Level steps" to a cue's text in order
>> to "determine the paragraph embedding level of a cue", which is then used
>> to determine the cue's direction (LTR or RTL), which is then used as the
>> basis for the cue's alignment ("start" or "end"). These "Paragraph Level
>> steps" (http://unicode.org/reports/tr9/#The_Paragraph_Level) start with a
>> requirement to "split the text into separate paragraphs", where the
>> "paragraphs are divided by the Paragraph Separator or appropriate Newline
>> Function". This is referring to the Unicode characters PS (U+2029), LF, CR,
>> NEL (U+0085), and a few others. Now, a cue's text is explicitly permitted
>> to contain LF characters, which, in Unicode Bidirectional Algorithm terms,
>> separate paragraphs; I imagine that other paragraph-separating characters
>> are also allowed. So, it would seem then that a cue can contain several
>> (bidi) paragraphs.
>>
>> However, the WebVTT spec apparently does not want that to be the
>> case, and goes on to say that the cue text "represents a single paragraph".
>> It is this restriction that currently allows it to talk about "the
>> paragraph embedding level of a cue" (emphasis on "the"), and thus the
>> direction of a cue.
>>
>> I find this specification problematic in several ways:
>>
>> 1. The Unicode Bidirectional Algorithm states that "Paragraphs may also be
>> determined by higher-level protocols: for example, the text in two
>> different cells of a table will be in different paragraphs." IMO, this
>> allows a higher level protocol - like WebVTT - to introduce paragraph
>> boundaries besides those determined by the paragraph-separating characters.
>> I am not at all sure that it allows a higher-level protocol to ignore
>> paragraph boundaries already present in the cue text, which is implicit in
>> WebVTT's insistence that a cue's text "represents a single paragraph".
>> WebVTT can, of course, get rid of the paragraph-separating LFs (etc.) by
>> replacing them with other characters (e.g. LS, U+2028) before handing the
>> text over to the algorithm, but the WebVTT spec does not say to do so.
>>
>> 2. The WebVTT spec is unclear on whether the direction determined for the
>> cue is only to be used as a basis for alignment, or if the cue text is
>> actually to be *rendered* as a single paragraph in that direction. Please
>> note that paragraph boundaries, and thus paragraph direction, are crucial
>> to the correct display of bidirectional text. For example, let us consider
>> the following text, where we represent RTL characters with uppercase Latin
>> letters:
>>
>> THE FOOD WAS GOOD. HERE IS THE ADDRESS:
>> 50 main st.
>>
>> I am assuming an LF between the two lines. The correct visual ordering for
>> this text, as defined by "the Unicode Bidirectional Algorithm's Paragraph
>> Level steps" is then (ignoring the issue of alignment):
>>
>> :SSERDDA EHT SI EREH .DOOG SAW DOOF EHT
>> 50 main st.
>>
>> Note that the first line's colon is displayed on the left end. That's
>> because the first line is an RTL paragraph (as determined by the "paragraph
>> level steps" because it starts with an RTL character). Also note that the
>> second line is an LTR paragraph. If the two lines were to be lumped into a
>> single RTL paragraph, e.g. by replacing the LF with an LS,
>> the result would be rather different:
>>
>> :SSERDDA EHT SI EREH .DOOG SAW DOOF EHT
>> .main st 50
>>
>> Is this what the WebVTT spec currently requires? Or does it just want the
>> correct display above, except with both lines aligned the same way?
>>
>> 3. I believe that there are use cases that require allowing a cue to
>> contain more than one (bidi) paragraph. For example, there at least used to
>> be a widespread practice in Israel for Hebrew-language films to come with
>> subtitles that gave the dialogue in both the original Hebrew and in English
>> translation, simultaneously on separate lines.
>>
>
> The two languages don't need to be in the same cue. They don't even need
> to be in the same file. You could have one track for Hebrew, another for
> English, and enable both.
>
> Currently WebVTT does not support declaring language at all, although
> there have been discussions to add it (per file, per block, per cue and/or
> intra cue). Use cases presented in this area might inform both the
> direction and language issues.
>
>
>> For these reasons, I would suggest doing away with the concept of cue
>> direction. A cue should be allowed to contain many (bidi) paragraphs, and
>> each paragraph to determine its own direction. So, what do we do with
>> alignment? Well, we could simply allow "start" and "end" to align each
>> paragraph independently. If that is problematic (and I am not sure that
>> this is actually the case), we could re-define "start" and "end" to mean
>> the start and end side of the first non-empty paragraph. And if we wanted
>> the application to decide which way to do it, we could define additional
>> alignment values, e.g. "first-start" and "first-end" in addition to "start"
>> and "end".
>>
>> Aharon
>>
>
>
> --
> Simon Pieters
> Opera Software
>
Received on Thursday, 8 December 2011 07:11:28 UTC