Re: Roll-up captions in WebVTT from Simon Pieters on 2011-11-25 (public-texttracks@w3.org from November 2011)

From: Simon Pieters <simonp@opera.com>
Date: Fri, 25 Nov 2011 09:17:17 +0100
To: public-texttracks@w3.org, "Silvia Pfeiffer" <silviapfeiffer1@gmail.com>
Message-ID: <op.v5hta3uxidj3kv@simon-pieterss-macbook.local>
On Fri, 25 Nov 2011 05:51:48 +0100, Silvia Pfeiffer  
<silviapfeiffer1@gmail.com> wrote:

> Hi all,
>
> Whenever I get asked about how to implement roll-up captions in
> WebVTT, I have to make up some half-baked solution.
>
> I'd like us to come up with an improvement to WebVTT that takes proper
> care of this issue (and possibly other issues).
>
> In the "Caption Model" document at
> http://www.w3.org/community/texttracks/wiki/Caption_Model#4._Caption_Text_Block_Display
> we described what "roll-up captions" are: lines of text are drawn
> successively into the same text rendering box and removed from it,
> too.
>
> When trying to specify it in WebVTT, I usually suggest the following  
> approach:
>
> ==
> WEBVTT
>
> 1
> 00:01:07.395 --> 00:01:10.246
> Hey!
>
> 2
> 00:01:10.246 --> 00:01:17.000
> Hey!
> <c .vtt_blue>You there!</c>
>
> 3
> 00:01:17.000 --> 00:01:20.000
> <c .vtt_blue>You there!</c>
> What did you say?
> ==
>
> This will create the right rendering with text moving up over time and
> the top line disappearing, as long as the cues are positioned at the
> same video viewport location.
>
> However, there are several problems with this approach:
> * text that is unchanged has to be included multiple times (if 4 or 5
> lines are used, it may be repeated as often as 5 times)
> * all the markup on the text has to be repeated in every single cue
> * finally, it's not possible to address with a single CSS statement
> all the cues that relate to the same position - you only have the
> choice of all cues (with ::cue) or those of a particular id (with e.g.
> ::cue(#3)) or of a particular markup (e.g. ::cue(c), ::cue(c.vtt_blue),
> or ::cue(v[voice='fred'])).
>
>
> The issue:
> As I see it, the problem is that we don't currently represent the
> concept of cue text rendering boxes that persist over time. These are
> cues that are rendered in the same location but at different times
> along the video's timeline. We are not currently able to group such
> cues and identify them as being a "continuation", as belonging
> together.
>
> Or in other words: WebVTT doesn't currently have a concept that
> represents what CEA708 calls "windows" (see
> http://en.wikipedia.org/wiki/CEA-708#How_to_interpret_the_caption_stream,
> though the term "window" is not properly explained there;
> http://www.cpcweb.com/hdtv/708.htm may be more readable).
>
>
> Proposed solution:
> In discussions with others, we've come up with several means of
> introducing the concept of "rendering boxes" that persist over time.
>
> My favorite solution, and hereby my proposal, is to introduce a "class"
> markup on cues (rather than on fragments of cue text). This is
> motivated by CSS, which already has classes as a grouping mechanism
> for different rendering areas on the page; the proposal just extends
> this concept to the time dimension. Thus, it allows grouping of
> cues that belong together as a "continuation" of each other.
>
> For example:
>
> ===
> WEBVTT
>
> 1.rollup
> 00:01:07.395 --> 00:01:17.000
> Hey!
>
> 2.rollup
> 00:01:10.246 --> 00:01:20.000
> <c .vtt_blue>You there!</c>
>
> 3.rollup
> 00:01:17.000 --> 00:01:20.000
> What did you say?
> ===

I just want to bikeshed the syntax: I'd prefer it to be a cue setting
rather than part of the id line, for example:

1
00:01:07.395 --> 00:01:17.000 rollup:foo
Hey!

For ::cue() selector matching, the "List of WebVTT Node Objects" could
then carry a class defined by the "rollup" setting (only a single class
per cue).
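
To make the cue-setting idea concrete, here is a rough sketch of how a
parser might pull the class out of the timing line. It is only an
illustration: "rollup" is the setting proposed in this thread and not
existing WebVTT syntax, the function names are made up, and the sketch
assumes full hh:mm:ss.ttt timestamps and whitespace-separated
"name:value" settings.

```python
import re

# Hypothetical: "rollup" is the cue setting proposed in this thread,
# not part of any published WebVTT syntax.
TIMING_RE = re.compile(
    r"^(?P<start>\d{2}:\d{2}:\d{2}\.\d{3})\s+-->\s+"
    r"(?P<end>\d{2}:\d{2}:\d{2}\.\d{3})(?P<settings>.*)$"
)


def parse_timing_line(line):
    """Return (start, end, settings) for a cue timing line.

    settings maps each "name:value" token after the end time to its value.
    """
    match = TIMING_RE.match(line.strip())
    if match is None:
        raise ValueError("not a cue timing line: %r" % line)
    settings = {}
    for token in match.group("settings").split():
        name, sep, value = token.partition(":")
        # Ignore malformed tokens that lack a name, a colon, or a value.
        if sep and name and value:
            settings[name] = value
    return match.group("start"), match.group("end"), settings


def rollup_class(line):
    """Return the single rollup class of a cue, or None if it has none."""
    return parse_timing_line(line)[2].get("rollup")
```

With this, rollup_class("00:01:07.395 --> 00:01:17.000 rollup:foo")
gives "foo", and a timing line without the setting gives None, so the
id line stays free of any "." restriction.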

> The text that is added to a cue of the same class is added below
> (which is the normal scrolling behaviour of text). Thus, this markup
> has the same effect as the markup given above. But it has some massive
> advantages.
>
> Advantages:
> * text does not have to be repeated
> * markup of text does not have to be repeated
> * we can address the cues that belong in the same rendering box
> through one CSS statement: e.g. ::cue(.rollup)
> * when implementing this, only a cue with a new class (or no class)
> creates a new rendering area ("div")
> * the rendering continuation between cues can be upheld even when
> there are several other cues in the middle that don't belong to the
> same continuation
> * the rendering continuation between cues can be upheld even if the
> rendering area's cue settings change (e.g. if the rollup has to move
> from the bottom to the top of the viewport because there is some
> burnt-in text visible at the bottom of the screen that would be
> obstructed by the caption text)
>
> I can't really see any problems with this approach other than an extra
> restriction to the identifier parsing, which now cannot contain a "."
> any more. Did I miss anything?
>
> Cheers,
> Silvia.
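
As a sanity check that grouped cues give the same on-screen result as
the repeated-text version, here is a small sketch of what a renderer
might compute for one rendering box. The Cue record, the "foo" class
name and the function names are assumptions for illustration only; a
cue is treated as active from its start time up to, but not including,
its end time.

```python
from collections import namedtuple

# Hypothetical record: "group" is the cue's rollup class (None if absent).
Cue = namedtuple("Cue", "start end group text")


def to_seconds(timestamp):
    """Convert an hh:mm:ss.ttt timestamp to seconds."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def visible_lines(cues, group, t):
    """Return the text of the cues in `group` that are active at time t
    (in seconds), in start-time order: the contents of that rendering box."""
    active = [c for c in cues
              if c.group == group
              and to_seconds(c.start) <= t < to_seconds(c.end)]
    return [c.text for c in sorted(active, key=lambda c: to_seconds(c.start))]


# The three cues from the proposal, without the <c> markup.
CUES = [
    Cue("00:01:07.395", "00:01:17.000", "foo", "Hey!"),
    Cue("00:01:10.246", "00:01:20.000", "foo", "You there!"),
    Cue("00:01:17.000", "00:01:20.000", "foo", "What did you say?"),
]
```

Calling visible_lines(CUES, "foo", 72.0) yields "Hey!" followed by
"You there!", and at 78.0 it yields "You there!" followed by "What did
you say?", which matches cues 2 and 3 of the repeated-text example.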


-- 
Simon Pieters
Opera Software
Received on Friday, 25 November 2011 08:15:33 UTC