Re: Roll-up captions in WebVTT

On Dec 19, 2011, at 0:10 , Gal Klein wrote:

> Hi All,
> 
> As I mentioned in previous emails, we have been doing live captioning for online video for a while now.
> We never use roll-up captions, as they make the video very difficult to follow (many articles have stated that roll-up captions obscure the video).

Well, that's a different issue.  We used to have to put captions over the video, because that was the only screen real estate there was on a TV.  On the web, there are choices (e.g. placing the captions below or above the video).

The classic example is sports: the top third of the picture is sky, the middle third is the stands and spectators, and the action takes place in the bottom third.  And where do we classically put captions? In the bottom third, on top of the action :-(


> Pop-up captions can be transmitted in real time with just some basic logic that can be built into an online captioning tool.
> I will be happy to explore this with the team and also show you examples (like the WSJ front page, 4 times a day). Improved synchronization is also possible with some additional effort, and we are working on implementing this as well.
> 
> Best,
> 
> Gal
> 
> 
> -----Original Message-----
> From: Glenn Maynard [mailto:glenn@zewt.org] 
> Sent: Monday, December 19, 2011 4:11 AM
> To: Silvia Pfeiffer
> Cc: David Singer; public-texttracks@w3.org
> Subject: Re: Roll-up captions in WebVTT
> 
> On Fri, Dec 16, 2011 at 7:43 PM, Glenn Maynard <glenn@zewt.org> wrote:
>> It's hard to do with live captions, since you can end up in situations 
>> where you don't have any good place on screen to put a caption.  It'd 
>> be interesting to try this sort of captioning with "live" captions (eg.
>> captions without carefully-edited timing information and other 
>> tweaks), and see what actually happens, though.  Maybe I'm assuming it 
>> doesn't work well because of existing practice, when it's actually a solvable problem.
> 
> To follow up on this, now that I know what's actually going on: the real reason for roll-up captions is that live captions are not only added in realtime, they're edited in realtime.  Text is added to the end of a cue as the transcriber types, and it might (at least in principle) be edited in other ways.  That's something that pop-on captions simply can't deal with, since you'd end up putting a caption on screen, then having it require more space than it has.
> Roll-up captions just roll the whole thing up to make a new empty line.
> 
> In retrospect this seems obvious, but I figured I should mention it since it didn't occur to me and nobody corrected me.
> 
> (Of course, while the API can probably handle this, representing this sort of thing in the WebVTT file format itself is well out of scope; WebVTT cue blocks are parsed atomically.  If someone wants to support realtime captioning on the web, they'll need to define a protocol to transmit partial cues and to handle other types of in-place edits, like EIA-608 very rudimentarily does.)
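
[Illustrative sketch: to make the roll-up behaviour described above concrete, here is a minimal model of a caption window that accepts realtime edits.  The class and method names (RollUpWindow, append, backspace, newline) are hypothetical; nothing here is taken from WebVTT, EIA-608, or any existing API.  It only shows why appending to the bottom line and rolling up never runs out of room.]

```python
from collections import deque


class RollUpWindow:
    """Fixed-height caption window (hypothetical model, not a real API)."""

    def __init__(self, rows=3):
        # The deque keeps at most `rows` lines; the last entry is the
        # bottom line that the transcriber is currently typing into.
        self.lines = deque([""], maxlen=rows)

    def append(self, text):
        # Realtime typing: extend the bottom line in place.
        self.lines[-1] += text

    def backspace(self, count=1):
        # Realtime correction: drop the last `count` characters.
        line = self.lines[-1]
        self.lines[-1] = line[: max(0, len(line) - count)]

    def newline(self):
        # Roll up: adding an empty bottom line pushes the oldest line
        # out once the window is full (deque maxlen does the discard).
        self.lines.append("")

    def visible(self):
        return list(self.lines)


window = RollUpWindow(rows=2)
window.append("Hello wor")
window.append("ld")        # text arrives in pieces as it is typed
window.newline()
window.append("second")
window.newline()           # "Hello world" rolls out of the window
window.append("third")
print(window.visible())    # ['second', 'third']
```

[A real protocol would also need timing, addressing of earlier lines, and error recovery; this sketch covers only the append/roll-up step.]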
> 
> 
> On Sun, Dec 18, 2011 at 7:01 PM, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:
>> I don't think you can do rollup as a preference - how would you do 
>> that? I think you have to provide two different files with different 
>> markup for people to choose from if you want to support both means.
> 
> For just rendering as roll-up (lines appear at the bottom, pushing old lines up and out), I don't see the problem.  The markup specifies the contents; the user preference determines the mode of rendering.
> 
> Fully defining this would be a fair bit of work, and there are questions that would need to be answered (eg. what do you do with captions that specify the location on screen), but there's nothing requiring this to be a markup feature.
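
[Illustrative sketch: one way to read "the markup specifies the contents; the user preference determines the mode of rendering" is that the same cue list feeds two different display functions.  The cue tuples, the render() function, and the mode names below are assumptions made for illustration, not anything defined by WebVTT, and the open questions mentioned above (such as cues with explicit positions) are ignored.]

```python
def render(cues, t, mode="pop-on", rows=2):
    """Return the caption lines visible at time t (seconds).

    cues: list of (start, end, text) tuples; a hypothetical stand-in
    for parsed cues, with no positioning information.
    """
    if mode == "pop-on":
        # Show only the cues that are active at time t.
        return [text for start, end, text in cues if start <= t < end]
    if mode == "roll-up":
        # Show the most recent `rows` cues that have started by time t;
        # newer lines appear at the bottom and push older ones out.
        started = [text for start, _end, text in sorted(cues) if start <= t]
        return started[-rows:]
    raise ValueError(f"unknown rendering mode: {mode!r}")


cues = [
    (0.0, 2.0, "Hello."),
    (2.0, 4.0, "How are you?"),
    (4.0, 6.0, "Fine, thanks."),
]

print(render(cues, 4.5, "pop-on"))   # ['Fine, thanks.']
print(render(cues, 4.5, "roll-up"))  # ['How are you?', 'Fine, thanks.']
```

[The cue data is identical in both calls; only the mode argument, standing in for the user preference, changes.]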
> 
>> I believe it may be a cultural issue whether you prefer one style or 
>> the other: in the US, rollup seems more natural and your examples all 
>> seem to be from Japan, so I assume there it's more natural to swap out 
>> lines. So, as an author, you'd create the captions in the appropriate 
>> way for each language.
> 
> The subtitles I'm reading are written for English speakers, very often by Americans, and they're all "pop-on".  Bitmap subtitles on DVDs (the form used by most movies) didn't even support roll-up (from what I recall when I implemented a decoder many years back).  All soft captions on YouTube and Netflix that I've seen are pop-on.  All hard captions I can find on YouTube are pop-on (found randomly: Indian:
> http://www.youtube.com/watch?v=achG2zZTxbA; Scandinavian:
> http://www.youtube.com/watch?v=d6-yP7Kh6fY#t=2m).  I strongly suspect the same for Blu-ray releases, but I don't own any to check.  I see pop-on all over the place, and I can't even remember the last time I saw roll-up captions.
> 
> I think you're mistaken about roll-up captions being "more natural" or more common in the US.  I don't think this is a cultural issue at all, and having to create separate tracks to represent each possible user preference (never mind the combinatoric explosion this would lead to) would be a very bad way forward.
> 
> --
> Glenn Maynard
> 
> 

David Singer
Multimedia and Software Standards, Apple Inc.

Received on Monday, 19 December 2011 16:05:25 UTC