Re: Roll-up captions in WebVTT from Silvia Pfeiffer on 2011-12-19 (public-texttracks@w3.org from December 2011)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Mon, 19 Dec 2011 22:18:38 +1100
To: Glenn Maynard <glenn@zewt.org>
Cc: David Singer <singer@apple.com>, public-texttracks@w3.org
Message-ID: <CAHp8n2nxR7fYkytPV17ZDPBPcZHmWQgQNOS+ObFVSV2BWRrdOw@mail.gmail.com>

On Mon, Dec 19, 2011 at 1:11 PM, Glenn Maynard <glenn@zewt.org> wrote:
> On Fri, Dec 16, 2011 at 7:43 PM, Glenn Maynard <glenn@zewt.org> wrote:
>> It's hard to do with live captions, since you can end up in situations where
>> you don't have any good place on screen to put a caption.  It'd be
>> interesting to try this sort of captioning with "live" captions (eg.
>> captions without carefully-edited timing information and other tweaks), and
>> see what actually happens, though.  Maybe I'm assuming it doesn't work well
>> because of existing practice, when it's actually a solvable problem.
>
> To follow up to this, now that I know what's actually going on: The
> real reason for roll-up captions is that live captions are not only
> added in realtime, they're edited in realtime.  Text is added to the
> end of a cue as the transcriber types, and they might (at least in
> principle) be edited in other ways.  That's something that pop-on
> captions simply can't deal with, since you'd end up putting a caption
> on screen, then having it end up requiring more space than it has.
> Roll-up captions just roll the whole thing up to make a new empty
> line.

While both of these are issues for live captioning, they are separate
features. Not all rollup captions have editing included and editing
could in theory be done on pop-up captions, too, though that's not
traditionally been the case.

> (Of course, while the API can probably handle this, representing this
> sort of thing in the WebVTT file format itself is well out of scope;

Why? I don't see any reasons for excluding use cases. I'm really keen
to use WebVTT as the only generic captioning format for all needs.


> WebVTT cue blocks are parsed atomically.  If someone wants to support
> realtime captioning on the web, they'll need to define a protocol to
> transmit partial cues and to handle other types of in-place edits,
> like EIA-608 very rudimentarily does.)

Not necessarily. Ian invented a <redoline> (redo-line) tag in this bug
https://www.w3.org/Bugs/Public/show_bug.cgi?id=14104 . That could be
used for rollup as well as popup captions and it is editing done
through markup.


> On Sun, Dec 18, 2011 at 7:01 PM, Silvia Pfeiffer
> <silviapfeiffer1@gmail.com> wrote:
>> I don't think you can do rollup as a preference - how would you do
>> that? I think you have to provide two different files with different
>> markup for people to choose from if you want to support both means.
>
> For just rendering as roll-up (lines appear at the bottom, pushing old
> lines up and out), I don't see the problem.

What would the markup look like such that you could render it as
either rollup or popup? Do you have example markup that could be
rendered either way using just UA settings?

My suggestion to solve this problem was to have a "class" on cues that
would group cues together such that they can be rendered as rollup
(see my first email in this thread). It is minimal additional markup
and would indeed allow rollup or popup captions to be created from the
same content. The rendering could then be changed through preferences.
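
To illustrate the grouping idea, here is a rough sketch in Python
rather than in WebVTT markup. It is not proposed syntax: the `group`
field, the `mode` names and the `max_lines` limit are all assumptions
made up for this example, and the cues are assumed to be sorted by
start time.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cue:
    start: float
    end: float
    text: str
    group: Optional[str] = None  # hypothetical grouping label, not WebVTT syntax


def visible_lines(cues: List[Cue], t: float, mode: str = "popon",
                  max_lines: int = 3) -> List[str]:
    """Return the lines shown at time t, top to bottom."""
    active = [c for c in cues if c.start <= t < c.end]
    if mode == "popon" or not active:
        # Pop-on: only the cues that are active right now are shown.
        return [c.text for c in active]
    current = active[-1]
    if current.group is None:
        # Ungrouped cues have nothing to push up, so they render as pop-on.
        return [c.text for c in active]
    # Roll-up: earlier cues of the same group stay on screen above the
    # current one, and the oldest lines drop off the top.
    history = [c for c in cues if c.group == current.group and c.start <= t]
    return [c.text for c in history[-max_lines:]]


cues = [
    Cue(0, 2, "Hello", "a"),
    Cue(2, 4, "world", "a"),
    Cue(4, 6, "again", "a"),
    Cue(6, 8, "more", "a"),
]
popon = visible_lines(cues, 3, mode="popon")
rollup = visible_lines(cues, 3, mode="rollup")
```

The same cue list feeds both modes; only the `mode` argument, which
stands in for a user preference, changes what ends up on screen.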


> Fully defining this would be a fair bit of work,

I don't think so.

> and there are
> questions that would need to be answered (eg. what do you do with
> captions that specify the location on screen),

The proposal that I had works for captions no matter where they are
positioned. If they are in the same position, rollup pushes them up.
If they are in different locations, there is no previously rendered
text to push up.
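
The positioning rule can be sketched the same way. This is only an
illustration under assumptions: cues are given as (text, position)
pairs in display order, the position is an opaque label, and
`max_lines` (at least 1) is an invented limit on how many lines a
position keeps.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def roll_up_by_position(cues: List[Tuple[str, str]],
                        max_lines: int = 3) -> Dict[str, List[str]]:
    """Return the lines left on screen at each position, top to bottom."""
    regions: Dict[str, List[str]] = defaultdict(list)
    for text, position in cues:
        lines = regions[position]
        lines.append(text)      # the new line enters at the bottom
        del lines[:-max_lines]  # the oldest lines roll off the top
    return dict(regions)


layout = roll_up_by_position(
    [("first", "bottom"), ("second", "bottom"), ("aside", "top")]
)
```

Cues at the same position push each other up; a cue at a new position
starts with an empty region, so there is nothing to push.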

> but there's nothing
> requiring this to be a markup feature.

You do have to group cues together somehow to provide the dependency
that one pushes up the other one. That clearly is additional markup.


> The subtitles I'm reading are written for English speakers, very often
> by Americans, and they're all "pop-on".

They are all in the anime space, AFAICT.

> Bitmap subtitles on DVDs (the
> form used by most movies) didn't even support roll-up (from what I
> recall when I implemented a decoder many years back).

DVD is never used for live content, and captions for live recordings
released on DVD would always have been reformatted. So I understand
why roll-up never existed there.

> All soft
> captions on YouTube and Netflix that I've seen are pop-on.

YouTube is only just now introducing rollup. They are introducing it
in response to a market need. It's indeed an advanced feature that has
taken time to introduce. You will likely see a lot more of that
content on YouTube in the future.


> I don't think this is a cultural issue at all,
> and having to create separate tracks to represent each possible user
> preference (never mind the combinatoric explosion this would lead to)
> would be a very bad way forward.

I agree with that last statement. This is why I proposed to introduce
something that allows rollup or pop-on captions to be created from the
same file.


Cheers,
Silvia.
Received on Monday, 19 December 2011 11:19:37 UTC