Re: Roll-up captions in WebVTT from Glenn Maynard on 2011-12-18 (public-texttracks@w3.org from December 2011)

From: Glenn Maynard <glenn@zewt.org>
Date: Sun, 18 Dec 2011 11:20:54 -0500
To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Cc: David Singer <singer@apple.com>, public-texttracks@w3.org
Message-ID: <CABirCh8i3krRCNbBgoDKMvxWBuHncVRiD9WL4eZyW8Bwm1cnYQ@mail.gmail.com>

On Sun, Dec 18, 2011 at 3:07 AM, Silvia Pfeiffer
<silviapfeiffer1@gmail.com> wrote:
> On Sat, Dec 17, 2011 at 1:05 PM, Glenn Maynard <glenn@zewt.org> wrote:
>> It's not backwards, but if you haven't seen it in realtime and don't know
>> what I'm talking about, the above description probably isn't enough.  Let me
>> know if you want a more detailed description (or I'll dig out an example).
>
> I've seen it in some Karaoke videos, but never with actual captions.

All subtitles I've seen work that way.  Here's an example:
https://zewt.org/~glenn/overlapped-caption-example.mpg  That's a
relatively complex example, with multiple disconnected streams of
dialogue, and I find it easy to follow while still paying attention to
the actual video.

> And to be honest, I found the need to have my eyes jump first down a
> line to read the next, then up a line to read the next etc much worse
> for reading than others where the lines move. In the first case my
> eyes have to continuously re-focus on alternative lines above and
> below, whereas in the second case my eyes will focus on what I am
> reading and move with the text motion, then jump down a line (which is
> a movement they are used to from normal reading).

I'm not looking from one caption to the next; I'm glancing down from
the video at one caption and back to the video, then at the next
caption when it shows up, then back to the video.  My eyes don't rest
on captions any longer than necessary.

(I don't doubt that different people read subtitles in very different
ways, of course.  Karaoke is also different from subtitles, since
karaoke lyrics are the primary visual focus, unlike subtitles which
avoid competing with other visuals.)

> My argument is that there are situations where scrolling captions are
> needed and they are not necessarily rare or worse for everyone to
> watch. It's a different presentation that some prefer and others
> don't.

I don't think we're talking about preferences here, though, just
authoring needs (e.g. live captions), right?  Viewing preferences are a
different matter entirely--those should be up to the user, not baked
into the markup.

It's not clear to me that live captions actually *need* roll-up
captioning, or that we need any particular way to enable it in the
markup.  It seems more useful to mark up the semantics ("these captions
are live [so you may want to switch to roll-up captions]") instead of
the presentation ("these captions are roll-ups").  The UA can then
choose how to present them, based on this flag, implementation
experience and user preferences.
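
To make the semantic-flag idea concrete, here is a rough sketch of how
a UA might pick a presentation.  Everything in it is hypothetical: the
"live" flag, the function name and the mode names "roll-up" and
"pop-on" are placeholders for illustration, not anything WebVTT
defines.

```python
def choose_presentation(track_is_live, user_preference=None):
    """Pick a caption presentation mode (hypothetical names throughout).

    track_is_live: the semantic hint carried by the track.
    user_preference: an explicit user setting, or None if unset.
    """
    # An explicit user preference always wins over the track's hint.
    if user_preference is not None:
        return user_preference
    # Otherwise the UA falls back to a default based on the live flag.
    return "roll-up" if track_is_live else "pop-on"


print(choose_presentation(True))            # live track, no preference
print(choose_presentation(True, "pop-on"))  # user overrides the hint
```

The point is only that the markup carries a fact about the captions,
and the decision stays with the UA and the user.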

> Therefore we should have a standard means of doing them rather
> than having to do awkward text copying to simulate the effect (and
> screw up all useful automated analysis of the text).

Duplicating captions would be evil, of course (think about what might
happen if the track was being output via TTS), and much more evil if
someone was trying to fake smooth scrolling (dozens of duplicates per
caption).
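
As a rough illustration of the duplication problem, the sketch below
fakes a two-line roll-up by copying text into each cue, then reads the
cues back the way a naive TTS or text-analysis pass might.  The helper
names and the two-line window are assumptions for the example, not
part of any spec.

```python
def simulate_rollup(lines, window=2):
    """Fake roll-up by copying: each cue repeats the lines still on screen."""
    cues = []
    for i in range(len(lines)):
        start = max(0, i - window + 1)
        cues.append("\n".join(lines[start:i + 1]))
    return cues


def spoken_lines(cues):
    """Naive consumer (e.g. TTS) that reads every line of every cue."""
    out = []
    for cue in cues:
        out.extend(cue.split("\n"))
    return out


lines = ["Hello there.", "How are you?", "Fine, thanks."]
cues = simulate_rollup(lines)
spoken = spoken_lines(cues)
print(len(lines), "authored lines become", len(spoken), "spoken lines")
```

Every line except the last gets read twice here, and a larger window
(or per-frame copies to fake smooth scrolling) multiplies that further.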

-- 
Glenn Maynard

Received on Sunday, 18 December 2011 16:21:22 UTC