Rollup captions: an analysis and suggestion from Silvia Pfeiffer on 2012-04-10 (public-texttracks@w3.org from April 2012)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Tue, 10 Apr 2012 22:54:25 +1000
To: public-texttracks@w3.org
Message-ID: <CAHp8n2mhWmx+SRYi4Wcbw3aienGvzyU2wUGDKLR2XSrNno5YPg@mail.gmail.com>
It seems that Shane's email did not make it to the list because of the
large attachment.
I'm forwarding it here without the attachment so we can all share in his
information.
Shane, if you would like to share a link to the report, that would be
more helpful.
Regards,
Silvia.


---------- Forwarded message ----------
From: Shane Feldman <shane.feldman@nad.org>
Date: Tue, Apr 10, 2012 at 7:56 PM
Subject: Re: Rollup captions: an analysis and suggestion
To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Cc: public-texttracks@w3.org


Silvia et al,

Thanks for putting together the webpage on roll-up captions and being
sensitive to user concerns about the poor quality of roll-up captions
compared to pop-on captions.

There are two concerns with live captioning: accuracy is abysmal, and
the timing is always behind by a few seconds.

These statements on the W3C rollup webpage are not accurate:

> In addition, since lines are kept on screen longer than for typical pop-on captions, the reader has more time to capture the conversation, in particular if a real-time captioner has made a mistake and provides a correction in the next line.


and

> No matter their poor quality, studies surprisingly also found that users are actually split on their preference as to how they want live subtitles to be displayed: half of them actually prefer the roll-up display and half pop-on. Therefore, there is a user requirement to continue supporting roll-up caption modes.


It is more difficult to follow live captions when the words are
constantly moving up at varying speeds, and behind the action (due to
the delay) as opposed to pop-on captions where we can anticipate that
the words will not move and we have a specific time period to read the
captions. Further, with live captions, we miss much of the on-screen
action/images because we are constantly trying to keep up with the
live captions by watching the most recent line at the bottom of the
captioning box.

The "Quality in Live Subtitling" report, attached to this email and
referenced on the rollup captioning webpage, discusses the situation
described above, identifying it as the quicksand effect and astray
fixations for fast readers and regression for slow readers.
In describing rollup captions the author notes:

> all viewers waste time chasing subtitles which seem to be playing hide-and-seek with them, preventing them from watching the images.


and

> this chaotic reading pattern and the almost non-existent time left to ‘read’ the images may go some way towards explaining the poor comprehension results obtained by deaf, hard of hearing and hearing participants in the comprehension test...


and finally in referring to how much time we spend on reading the
captions as opposed to watching the images, the study found:

> in scrolling mode viewers spend most of their time bogged down in the subtitles (an average of 87.5% vs 12.5% spent on the images), whereas in block subtitles they have more time to focus on the images (an average of 67.3% on the subtitles and 32.7% on the images).


I would not take the survey results as concrete evidence of users
preferring live captions over pop-on captions. There are several
problems with the survey. First, the survey asks if users prefer
"word-for-word" captioning or "block" captioning. Could the consumer
have confused this with a choice between verbatim and easy-reader
captions, rather than between popon and rollup captions (consumers will
pick verbatim captioning over easy-reader captions a majority of the
time)? Further,
do consumers understand the difference between popon and rollup
captions? It would be better to have an actual study that shows
consumers popon and rollup captions for the same program and then asks
them to rate their preference. In addition, this study focuses on TV
captioning only, and not the Internet. Viewing habits and preferences
on the Internet may be different than on TV. And the study notes that
most consumers think that live captioning is produced by automatic
speech recognition, which may influence their perception of
rollup/popon captions.

Further, there is a bias in this survey when the RNID states,
"Considering that it is currently impossible to match live subtitles
with images perfectly..." which is no longer true on the Internet.
Last month at the South by Southwest (SXSW) Conference, I had the
opportunity to serve on a panel with Adobe, HBO, and Viacom
(http://schedule.sxsw.com/2012/events/event_IAP13011) where Glenn
Goldstein of Viacom revealed that his company has implemented an
automatic solution for the timing problem, where roll-up captions are
converted to pop-on captions and moved up three seconds to synchronize
the captions with the audio for their web videos, including Jon
Stewart's "Daily Show", one of the more popular programs in the United
States. Glenn provided a side-by-side demonstration of live captioning
with roll-ups and pop-on captioning that had been synchronized for the
same program. The ease of watching and following pop-on captions
compared to live captioning was immediately noticed by the hearing
audience. Also, the audience noticed that the lag between the spoken
audio and captions was significant. This solution applies to the
Internet only though, and I understand from Viacom that it cannot be
implemented for their TV programs; however, as the Romero study notes,
this can be addressed on TV by delaying the TV signal which is
currently done in Holland with "good results".
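
As a rough illustration of the kind of conversion described above, here
is a minimal sketch in Python. It is not Viacom's implementation, whose
details were not given. It assumes each roll-up line arrives with a
timestamp in seconds, turns each line into a fixed pop-on cue, and
shifts it three seconds earlier. The names Cue and rollup_to_popon, and
the four-second maximum display time, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    end: float
    text: str


def rollup_to_popon(lines, offset=3.0, max_duration=4.0):
    """Turn timestamped roll-up lines into shifted pop-on cues.

    lines: list of (arrival_time_seconds, text), sorted by arrival time.
    offset: how far to move each cue earlier (assumed caption delay).
    max_duration: assumed upper bound on how long a cue stays on screen.
    """
    cues = []
    for i, (arrival, text) in enumerate(lines):
        # Move the cue earlier, but never before the start of the video.
        start = max(0.0, arrival - offset)
        if i + 1 < len(lines):
            # A cue ends when the next one appears, or at max_duration.
            next_start = max(0.0, lines[i + 1][0] - offset)
            end = min(next_start, start + max_duration)
        else:
            end = start + max_duration
        cues.append(Cue(start, end, text))
    return cues


live_lines = [(3.5, "Hello"), (5.0, "world"), (12.0, "again")]
popon_cues = rollup_to_popon(live_lines)
```

Each resulting cue stays in place for its whole display time instead of
scrolling, and a fixed offset only works if the captioning delay is
roughly constant.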

Finally, can you elaborate on the following statement on the rollup webpage?

> Users should at least have the opportunity to provide a preference as to how they want their captions displayed. Such a preference setting is currently not possible with WebVTT, which will never move cue text, but instead place new cue text lines either on top of already rendered text lines or fill a line below if it has become empty.


Shane

---

Shane H. Feldman
Chief Operating Officer
National Association of the Deaf
8630 Fenton Street, Suite 820
Silver Spring, MD 20910-3819
shane.feldman@nad.org



On Tue, Apr 10, 2012 at 4:27 AM, Silvia Pfeiffer
<silviapfeiffer1@gmail.com> wrote:
>
> Hi all,
>
> Here is an in-depth analysis of the requirements and most of the
> proposals that have been made previously on this mailing list:
>
> http://www.w3.org/community/texttracks/wiki/RollupCaptions
>
> I apologize if I've missed your proposal - do add your proposals!
>
> My suggestion would be to go with something that is similar to the
> last proposal (an explicit introduction of rendering areas). This will
> also help when we want to move captions through user interaction to a
> different screen location.
>
> Cheers,
> Silvia.
>
Received on Tuesday, 10 April 2012 12:55:19 UTC