Re: Media: How would one caption this?

Hi Janina, all,

On Tue, Apr 29, 2014 at 5:16 AM, Janina Sajka <> wrote:
> Further to our conversation on the Media Subteam telecon today, and a
> further use case for Enhanced Captioning as defined in our Media
> Accessibility User Requirements ...
> Consider how one should caption ...
> The quartet <lang=it>Bella figlia dell'amore</lang> from the opera
> <lang=it>Rigoletto</lang>?
> As is so common in opera, we have four singers singing different words
> at the same time. This would be babble in daily speech, or even in the
> theater. But, the music makes it work in opera, which is why it's a
> common situation in opera.
> Here's another example with six voices:
> This is the famous sextet from the <lang=de>Mozart</lang> opera
> <lang=it>Le Nozze di Figaro</lang>:

You probably can't see this, but the latter video actually has
burnt-in captions. They are in English, even though the performers
are singing in Italian.

> I suspect our Enhanced Captioning definition isn't sufficient for this
> use case inasmuch as we probably want our captions to show in two
> languages in such circumstances, the Italian original and the user's
> preferred lang in some kind of duplex display.
> So, another requirement for the ECC set?

The document at states:
[ECC-4] It needs to be possible to define timed text cues that are
allowed to overlap with each other in time and be present on screen at
the same time (e.g., those that come from speech of different
speakers), and such that are not allowed to overlap and thus cause
media playback pause to allow users to catch up with their reading.

So, the ECC set already covers display of caption cues of multiple
speakers (singers) at the same time.
Also, the HTML TextTrack API already allows for this.
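
To illustrate what overlapping cues mean in practice, here is a
minimal sketch in TypeScript. It only models the rule for when a cue
is active (start time <= playback position < end time); it is not the
browser implementation. The CueLike shape, the activeCuesAt helper,
the timings and the placeholder texts are all assumptions made up for
this example.

```typescript
// Hypothetical minimal cue shape for this sketch; browser VTTCue objects
// expose startTime, endTime and text as well.
interface CueLike {
  startTime: number; // seconds
  endTime: number;   // seconds
  text: string;
}

// Return every cue whose time range contains `time`
// (startTime <= time < endTime). Overlapping cues are all reported.
function activeCuesAt(cues: readonly CueLike[], time: number): CueLike[] {
  return cues.filter((cue) => cue.startTime <= time && time < cue.endTime);
}

// Four singers of the Rigoletto quartet. Timings and texts are
// illustrative placeholders, not taken from a real score or recording.
const quartetCues: CueLike[] = [
  { startTime: 10, endTime: 18, text: "<v Duke>(Duke's line)" },
  { startTime: 11, endTime: 17, text: "<v Maddalena>(Maddalena's line)" },
  { startTime: 12, endTime: 20, text: "<v Gilda>(Gilda's line)" },
  { startTime: 13, endTime: 21, text: "<v Rigoletto>(Rigoletto's line)" },
];

// At 14 seconds all four cues overlap and would be on screen together.
const onScreen = activeCuesAt(quartetCues, 14).map((cue) => cue.text);
console.log(onScreen.join("\n"));
```

In a page, the corresponding steps would be adding cues with
track.addCue(new VTTCue(start, end, text)) and reading
track.activeCues while the media plays.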

As for having more than one language track active at the same time: I
don't think there is a general need for this from an accessibility
point of view. A hearing-impaired person's aim is to understand what is
being spoken or sung, so giving them the content in their main
language satisfies that use case. I can, however, see how somebody who
wants to learn a foreign language (in this case: wants to understand
the foreign language that is being sung) might want to see the direct
translation alongside the original. I believe, however, that this
latter case is not an accessibility use case but a more generic one.
Incidentally, it is already possible in HTML to write an application
that displays more than one track of captions at the same time using
the existing TextTrack API.
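
As a rough sketch of such an application, the TypeScript below keeps
two language tracks active and collects their currently active cue
texts, so that a script can render them side by side. TrackLike and
ActiveCue are hypothetical interfaces that mirror only a subset of
the fields on TextTrack and VTTCue, and enableLanguages and
activeTextByLanguage are names invented for this example. It assumes
the page does its own rendering, which is why the tracks are set to
"hidden" rather than "showing".

```typescript
// Hypothetical structural subsets of VTTCue and TextTrack for this sketch.
interface ActiveCue {
  text: string;
}
interface TrackLike {
  kind: string;
  language: string;
  mode: "disabled" | "hidden" | "showing";
  activeCues: ArrayLike<ActiveCue> | null;
}

// Keep caption/subtitle tracks in the wanted languages active but not
// natively rendered ("hidden"); disable every other track.
function enableLanguages(
  tracks: ArrayLike<TrackLike>,
  languages: readonly string[],
): void {
  for (let i = 0; i < tracks.length; i++) {
    const track = tracks[i];
    const isText = track.kind === "captions" || track.kind === "subtitles";
    const wanted = isText && languages.indexOf(track.language) !== -1;
    track.mode = wanted ? "hidden" : "disabled";
  }
}

// Collect the text of the currently active cues, keyed by track language,
// so the application can draw e.g. Italian above English.
function activeTextByLanguage(
  tracks: ArrayLike<TrackLike>,
): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  for (let i = 0; i < tracks.length; i++) {
    const track = tracks[i];
    if (track.mode === "disabled" || track.activeCues === null) continue;
    const lines: string[] = [];
    for (let j = 0; j < track.activeCues.length; j++) {
      lines.push(track.activeCues[j].text);
    }
    result[track.language] = lines;
  }
  return result;
}
```

In a page one would pass video.textTracks to enableLanguages and call
activeTextByLanguage from a cuechange handler on each track. Real
cues would need to be VTTCue objects for the text field to exist.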


Best Regards,

Received on Friday, 2 May 2014 00:49:21 UTC