Re: [whatwg] How to expose caption tracks without TextTrackCues from Philip Jägenstedt on 2014-10-23 (public-whatwg-archive@w3.org from October 2014)

From: Philip Jägenstedt <philipj@opera.com>
Date: Thu, 23 Oct 2014 14:10:10 +0200
To: Bob Lund <B.Lund@cablelabs.com>
Cc: Silvia Pfeiffer <silviapfeiffer1@gmail.com>, WHAT Working Group Mailing List <whatwg@whatwg.org>
Message-ID: <CAMQvoCmaYr8pNfAbRtPY7f05kwvBM4WD9twg75JPv0Rpsw_D=w@mail.gmail.com>

On Wed, Oct 22, 2014 at 5:33 PM, Bob Lund <B.Lund@cablelabs.com> wrote:
>
>
> On 10/22/14, 9:01 AM, "Philip Jägenstedt" <philipj@opera.com> wrote:
>
>>On Sun, Oct 12, 2014 at 11:45 AM, Silvia Pfeiffer
>><silviapfeiffer1@gmail.com> wrote:
>>>
>>> Hi all,
>>>
>>> In the Inband Text Tracks Community Group we've recently had a
>>> discussion about a proposal by HbbTV. I'd like to bring it up here to
>>> get some opinions on how to resolve the issue.
>>>
>>> (The discussion thread is at
>>>
>>> http://lists.w3.org/Archives/Public/public-inbandtracks/2014Sep/0008.html
>>> , but let me summarize it here, because it's a bit spread out.)
>>>
>>> The proposed use case is as follows:
>>> * there are MPEG-2 files that have an audio, a video and several
>>> caption tracks
>>> * the caption tracks are not in WebVTT format but in formats that
>>> existing Digital TV receivers are already capable of decoding and
>>> displaying (e.g. CEA708, DVB-T, DVB-S, TTML)
>>> * there is no intention to standardize a TextTrackCue format for those
>>> other formats (statements are: there are too many formats to deal
>>> with, a set-top-box won't need access to cues)
>>>
>>> The request was to expose such caption tracks as textTracks:
>>> interface HTMLMediaElement : HTMLElement {
>>> ...
>>>   readonly attribute TextTrackList textTracks;
>>> ...
>>> }
>>>
>>> Then, the TextTrack interface would list them as a kind="captions",
>>> but without any cues, since they're not exposed. This then allows
>>> turning the caption tracks on/off via JavaScript. However, for
>>> JavaScript it is indistinguishable from a text track that has no
>>> captions. So the suggestion was to introduce a new kind="UARendered".
>>>
>>>
>>> My suggestion was to instead treat such tracks as burnt-in video
>>> tracks (by combination with the main video track):
>>> interface HTMLMediaElement : HTMLElement {
>>> ...
>>>
>>> readonly attribute VideoTrackList videoTracks;
>>> ...
>>> }
>>>
>>> Using the VideoTrack interface it would list them as a kind="captions"
>>> and would thus also be able to be activated by JavaScript. The
>>> downside would be that if you have N video tracks and M caption tracks in
>>> the media file, you'd have to expose NxM videoTracks in the interface.
>>>
>>>
>>> So, given this, should we introduce a kind="UARendered" or expose such
>>> tracks as videoTracks or is there another solution that we're
>>> overlooking?
>>
>>VideoTrackList can have at most one video track selected at a time, so
>>representing this as a VideoTrack would require some additional
>>tweaking to the model.
>>
>>A separate text track kind seems better, but wouldn't it still be
>>useful to distinguish between captions and subtitles even if the
>>underlying data is unavailable?
>
> This issue was clarified here [1]. TextTrack.mode would be set to
> "uarendered". TextTrack.kind would still reflect "captions" or "subtitles".
>
> [1]
> http://lists.w3.org/Archives/Public/public-whatwg-archive/2014Oct/0154.html

Oops, I missed that.

I was under the impression that the ability for scripts to detect this
situation was the motivation for a spec change. If there are multiple
tracks most likely all but one will be "disabled" initially, which
would be indistinguishable from a disabled track with no cues. Since
TextTrack.mode is mutable, even when it is initially "uarendered",
scripts would have to remember that before disabling the track, which
seems a bit inconvenient.
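
To make the inconvenience concrete, here is a rough sketch of the
bookkeeping a script might end up doing. It assumes the proposed
"uarendered" mode value, which is not in any spec today, and the
TrackLike type and UARenderedRegistry class are hypothetical names
made up for illustration, not part of the TextTrack API.

```typescript
// Hypothetical: "uarendered" is the mode value proposed in this thread,
// not a value the current TextTrack.mode attribute accepts.
type TrackMode = "disabled" | "hidden" | "showing" | "uarendered";

// Minimal stand-in for the parts of TextTrack this sketch needs.
interface TrackLike {
  kind: string;
  mode: TrackMode;
}

// Remembers which tracks were seen in "uarendered" mode, because that
// information is lost as soon as a script overwrites track.mode.
class UARenderedRegistry {
  private seen = new WeakSet<TrackLike>();

  // Record the track if it is currently "uarendered".
  observe(track: TrackLike): void {
    if (track.mode === "uarendered") {
      this.seen.add(track);
    }
  }

  // True only if the track was observed as "uarendered" earlier.
  isUARendered(track: TrackLike): boolean {
    return this.seen.has(track);
  }

  // Disable the track, remembering its UA-rendered status first.
  disable(track: TrackLike): void {
    this.observe(track);
    track.mode = "disabled";
  }

  // Re-enable the track in the mode appropriate to its remembered status.
  enable(track: TrackLike): void {
    track.mode = this.seen.has(track) ? "uarendered" : "showing";
  }
}
```

Note that this only helps for tracks the script saw while they were
"uarendered". A track that starts out "disabled" never enters the
registry, so it still looks the same as a disabled track with no cues.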

P.S. Your mails have an encoding problem resulting in superscript
numbers instead of quotes.

Philip

Received on Thursday, 23 October 2014 12:10:35 UTC