In-band text track captions and subtitles

In-band Tracks CG and HTML WG members,

“Sourcing In-band Media Resource Tracks from Media Containers into HTML” [1] defines a method for using DataCue to expose MPEG-2 Transport Stream captions (CEA 708 [2]) and subtitles (SCTE 27 [3]). The same approach could be used to expose text track cues for other media containers that don’t use VTTCue. Discussion during development of the definition raised some questions about TextTrack and DataCue that might benefit from discussion in these groups.

- DataCue is currently defined in W3C HTML5 CR [4] for use on metadata text tracks. Does text need to be added to [4] to clarify that DataCue can be used for non-metadata text tracks?

- The sourcing spec [1] defines DataCue.data to contain the CEA 708 or SCTE 27 data. [2] and [3], respectively, define the rendering behavior required for these formats. Should there be a clarification in HTML specs that DataCue can be rendered by the UA as long as a rendering specification is referenced?

- Since DataCue is currently specified for use with metadata text tracks, there may be an implication that “captions” and “subtitles” text tracks that use DataCue will never be rendered by the UA. Is language needed in HTML to clarify that non-metadata TextTracks using DataCue should be rendered according to their @mode state?

- The question arose whether there is ever a case in which “captions”, “subtitles”, “descriptions” and “chapters” text tracks would NOT be rendered by the UA. The existing definition of UA behavior seems to imply that the UA must render these types of text tracks when TextTrack.mode is set to “showing” [5]. Does the HTML spec language need to be more explicit?

- Is it OK to have a “captions” or “subtitles” text track that does not define a cue format, i.e. is only rendered by the UA?
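
To make the rendering questions above concrete, here is a minimal sketch of one possible reading: the cue type plays no part in the decision, and only the track's kind and mode determine whether the UA renders it. This is an assumption for discussion, not text from [1] or [4]. TrackLike and CueLike are simplified stand-ins for TextTrack and DataCue so the sketch runs outside a browser, and usesDataCue and uaShouldRender are hypothetical helper names.

```typescript
// Simplified stand-ins for the DOM interfaces (assumption: not the real types).
type TrackKind = "subtitles" | "captions" | "descriptions" | "chapters" | "metadata";
type TrackMode = "disabled" | "hidden" | "showing";

interface CueLike {
  startTime: number;
  endTime: number;
  data?: ArrayBuffer; // present only on DataCue-style cues
}

interface TrackLike {
  kind: TrackKind;
  mode: TrackMode;
  cues: CueLike[];
}

// True if the track has cues and every one carries raw binary data,
// as a DataCue would.
function usesDataCue(track: TrackLike): boolean {
  return track.cues.length > 0 && track.cues.every((cue) => cue.data instanceof ArrayBuffer);
}

// Assumed reading of the questions: the cue type is irrelevant, and a
// non-metadata track is rendered by the UA exactly when its mode is "showing".
function uaShouldRender(track: TrackLike): boolean {
  return track.kind !== "metadata" && track.mode === "showing";
}

// A "captions" track whose cues hold raw CEA 708 bytes (placeholder buffer here).
const captionsTrack: TrackLike = {
  kind: "captions",
  mode: "showing",
  cues: [{ startTime: 0, endTime: 2, data: new ArrayBuffer(8) }],
};

console.log(usesDataCue(captionsTrack), uaShouldRender(captionsTrack));
```

Under this reading, a “captions” track built from DataCues is treated like any other “captions” track, and a metadata track is never rendered regardless of mode.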

A couple of alternatives to the use of DataCue for “captions” and “subtitles” text tracks were discussed.

Alternative #1: Format-specific “captions” and “subtitles” cues. A CEA708Cue and an SCTE27Cue could be defined that derive from DataCue. These format-specific cues would have a @data attribute containing the raw CEA 708 and SCTE 27 data. Is there any advantage to such a format-specific cue definition over direct use of DataCue?
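
Purely as an illustration of Alternative #1, the sketch below shows what such derived cue types might look like and one difference they could make to a script: the cue's format becomes visible from its type rather than from out-of-band knowledge about the track. Nothing here is defined in [1] or [4]. DataCueLike is a stand-in for DataCue so the sketch runs outside a browser, the class names with a Sketch suffix mirror the hypothetical CEA708Cue and SCTE27Cue, and describeCue is a made-up helper.

```typescript
// Stand-in for DataCue (assumption: only the members needed for this sketch).
class DataCueLike {
  startTime: number;
  endTime: number;
  data: ArrayBuffer; // raw cue payload

  constructor(startTime: number, endTime: number, data: ArrayBuffer) {
    this.startTime = startTime;
    this.endTime = endTime;
    this.data = data;
  }
}

// Hypothetical format-specific cues; they add no members, only a distinct type.
class CEA708CueSketch extends DataCueLike {}
class SCTE27CueSketch extends DataCueLike {}

// With derived types, a script can tell the payload format from the cue itself.
function describeCue(cue: DataCueLike): string {
  if (cue instanceof CEA708CueSketch) return "CEA 708 captions";
  if (cue instanceof SCTE27CueSketch) return "SCTE 27 subtitles";
  return "unknown format";
}

const rawBytes = new ArrayBuffer(16); // placeholder, not real caption data
console.log(describeCue(new CEA708CueSketch(0, 2, rawBytes)));
console.log(describeCue(new DataCueLike(0, 2, rawBytes)));
```

With plain DataCue, the same script would have to infer the format from the track or the container instead.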

Alternative #2: Translate MPEG-2 “captions” and “subtitles” to WebVTT and use a derivative of VTTCue (a derivative is necessary because you’d still want to make the raw, binary cue data available). CEA 708 captions could be exposed as a VTTCue derivative according to [6]. SCTE 27 subtitles are images and no mapping to VTTCue is defined (or possible?). DVB subtitles [7] also mostly use images and would need a mapping to WebVTT.

Are there any other points to consider on this topic?

Thanks,
Bob Lund

[1] http://rawgit.com/w3c/HTMLSourcingInbandTracks/master/index.html
[2] Good explanation http://en.wikipedia.org/wiki/CEA-708. Non-free spec http://www.ce.org/Standards/Standard-Listings/R4-3-Television-Data-Systems-Subcommittee/CEA-708-D.aspx
[3] http://www.scte.org/documents/pdf/standards/SCTE_27_2011.pdf
[4] http://www.w3.org/TR/html5/embedded-content-0.html#guidelines-for-exposing-cues-in-various-formats-as-text-track-cues
[5] http://www.w3.org/TR/html5/embedded-content-0.html#text-track-model
[6] https://dvcs.w3.org/hg/text-tracks/raw-file/default/608toVTT/608toVTT.html
[7] http://www.etsi.org/deliver/etsi_en/300700_300799/300743/01.03.01_60/en_300743v010301p.pdf

Received on Tuesday, 10 June 2014 21:26:38 UTC