- From: Cyril Concolato <cyril.concolato@telecom-paristech.fr>
- Date: Wed, 11 Sep 2013 17:50:34 +0200
- To: public-html@w3.org
Hi all,

The current HTML5 spec [1][2] explains how to build text tracks from ISO tracks, but only for the case where the ISO track is a timed metadata track (metx, mett). First, this does not cover all tracks which can be potentially useful in a web page (e.g. 3GPP Timed Text). Also, with the recent MPEG work on the carriage of Timed Text for TTML and WebVTT [3], I think the HTML spec should be updated (or maybe that text moved to the ISO specification). To my knowledge, it is not implemented yet by browsers.

In the light of the recent and long (!!) discussions on Text Tracks, I would like to propose the following:

- When possible (as indicated by Eric [5], this is not always possible), all ISO tracks, except when the handler type is 'vide', 'auxv', 'soun' or 'hint', should be exposed as TextTracks (i.e. this covers the 'meta' tracks but now also 'subt' (used for TTML) or 'text' (used for WebVTT) tracks, and other tracks, see the register at [4]).
- Then, if the ISO-parser/browser pair is capable of producing an equivalent WebVTT representation of the text track content (of any @kind, possibly metadata) without losing information, the @inBandMetadataTrackDispatchType is left empty and the track is populated as if it were an out-of-band WebVTT track. This would be used for example when WebVTT content is carried in ISO tracks, but could be used for other formats where the mapping to WebVTT is feasible/simple. Note we could add similar text for TTML once the TTML cues are defined.
- And otherwise (if a WebVTT representation cannot be generated, or cannot be generated without loss):
  - the TextTrack object is populated as follows:
    - the @kind is set to 'metadata'
    - the @label is set to the ISO 'track handler name'
    - the @id is set to the ISO track id
    - the @inBandMetadataTrackDispatchType contains the base64-encoded sample entry box.
  - and each sample produces a cue built as follows:
    - the id attribute is empty
    - the pauseOnExit attribute is set to false
    - the start and end time of the cue are the start and end time of the sample
    - the content of the cue contains the sample data.

Note: the cue content can be in .text (base64-encoded if initially binary) or, if the cue interface (TextTrackCue, VTTCue or UnParsedCue or whatever the name) includes an ArrayBuffer, we should use that.

Comments?

Cyril

[1] http://www.w3.org/html/wg/drafts/html/master/embedded-content-0.html#sourcing-in-band-text-tracks
[2] http://www.w3.org/html/wg/drafts/html/master/embedded-content-0.html#guidelines-for-exposing-cues-in-various-formats-as-text-track-cues
[3] http://www.w3.org/community/texttracks/2013/09/11/carriage-of-webvtt-and-ttml-in-mp4-files/
[4] http://mp4ra.org/codecs.html
[5] http://lists.w3.org/Archives/Public/public-html/2013Sep/0012.html

-- 
Cyril Concolato
Maître de Conférences/Associate Professor
Groupe Multimedia/Multimedia Group
Telecom ParisTech
46 rue Barrault
75 013 Paris, France
http://concolato.wp.mines-telecom.fr/
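
For illustration only, the fallback mapping proposed above (the case where no lossless WebVTT representation exists) could be sketched roughly as follows. This is not taken from any spec or browser: the `IsoTrack` and `Sample` classes, the function names, and the dictionary layout are all hypothetical stand-ins. The sketch also assumes every sample is binary and therefore base64-encodes all cue content, and it assumes the ISO track id is exposed as a string.

```python
import base64
from dataclasses import dataclass, field

# Handler types the proposal excludes from TextTrack exposure.
NON_TEXT_HANDLERS = {"vide", "auxv", "soun", "hint"}


@dataclass
class Sample:
    """Hypothetical stand-in for one ISO sample (times in seconds)."""
    start: float
    end: float
    data: bytes


@dataclass
class IsoTrack:
    """Hypothetical stand-in for the ISO track fields the proposal uses."""
    track_id: int
    handler_type: str
    handler_name: str
    sample_entry: bytes  # raw bytes of the sample entry box
    samples: list = field(default_factory=list)


def should_expose_as_text_track(track: IsoTrack) -> bool:
    """First rule: every track except video, auxiliary video, sound, hint."""
    return track.handler_type not in NON_TEXT_HANDLERS


def to_metadata_text_track(track: IsoTrack) -> dict:
    """Fallback rule: build a 'metadata' track with one cue per sample."""
    cues = [
        {
            "id": "",                # cue id is left empty
            "pauseOnExit": False,
            "startTime": s.start,    # cue times mirror the sample times
            "endTime": s.end,
            # Assumption: sample data is binary, so it is base64-encoded.
            "text": base64.b64encode(s.data).decode("ascii"),
        }
        for s in track.samples
    ]
    return {
        "kind": "metadata",
        "label": track.handler_name,
        "id": str(track.track_id),  # assumption: id exposed as a string
        "inBandMetadataTrackDispatchType":
            base64.b64encode(track.sample_entry).decode("ascii"),
        "cues": cues,
    }
```

A caller would first check `should_expose_as_text_track(track)`, then try a WebVTT conversion (not shown), and only fall back to `to_metadata_text_track(track)` if that conversion is impossible or lossy.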
Received on Wednesday, 11 September 2013 15:51:00 UTC