MPEG-2 TS Descriptors from Clift, Graham on 2014-05-23 (public-inbandtracks@w3.org from May 2014)

From: Clift, Graham <Graham.Clift@am.sony.com>
Date: Fri, 23 May 2014 16:48:17 +0000
To: "public-inbandtracks@w3.org" <public-inbandtracks@w3.org>
CC: "Ota, Takaaki" <Takaaki.Ota@am.sony.com>, "Wu, Max" <Max.Wu@am.sony.com>, "Nejat, Mike" <Mahyar.Nejat@am.sony.com>, "Masham, Samuel (CPDG)" <SamuelA.Masham@jp.sony.com>, "Day, Saneesh" <saneesh.s@ap.sony.com>
Message-ID: <7AA796C6DF5E2D43964EF86070AF33B2C56F0C7F@USCULXMSG03.am.sony.com>
Dear Community group,



I would like to propose an alternative approach for exposing the track description information that does not rely on TextTrackCue.



The reasoning I am following is this. In MPEG-2 TS, the program map section and table carry a lot of information that is not relevant to the application. The PID is one example. Descriptors that the application can do nothing with are another. The application should care only about information it can act on, and the media pipeline shouldn't be overly burdened with maintaining PMT cues (where any change, relevant or not, must be tracked and the whole block then exposed as a cue).



Examples of information the application cares about:

1. Knowing what kind of AudioTracks there are.

2. Knowing there are captions that it can turn on or off.

3. Knowing that a TextTrack is a certain stream type containing private data that will be presented as cues.
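
As a rough illustration of the point above, the sketch below reduces a parsed PMT to only the fields an application could act on. The PmtStream and TrackSummary shapes and the summarize function are hypothetical names introduced for this example, and treating every unlisted stream_type as text/data is a simplification. Only the listed stream_type values (0x01/0x02 MPEG video, 0x1B H.264 video, 0x03/0x04 MPEG audio, 0x0F ADTS AAC audio) are taken from the MPEG-2 systems specification.

```typescript
// Hypothetical shape of one elementary stream entry in a parsed PMT.
interface PmtStream {
  pid: number;                // transport detail, of no use to the application
  streamType: number;         // stream_type from the PMT
  descriptors: Uint8Array[];  // raw descriptors, many of them irrelevant
}

// Hypothetical summary containing only what the application can act on.
interface TrackSummary {
  category: "audio" | "video" | "text";
  streamType: number;
}

// Classify a stream_type. Anything not listed is treated as text/data,
// which is a simplification made for this sketch.
function categorize(streamType: number): TrackSummary["category"] {
  switch (streamType) {
    case 0x01: // MPEG-1 video
    case 0x02: // MPEG-2 video
    case 0x1b: // H.264 video
      return "video";
    case 0x03: // MPEG-1 audio
    case 0x04: // MPEG-2 audio
    case 0x0f: // AAC audio (ADTS)
      return "audio";
    default:
      return "text";
  }
}

// Drop the PID and raw descriptors; keep only the actionable fields.
function summarize(streams: PmtStream[]): TrackSummary[] {
  return streams.map((s): TrackSummary => ({
    category: categorize(s.streamType),
    streamType: s.streamType,
  }));
}
```

Note that the PID never reaches the application in this sketch, which is the behaviour argued for above.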



Outside of describing the tracks, the PMT is of limited use, so why are we adding the complexity of creating and maintaining cues (which were primarily designed for carefully timed actions) for this data?



Instead, I propose that TextTrack, AudioTrack and VideoTrack each carry the relevant information about the track and any associated cues.

I believe that AudioTrack and VideoTrack may be sufficiently covered by the existing language and kind attributes with the mappings already defined by . If not, then we can add the TrackDescription interface to them too.



However, TextTrack is not sufficiently descriptive, since its only generic attribute is inBandMetadataTrackDispatchType <http://www.w3.org/TR/html5/embedded-content-0.html#dom-texttrack-inbandmetadatatrackdispatchtype>, so I propose that we extend TextTrack as follows:



enum TextTrackMode { "disabled", "hidden", "showing" };

enum TextTrackKind { "subtitles", "captions", "descriptions", "chapters", "metadata" };

interface TextTrack : EventTarget {
  readonly attribute TextTrackKind kind;
  readonly attribute DOMString label;
  readonly attribute DOMString language;

  readonly attribute DOMString id;
  readonly attribute DOMString inBandMetadataTrackDispatchType;

           attribute TrackDescription description;

           attribute TextTrackMode mode;

  readonly attribute TextTrackCueList? cues;
  readonly attribute TextTrackCueList? activeCues;

  void addCue(TextTrackCue cue);
  void removeCue(TextTrackCue cue);

           attribute EventHandler oncuechange;
};




interface TrackDescription {
  attribute octet       kind;   // In MPEG-2 TS this might be mapped to stream_type
  attribute ArrayBuffer data;   // Optional raw data (e.g. as contained in the MPEG-2 TS elementary stream descriptor)
  attribute DOMString   text;   // Optional text version of the data
};
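
A minimal sketch of how an application might consume the proposed attribute follows. It is written against plain TypeScript objects rather than a real media element, since the description attribute is only a proposal. DescribedTextTrack, tracksOfKind and toHex are hypothetical names introduced for this example.

```typescript
// Mirrors the proposed TrackDescription interface.
interface TrackDescription {
  kind: number;       // e.g. the MPEG-2 stream_type
  data: ArrayBuffer;  // raw descriptor bytes, possibly empty
  text: string;       // text form of the data, possibly empty
}

// Only the TextTrack members this sketch needs (hypothetical subset).
interface DescribedTextTrack {
  id: string;
  description: TrackDescription;
}

// Select the tracks whose description carries a given kind value.
function tracksOfKind(tracks: DescribedTextTrack[], kind: number): DescribedTextTrack[] {
  return tracks.filter((t) => t.description.kind === kind);
}

// Render raw descriptor bytes as space-separated hex for logging or parsing.
function toHex(data: ArrayBuffer): string {
  const bytes = new Uint8Array(data);
  const parts: string[] = [];
  for (let i = 0; i < bytes.length; i++) {
    const h = bytes[i].toString(16);
    parts.push(h.length < 2 ? "0" + h : h);
  }
  return parts.join(" ");
}
```

With this shape the application can find, say, all tracks of stream_type 0x06 by calling tracksOfKind(tracks, 0x06), without ever inspecting a PMT cue.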







From this extension we can provide mappings of type for each kind of TextTrack we are interested in supporting in each kind of media format.
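
One way to picture such mappings is a per-format lookup table that the user agent consults when it creates a TextTrack. The sketch below is an assumption for illustration only: the MediaFormat names, the TypeMapping shape, and the choice of "metadata" as the track kind are hypothetical. The three stream_type values and their meanings are taken from the MPEG-2 systems specification.

```typescript
// Kind names as in the HTML5 TextTrackKind enum (renamed to avoid clashing with DOM typings).
type TrackKindName = "subtitles" | "captions" | "descriptions" | "chapters" | "metadata";

// Hypothetical list of media formats; others would be added here.
type MediaFormat = "mpeg2ts";

interface TypeMapping {
  trackKind: TrackKindName;  // value exposed as TextTrack.kind (assumed)
  text: string;              // value exposed as TrackDescription.text (assumed)
}

// Per-format table keyed by the format's own type code (stream_type for MPEG-2 TS).
const MAPPINGS: Record<MediaFormat, Record<number, TypeMapping>> = {
  mpeg2ts: {
    0x05: { trackKind: "metadata", text: "private_sections" },
    0x06: { trackKind: "metadata", text: "PES packets containing private data" },
    0x15: { trackKind: "metadata", text: "Metadata carried in PES packets" },
  },
};

// Returns undefined when no mapping is defined for the given type code.
function lookup(format: MediaFormat, kind: number): TypeMapping | undefined {
  return MAPPINGS[format][kind];
}
```

Supporting another container would then mean adding one more entry to the table rather than defining a new cue format.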



The pros of this approach compared with the proposed MPEG-2 TS metadata cue mapping:



1.       Elegant and much simpler than the proposed mapping to a cue

2.       Ties the description to the relevant track, which removes the need to consider the order or the PID

3.       Expandable to other formats

4.       Fewer media processing requirements because no special cue handling is needed

5.       Easier for the application developer to understand and make use of.





The Cons



1.       New HTML5 structure added on TextTrack





Regards



Graham Clift
Received on Friday, 23 May 2014 16:48:52 UTC