RE: MPEG-2 TS Descriptors from Clift, Graham on 2014-05-23 (public-inbandtracks@w3.org from May 2014)

From: Clift, Graham <Graham.Clift@am.sony.com>
Date: Fri, 23 May 2014 18:44:29 +0000
To: Bob Lund <B.Lund@CableLabs.com>, "public-inbandtracks@w3.org" <public-inbandtracks@w3.org>
CC: "Ota, Takaaki" <Takaaki.Ota@am.sony.com>, "Wu, Max" <Max.Wu@am.sony.com>, "Masham, Samuel (CPDG)" <SamuelA.Masham@jp.sony.com>
Message-ID: <7AA796C6DF5E2D43964EF86070AF33B2C56F1552@USCULXMSG03.am.sony.com>
From: Bob Lund [mailto:B.Lund@CableLabs.com]
Sent: Friday, May 23, 2014 11:28 AM
To: Clift, Graham; public-inbandtracks@w3.org
Cc: Ota, Takaaki; Wu, Max; Nejat, Mike; Masham, Samuel (CPDG); Day, Saneesh
Subject: Re: MPEG-2 TS Descriptors

Graham,

I agree that kind, language and label contain all the needed metadata for audio/video and that there is information in the PMT that is of no interest to Web apps. With that said, the inBandMetadataTrackDispatchType [1] contains the elementary stream type and elementary stream descriptors. How is the information you're proposing in your new TrackDescription attribute different from inBandMetadataTrackDispatchType?

[GAC] I'm okay with inBandMetadataTrackDispatchType instead of a separate Description API. I had forgotten it included the ES descriptor too.
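To make the comparison concrete, here is a minimal sketch of how an application could split a dispatch type string into the stream type and the elementary stream descriptors. It assumes the HTML5 mapping for MPEG-2 TS, in which the value is the stream_type byte followed by the ES descriptor bytes, written as uppercase hex. The function names, the result shapes and the sample value (stream type 0x06 with a registration descriptor carrying "GA94") are illustrative assumptions, not part of any specification.

```typescript
// One elementary stream descriptor: descriptor_tag, then the payload that
// follows the descriptor_length byte.
interface EsDescriptor {
  tag: number;
  payload: Uint8Array;
}

interface ParsedDispatchType {
  streamType: number;
  descriptors: EsDescriptor[];
}

// Convert a hex string such as "0605" into bytes, rejecting malformed input.
function hexToBytes(hex: string): Uint8Array {
  if (hex.length % 2 !== 0 || !/^[0-9A-Fa-f]*$/.test(hex)) {
    throw new Error("not a hex string: " + hex);
  }
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

// Assumed layout: byte 0 is stream_type, the rest is a sequence of
// (tag, length, payload) descriptors.
function parseDispatchType(dispatchType: string): ParsedDispatchType {
  const bytes = hexToBytes(dispatchType);
  if (bytes.length === 0) {
    throw new Error("empty dispatch type");
  }
  const descriptors: EsDescriptor[] = [];
  let pos = 1;
  while (pos + 2 <= bytes.length) {
    const tag = bytes[pos];
    const length = bytes[pos + 1];
    const end = pos + 2 + length;
    if (end > bytes.length) {
      throw new Error("truncated descriptor");
    }
    descriptors.push({ tag, payload: bytes.slice(pos + 2, end) });
    pos = end;
  }
  if (pos !== bytes.length) {
    throw new Error("trailing byte after last descriptor");
  }
  return { streamType: bytes[0], descriptors };
}

// Sample value made up for illustration.
const parsedSample = parseDispatchType("06050447413934");
console.log(parsedSample.streamType, parsedSample.descriptors.length);
```

With this in hand, the stream type and the ES descriptors are both recoverable from the existing attribute, which is the overlap with the proposed TrackDescription.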


On a related topic, there are program level descriptors not associated with any elementary stream that might be important to an application, e.g. content_advisory_descriptor. MPEG-2 TS defines PID 0x02 as the Transport Stream Description Table elementary stream. The section data in this stream are the program level descriptors. This data can change over the course of a program. This PID should also be exposed as a text track whose Cues are the descriptors.


[GAC] This approach is reasonable too. However, instead of adding cues to the track (which would carry meaningless timing info and burden the UA with unnecessary cue lifetime maintenance), could we simply have one track per program level descriptor?

The combination of inBandMetadataTrackDispatchType for each metadata text track and the Transport Stream Description text track will expose all elementary stream and program descriptors without requiring a text track for the PMT.
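As a rough sketch of how an application might act on this, the snippet below routes metadata text tracks to handlers keyed on the leading stream_type byte of the dispatch type. The MetadataTrackLike shape, the handler table and the sample dispatch strings are assumptions made for illustration; in a browser the tracks would come from a media element's textTracks list.

```typescript
// Minimal structural stand-in for the TextTrack fields used here, so the
// sketch also runs outside a browser.
interface MetadataTrackLike {
  kind: string;
  inBandMetadataTrackDispatchType: string;
}

type TrackHandler = (track: MetadataTrackLike) => string;

// Hypothetical handler table keyed on the two hex digits of stream_type.
const handlersByStreamType: Record<string, TrackHandler> = {
  "05": () => "private sections handler",
  "06": () => "private PES data handler",
};

// Returns one routing decision per metadata track, in track order.
function routeMetadataTracks(tracks: MetadataTrackLike[]): string[] {
  const decisions: string[] = [];
  for (const track of tracks) {
    if (track.kind !== "metadata") {
      continue; // captions, subtitles, etc. are handled by the UA
    }
    const dispatchType = track.inBandMetadataTrackDispatchType;
    if (dispatchType.length < 2) {
      decisions.push("ignored (no dispatch type)");
      continue;
    }
    const streamType = dispatchType.slice(0, 2).toUpperCase();
    const handler = handlersByStreamType[streamType];
    decisions.push(
      handler ? handler(track) : "ignored (stream_type " + streamType + ")"
    );
  }
  return decisions;
}

// Sample tracks made up for illustration.
const routingDecisions = routeMetadataTracks([
  { kind: "captions", inBandMetadataTrackDispatchType: "" },
  { kind: "metadata", inBandMetadataTrackDispatchType: "06050447413934" },
  { kind: "metadata", inBandMetadataTrackDispatchType: "86" },
]);
console.log(routingDecisions);
```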


Bob

[1] http://www.w3.org/TR/html5/embedded-content-0.html#steps-to-expose-a-media-resource-specific-text-track

From: Clift, Graham <Graham.Clift@am.sony.com>
Date: Friday, May 23, 2014 at 10:48 AM
To: "public-inbandtracks@w3.org" <public-inbandtracks@w3.org>
Cc: "Ota, Takaaki" <Takaaki.Ota@am.sony.com>, "Wu, Max" <Max.Wu@am.sony.com>, "Nejat, Mike" <Mahyar.Nejat@am.sony.com>, "Masham, Samuel (CPDG)" <SamuelA.Masham@jp.sony.com>, "Day, Saneesh" <saneesh.s@ap.sony.com>
Subject: MPEG-2 TS Descriptors
Resent-From: "public-inbandtracks@w3.org" <public-inbandtracks@w3.org>
Resent-Date: Friday, May 23, 2014 at 10:48 AM


Dear Community group,



I would like to propose an alternative approach for exposing the track description information that does not rely on TextTrackCue.



The reasoning I am following is this. In MPEG-2 TS the program map section and table carry a lot of information that is not relevant to the application. The PID is an example. Descriptors that the application can do nothing with are another example. The application should care only about information it can act on, and the media pipeline shouldn't be overly burdened with maintaining PMT cues (where any change, relevant or not, will need to be tracked and then the whole block exposed as a cue).



Examples of information the application cares about:

1. Knowing what kind of AudioTracks there are.

2. Knowing there are captions that it can turn on or off.

3. Knowing that a TextTrack is a certain stream type containing private data that will be presented as cues.
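The first two items can be sketched with the attributes that already exist on the track objects. The AudioTrackLike and CaptionTrackLike shapes below are minimal stand-ins for AudioTrack and TextTrack so the sketch runs outside a browser; the helper names and the sample tracks are assumptions for illustration only.

```typescript
// Stand-ins for the AudioTrack and TextTrack fields used in this sketch.
interface AudioTrackLike {
  kind: string;      // e.g. "main", "descriptions", "translation"
  language: string;  // BCP 47 language tag, may be empty
  label: string;
}

interface CaptionTrackLike {
  kind: string;
  language: string;
  mode: "disabled" | "hidden" | "showing";
}

// Item 1: summarise what kinds of audio tracks are available.
function describeAudioTracks(tracks: AudioTrackLike[]): string[] {
  return tracks.map(
    (t) => (t.kind || "unknown") + " (" + (t.language || "und") + ")"
  );
}

// Item 2: turn every captions track on or off; returns how many were changed.
function setCaptionsVisible(tracks: CaptionTrackLike[], visible: boolean): number {
  let changed = 0;
  for (const track of tracks) {
    if (track.kind === "captions") {
      track.mode = visible ? "showing" : "disabled";
      changed++;
    }
  }
  return changed;
}

// Sample tracks made up for illustration.
const audioSummary = describeAudioTracks([
  { kind: "main", language: "en", label: "English" },
  { kind: "descriptions", language: "en", label: "Audio description" },
]);
const sampleTextTracks: CaptionTrackLike[] = [
  { kind: "captions", language: "en", mode: "disabled" },
  { kind: "metadata", language: "", mode: "hidden" },
];
const captionsChanged = setCaptionsVisible(sampleTextTracks, true);
console.log(audioSummary, captionsChanged);
```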



Outside of describing the tracks, the PMT is of limited use, so why are we adding the complexity of creating and maintaining cues (which were primarily designed for carefully timed actions) for this data?



Instead I propose that the TextTrack, AudioTrack and VideoTrack each carry the relevant information about themselves and any associated cues.

I believe that AudioTrack and VideoTrack may be sufficiently covered by the existing language and kind attributes with the mappings already defined by . If not, then we can add the TrackDescription interface to these too.



However, the TextTrack is not sufficiently descriptive, since the only generic attribute is inBandMetadataTrackDispatchType, and so I propose that we extend TextTrack as follows:



enum TextTrackMode { "disabled", "hidden", "showing" };

enum TextTrackKind { "subtitles", "captions", "descriptions", "chapters", "metadata" };

interface TextTrack : EventTarget {
  readonly attribute TextTrackKind kind;
  readonly attribute DOMString label;
  readonly attribute DOMString language;

  readonly attribute DOMString id;
  readonly attribute DOMString inBandMetadataTrackDispatchType;

           attribute TrackDescription description;

           attribute TextTrackMode mode;

  readonly attribute TextTrackCueList? cues;
  readonly attribute TextTrackCueList? activeCues;

  void addCue(TextTrackCue cue);
  void removeCue(TextTrackCue cue);

           attribute EventHandler oncuechange;
};




interface TrackDescription {
  attribute octet       kind;  // In MPEG-2 this might be mapped to stream_type
  attribute ArrayBuffer data;  // Optional raw data (e.g. as contained in the MPEG-2 TS elementary stream descriptor)
  attribute DOMString   text;  // Optional text version of the data
};
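To illustrate how a user agent might populate such an object for MPEG-2 TS, here is a minimal sketch that builds a description from one elementary stream entry of a parsed PMT. The TypeScript mirror of TrackDescription, the PmtStreamEntry shape and the choice to render the text attribute as uppercase hex are all assumptions; the proposal itself does not say what the text version should contain.

```typescript
// Hypothetical TypeScript mirror of the proposed TrackDescription interface.
interface TrackDescription {
  kind: number;      // octet; for MPEG-2 TS, the stream_type
  data: ArrayBuffer; // raw elementary stream descriptor bytes
  text: string;      // text version of the data (hex here, by assumption)
}

// Hypothetical shape for one elementary stream entry of a parsed PMT.
interface PmtStreamEntry {
  streamType: number;
  esInfo: Uint8Array; // the descriptor bytes that follow ES_info_length
}

function describeStream(entry: PmtStreamEntry): TrackDescription {
  // Copy the bytes so a later PMT update cannot mutate a description that
  // has already been handed to the application.
  const data = new ArrayBuffer(entry.esInfo.length);
  new Uint8Array(data).set(entry.esInfo);

  let text = "";
  for (let i = 0; i < entry.esInfo.length; i++) {
    const byte = entry.esInfo[i];
    text += (byte < 16 ? "0" : "") + byte.toString(16).toUpperCase();
  }
  return { kind: entry.streamType & 0xff, data, text };
}

// Sample entry made up for illustration: private PES data with a
// registration descriptor.
const sampleDescription = describeStream({
  streamType: 0x06,
  esInfo: new Uint8Array([0x05, 0x04, 0x47, 0x41, 0x39, 0x34]),
});
console.log(sampleDescription.kind, sampleDescription.text);
```

Only the per-track portion of the PMT is copied, so fields such as the PID never reach the application.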







From this extension we can provide mappings of type for each kind of TextTrack we are interested in supporting in each kind of media format.



The pros of this approach over the proposed MPEG-2 TS metadata cue mapping:



1.       Elegant and much simpler than the proposed mapping to a cue

2.       Ties the description to the relevant track, which removes the need to consider the order or the PID

3.       Expandable to other formats

4.       Fewer media processing requirements, because no special cue handling is needed

5.       Easier for the application developer to understand and make use of.





The Cons



1.       New HTML5 structure added on TextTrack





Regards



Graham Clift
Received on Friday, 23 May 2014 18:45:06 UTC