- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Mon, 10 Mar 2014 16:50:16 +1100
- To: Bob Lund <B.Lund@cablelabs.com>
- Cc: "Clift, Graham" <Graham.Clift@am.sony.com>, "public-inbandtracks@w3.org" <public-inbandtracks@w3.org>, "Ota, Takaaki" <Takaaki.Ota@am.sony.com>, "Wu, Max" <Max.Wu@am.sony.com>, "Nejat, Mike" <Mahyar.Nejat@am.sony.com>, "Candelore, Brant" <Brant.Candelore@am.sony.com>
Hi Bob & Clift,

I have actually wondered about the need for exposing in-band track metadata here myself: https://www.w3.org/community/inbandtracks/wiki/Main_Page#Exposing_In-band_Track_Metadata

I believe the biggest discussion we have had here on this topic is whether to expose the PMT. I agree with Clift and Bob that exposing the PMT in a separate track is not necessary, since the list of tracks in the media element (.audioTracks, .videoTracks and .textTracks), together with the attributes on the track objects, provides sufficient information about the PMT.

Thus, we have to discuss whether there is a use case for providing a generic means of exposing other in-band metadata text tracks as specified here: https://www.w3.org/community/inbandtracks/wiki/Main_Page#Guidelines_for_creating_metadata_text_track_cues

What other tracks does an MPEG-2 TS have that are not audio, video, captions/subtitles, or the PMT?

Cheers,
Silvia.

On Wed, Mar 5, 2014 at 9:11 AM, Bob Lund <B.Lund@cablelabs.com> wrote:
>
> From: <Clift>, Graham <Graham.Clift@am.sony.com>
> Date: Tuesday, March 4, 2014 at 2:17 PM
> To: Bob Lund <b.lund@cablelabs.com>, "public-inbandtracks@w3.org" <public-inbandtracks@w3.org>
> Cc: "Ota, Takaaki" <Takaaki.Ota@am.sony.com>, "Wu, Max" <Max.Wu@am.sony.com>, "Nejat, Mike" <Mahyar.Nejat@am.sony.com>, Brant Candelore <brant.candelore@am.sony.com>
> Subject: RE: HTML5 support for track metadata
>
> Hi CG,
>
> I believe that option 3 is the best approach because it requires the least change to the HTML5 specification, covers the use cases, and will be the most likely to pass the HTML WG.
>
> As to how to handle events (with minimal change to the HTML spec), I was thinking the following:
>
> When an application sees a change in the PMT there should also be a corresponding change to one or more of the TextTrackList/VideoTrackList/AudioTrackList objects.
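[Editor's note: a minimal sketch of the event model discussed here, in which PMT changes surface as `change`/`addtrack`/`removetrack` events on the media element's track lists. The function names `summarizeTracks` and `watchTrackLists` are illustrative, not from the spec; in a browser, `target` would be `media.audioTracks`, `media.videoTracks` or `media.textTracks`, but any EventTarget works for the sketch.]

```javascript
// Reduce a track list (anything iterable whose items carry kind/language)
// to a PMT-like summary the application can diff against its previous state.
function summarizeTracks(trackList) {
  return Array.from(trackList, (t) => ({ kind: t.kind, language: t.language }));
}

// Attach the three events HTML5 defines on track lists, forwarding the
// event type to an application callback.
function watchTrackLists(target, onUpdate) {
  for (const type of ["change", "addtrack", "removetrack"]) {
    target.addEventListener(type, (e) => onUpdate(e.type));
  }
}
```

With this wiring, an application that wants PMT-level information calls `summarizeTracks` inside `onUpdate` and compares the result to the last summary, rather than needing a dedicated PMT track.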
> Seems then it should be sensible to tie the PMT change in inBandMetadataTrackDispatchType to an onchange event in the TextTrackList object. This would be sufficient for the purpose of tracking PMT data, and if cues are needed then they could be generated easily by the application, thus eliminating the only advantage that option 1 has, IMHO.
>
> Changes in the PMT caused by adding/removing elementary streams should result in one of the onchange, onaddtrack, onremovetrack events. Are there any use cases where the PMT for an existing track changes?
>
> BTW,
>
> As well as deciding how to handle the PMT data via the three alternatives, I believe more work should be done on item 3) (creating in-band metadata text track cues). In particular there needs to be more clarification to ensure consistency across implementations. The two areas below are where I see some problems:
>
> a) Handling private section payloads that are split across many TS packets is not well defined.
>
> My proposal defines this - a complete private section is returned in the cue.
>
> Seems there are three possible approaches.
>
> i. Since the detection method proposed is the 'payload_unit_start_indicator', this would suggest that the transport demux is collecting the individual payloads before presenting them as a cue to the web application. If this is the case, then how does the demux decide that the payload is complete? If it waits to see the next 'payload_unit_start_indicator' then the timing may be too late to be relevant.
>
> ii. If, on the other hand, the demux creates a cue for each payload entity, then that could impact performance.
>
> iii. Maybe it just waits for a period of time and, if no more payload packets are received, then cue what we have. This approach would mean there is an expectation for the application to handle variably fragmented cues.
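[Editor's note on question (a)/case i: a demux need not wait for the next payload_unit_start_indicator. An MPEG-2 private section begins with a 3-byte header whose last 12 bits are section_length, the number of bytes following that field, so completeness is knowable as bytes arrive. A minimal sketch, with illustrative names (`SectionAssembler`) and no CRC or error handling:]

```javascript
class SectionAssembler {
  constructor() { this.chunks = []; }

  // Feed one TS packet payload; pusi = payload_unit_start_indicator.
  // Returns a complete private section as a Uint8Array, or null if partial.
  push(payload, pusi) {
    if (pusi) {
      // A pointer_field precedes the section when the PUSI bit is set.
      const offset = 1 + payload[0];
      this.chunks = [payload.subarray(offset)];
    } else if (this.chunks.length) {
      this.chunks.push(payload);
    } else {
      return null; // continuation packet with no section in progress
    }
    const buf = concat(this.chunks);
    if (buf.length < 3) return null;
    // 12-bit section_length: bytes remaining after the 3-byte header's
    // length field, so the full section is 3 + section_length bytes.
    const sectionLength = ((buf[1] & 0x0f) << 8) | buf[2];
    const total = 3 + sectionLength;
    return buf.length >= total ? buf.subarray(0, total) : null;
  }
}

function concat(arrays) {
  const out = new Uint8Array(arrays.reduce((n, a) => n + a.length, 0));
  let pos = 0;
  for (const a of arrays) { out.set(a, pos); pos += a.length; }
  return out;
}
```

This makes Bob's "a complete private section is returned in the cue" implementable with no timing ambiguity: the cue can be delivered the moment the last byte of the section arrives, avoiding both the late delivery of case i and the fragmentation of cases ii/iii.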
>
> Leaving it up to the UA to decide how to implement this could make it challenging for web application designers to allow for all possibilities.
>
> b) There is no clarity about what the startTime means to the application. The spec says this is with respect to the media resource time, which presumably means the PTS from some audio/video PES. But which timing? After the private payload is sent, or before? And what is the startTime for payloads split across TS packets with video PES packets in between, especially when partial payload DataCues are supported (as in case iii above)? The option of leaving this up to the implementation is unsatisfactory because of the potential for variations in the result.
>
> I agree more guidelines on cue generation would remove ambiguity. I think startTime in the case of MPEG-2 TS metadata should be the current time in the media resource when the private section is received.
>
> Regards,
>
> Graham Clift
>
>
> From: Bob Lund [mailto:B.Lund@CableLabs.com]
> Sent: Tuesday, March 04, 2014 11:50 AM
> To: public-inbandtracks@w3.org
> Subject: HTML5 support for track metadata
>
> Hi CG,
>
> I think that the existing HTML5 CR spec [1] is very close to supporting the use cases that are being discussed in the CG. I propose submitting several HTML5 bugs to close the gap, and I'm interested in your thoughts.
>
> An alternative described in the CG wiki is to use the kind and inBandMetadataTrackDispatchType attributes to expose track metadata: kind is used for all tracks except metadata text tracks, and inBandMetadataTrackDispatchType is used for metadata text tracks. This covers the use cases that have been described in the CG but requires some additions to existing HTML5 CR sections:
>
> 1) Additions to the table "Return values for AudioTrack.kind() and VideoTrack.kind()" in [2] describing how @kind should be set for various track types.
> [3] shows the new additions for MPEG-2 TS and DASH media resources.
>
> 2) A text track equivalent of the table "Return values for AudioTrack.kind() and VideoTrack.kind()" in [2]. [4] shows such an equivalent table for setting @kind for text tracks in MPEG-2 TS and DASH. This could go in the HTML5 CR spec here [5].
>
> 3) Guidelines for creating in-band metadata text track cues. Here is the start for MPEG-2 TS [5]. This table could go here [6].
>
> 4) An additional definition for DASH describing how to set inBandMetadataTrackDispatchType [7].
>
> Does anyone see a reason not to file bugs to add 1-4 above? These changes are consistent with the direction already taken in [1]. Making these changes wouldn't preclude further work in the CG and would address the use cases that have been identified so far.
>
> Bob
>
> [1] http://www.w3.org/TR/html5/
> [2] http://www.w3.org/TR/html5/embedded-content-0.html#audiotracklist-and-videotracklist-objects
> [3] https://www.w3.org/community/inbandtracks/wiki/Main_Page#Audio_and_video_kind_table
> [4] https://www.w3.org/community/inbandtracks/wiki/Main_Page#Text_kind_table
> [5] https://www.w3.org/community/inbandtracks/wiki/Main_Page#Guidelines_for_creating_metadata_text_track_cues
> [6] http://www.w3.org/TR/html5/embedded-content-0.html#sourcing-in-band-text-tracks
> [7] https://www.w3.org/community/inbandtracks/wiki/Main_Page#Exposing_a_Media_Resource_Specific_TextTrack
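[Editor's note: a sketch of how an application would consume the metadata text tracks that items 2)-4) above would standardize: select tracks whose @kind is "metadata" and route their cues by inBandMetadataTrackDispatchType. The function name, handler map, and the "0xC0 0x01" dispatch-type string are illustrative, not values from the proposal.]

```javascript
// textTracks: the media element's TextTrackList (or any iterable of
// track-like objects). handlers: map from dispatch-type string to a
// callback that receives each active cue.
function routeMetadataCues(textTracks, handlers) {
  for (const track of textTracks) {
    if (track.kind !== "metadata") continue;
    const handle = handlers[track.inBandMetadataTrackDispatchType];
    if (!handle) continue;
    track.mode = "hidden"; // keep cues firing without rendering them
    track.oncuechange = () => {
      for (const cue of track.activeCues) handle(cue);
    };
  }
}
```

The design point this illustrates: with @kind and inBandMetadataTrackDispatchType populated per items 2) and 4), the application can dispatch on those two attributes alone and never needs a separate PMT track to discover what metadata the resource carries.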
Received on Monday, 10 March 2014 05:51:04 UTC