Re: Tech Discussions on the Multitrack Media (issue-152)

Hi Mark,

On Fri, Feb 11, 2011 at 12:47 AM, Mark Watson <watsonm@netflix.com> wrote:
> Hi everyone,
> I have a couple of comments on this proposal, but first, since this is my
> first post to this list I should introduce myself. I am representing Netflix
> - we joined W3C just this week. We are interested in ensuring that a
> streaming service like ours could in future be supported by HTML5.
>
> One thing we are interested in is support for multiple languages, for both
> audio and text tracks and therefore a Javascript API to discover and select
> amongst those tracks.
>
> We use a form of HTTP adaptive streaming, which if translated to HTML5 would
> mean providing a URL for a manifest to the <video> element as in Option (1)
> on the wiki page. But there is also another case where there are multiple
> tracks and no HTML markup: when there are multiple tracks inside a single
> ordinary media file e.g. an mp4 file with multiple audio language tracks.

We have called the case where there are multiple tracks inside a
single resource the "in-band" case, and indeed the very first side
condition listed states "we want to achieve a consistent API
between in-band and external audio/video tracks". This is very much a
goal of the Multitrack API.


> The distinction between TextTrack and MediaTrack in the API under option (1)
> seems strange to me. Text is just another kind of media, so shouldn't the
> kind for each track be ( Audio | Video | Text ) rather than ( Media | Text )
> where Media = ( Audio | Video ) ? [This is how it is framed in option 3,
> albeit up one level].

That is indeed one of the key questions we will have to answer: should
we treat media tracks differently from text tracks? On the one hand,
they can all abstractly be regarded as "some time-aligned data for a
video". On the other hand, the current specification of <track> does
not allow child elements, and the TimedTrack API is very specifically
targeted at text tracks: the concept of text cues does not make sense
for media tracks. That made it necessary to introduce MediaTrack as an
additional API. Whether that is a good idea is indeed one of the
questions we need to answer.
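To make the asymmetry concrete, here is a rough sketch in JavaScript
of the two shapes under discussion. The class names and fields are
illustrative stand-ins, not the spec interfaces: the point is only
that a text track carries cues while a media track carries just
metadata and a selection flag.

```javascript
// Illustrative sketch only -- MediaTrack/TextTrack here model the
// proposal's distinction, they are not the actual spec interfaces.

class MediaTrack {
  constructor(kind, label, language) {
    this.kind = kind;          // "audio" | "video"
    this.label = label;
    this.language = language;
    this.enabled = false;      // selection state; no cue data at all
  }
}

class TextTrack extends MediaTrack {
  constructor(label, language) {
    super("text", label, language);
    this.cues = [];            // time-aligned cues: text tracks only
  }
  addCue(start, end, text) {
    this.cues.push({ start, end, text });
  }
}

const audio = new MediaTrack("audio", "French dub", "fr");
const subs = new TextTrack("French subtitles", "fr");
subs.addCue(0, 2.5, "Bonjour");

console.log("cues" in subs);   // true
console.log("cues" in audio);  // false
```

Collapsing both into one track kind would mean either giving media
tracks a cue list they never use, or moving cues out of the track
interface entirely.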


> I don't have a strong opinion on the markup aspect, but I think the first
> side-condition is important (that the API be the same whether the tracks
> come from explicit markup or are present within a multiplexed file or
> described by a manifest).

Yes, that condition is important.

> If I read rightly this condition is not met in
> (3), (4) and (6) right ?

That's not quite right. The JavaScript API is still the same for both
in-band and external media tracks in most of these cases.

In case (3) we have AudioTrack and VideoTrack, which apply equally to
in-band and external media tracks.

You are correct about case (4), though: the synchronization entity
moves from the main video to the <par> above it, so in-band tracks can
no longer be supported, since in-band tracks are not associated with a
parent element. I agree that (4) is not an acceptable solution to this
problem. It exists more for historical purposes, since the question of
re-using SMIL constructs keeps coming up and the markup seemed to make
sense. But the JavaScript API for it simply cannot be made consistent.

Approach (6), however, would again work the same for in-band and
external media. As long as the browser itself gathers the dependent
media elements, it can also add the in-band tracks to that set and
thus expose all tracks uniformly through something like
video.audioTracks and video.videoTracks.
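A small sketch of what that uniform exposure could look like. The
TrackList class and buildAudioTracks function are hypothetical names I
am using for illustration; the point is that tracks demuxed from the
main resource and tracks contributed by dependent elements end up in
the same list, so script never needs to know which is which.

```javascript
// Hypothetical sketch of approach (6): one list holds both in-band
// and external audio tracks. TrackList and buildAudioTracks are
// illustrative, not spec names.

class TrackList {
  constructor() { this.tracks = []; }
  get length() { return this.tracks.length; }
  item(i) { return this.tracks[i]; }
  add(track) { this.tracks.push(track); }
}

function buildAudioTracks(inBandTracks, externalTracks) {
  const list = new TrackList();
  // Tracks demuxed by the browser from the main resource...
  for (const t of inBandTracks) list.add(t);
  // ...and tracks from dependent media elements go into the
  // same list, giving script a single uniform view.
  for (const t of externalTracks) list.add(t);
  return list;
}

const audioTracks = buildAudioTracks(
  [{ label: "English", language: "en" }],  // in-band audio track
  [{ label: "Audio description", language: "en" }]  // external element
);

console.log(audioTracks.length);           // 2
console.log(audioTracks.item(1).label);    // "Audio description"
```

Script that iterates video.audioTracks would then behave identically
whether the second track came from the multiplexed file or from a
separate <audio> element.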

Cheers,
Silvia.

Received on Thursday, 10 February 2011 23:09:26 UTC