Re: [media] progress on multitrack api - issue-152

Thanks for compiling this, Silvia, I'll do my best to point out where I  
disagree and why.

On Sun, 17 Apr 2011 16:05:13 +0200, Silvia Pfeiffer  
<silviapfeiffer1@gmail.com> wrote:

> Hi all,
>
> In the last media subgroup meeting we further discussed the different
> change proposals that we have for issue-152.
>
> A summary of all the submitted change proposals is at
> http://www.w3.org/WAI/PF/HTML/wiki/Media_Multitrack_Change_Proposals_Summary
> .
>
> We discussed that Proposal 4 can, with a few changes, provide for the
> requirements of in-band and externally composed multitrack resources.
> Proposal 4 introduces an interface for in-band audio and video tracks,
> and a Controller object to maintain the shared state between the
> individual media elements that together make up a composed multitrack
> resource.
>
> This email serves two purposes:
>
> Firstly it asks others on the accessibility task force whether there
> are any objections to going with proposal 4 (Philip?, Geoff?). The
> people present at the meeting agreed that they would be prepared to
> withdraw their change proposals in favor of proposal 4. This includes
> all proposals numbered 1 to 3 on the summary page.

I don't have any strong objections to what is already in the WHATWG spec,  
but then I haven't made a detailed review of implementability since the  
last round of changes.
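
For reference, this is roughly how I understand grouping to work in the
current WHATWG draft - treat it as a sketch of my reading, not as
gospel, since the controller assignment details may well differ:

  // Group a video and an audio description behind one controller.
  var controller = new MediaController();
  var video = document.getElementById('mainvideo');
  var audiodesc = document.getElementById('audiodesc');
  video.controller = controller;
  audiodesc.controller = controller;
  controller.play(); // both slaves should now start in sync

(The same grouping can also be done declaratively with the mediagroup
attribute, if I remember correctly.)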

> Secondly it summarizes the remaining issues that we would like
> addressed for proposal 4.
>
> The remaining issues are:
>
> (1) videoTracks should be MultipleTrackList, too:
>
> The current HTMLMediaElement has the following IDL to expose in-band
> media tracks:
>   readonly attribute MultipleTrackList audioTracks;
>   readonly attribute ExclusiveTrackList videoTracks;
>
> The objection is to the use of ExclusiveTrackList on videoTracks. It
> should be allowed to have multiple in-band video tracks activated at
> the same time. In particular it seems that MP4 files have a means of
> specifying how multiple video tracks should be displayed on screen, and
> Safari is already able to display them.
>
> In contrast, proposal 4 requires that only one in-band video track can
> be active and displayed into the video viewport at one time. If more
> than one video track is to be displayed, it needs to be specified with
> a media fragment URI in a separate video element and connected through
> a controller.
>
> Some questions here are: what do other browsers want to do with
> multiple in-band video tracks? Does it make sense to restrict the
> display to a single video track? Or should it be left to the browser
> what to do - in which case a MultipleTrackList approach to videoTracks
> would be sensible? If MultipleTrackList is sensible for audio and
> video, maybe it could further be harmonized with TextTrack.

I think that videoTracks should be an ExclusiveTrackList, to avoid having  
to invent a layout system for displaying multiple video streams in a  
single <video>. Even if WebM supported positioning video tracks similarly  
to what MPEG-4 presumably allows, that layout would likely not be good  
enough in most cases anyway: you couldn't do fancy borders, and likely  
couldn't do things like moving the videos around by mouse dragging in  
JavaScript.
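
For the record, here is roughly what I would expect authors to do
instead under proposal 4 - two slaved <video> elements, each selecting
a single in-band track with a track media fragment (the track names
below are made up for illustration):

  // Render two in-band video tracks side by side; layout, borders and
  // dragging then become ordinary CSS/script matters.
  var main = document.createElement('video');
  main.src = 'movie.mp4#track=main';
  var sign = document.createElement('video');
  sign.src = 'movie.mp4#track=sign';
  main.controller = sign.controller = new MediaController();
  document.body.appendChild(main);
  document.body.appendChild(sign);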

> (2) interface on TrackList:
>
> The current interface of TrackList is:
>   readonly attribute unsigned long length;
>   DOMString getName(in unsigned long index);
>   DOMString getLanguage(in unsigned long index);
>            attribute Function onchange;
>
> The proposal is that, in addition to exposing name and language
> attributes, it should - in analogy to TextTrack - also expose a label
> and a kind.
>
> The label is necessary to include the track into menus for track
> activation/deactivation.
> The kind is necessary to classify the track correctly in menus, e.g.
> as sign language, audio description, or even a transparent caption
> track.

Maybe the spec changed since you wrote this, because currently it has  
getLabel and getLanguage.

What would the kind reflect? There's currently no attribute in the DOM  
for out-of-band tracks that this could map to.

Apart from this, I think that TrackList really should be a list of Track  
objects or similar, so that getLabel/getLanguage are pushed onto that  
object as .label and .language, just like on TextTrack.
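
In other words, something along these lines, with the attribute names
borrowed from TextTrack (a sketch only, not a worked-out proposal, and
addTrackMenuItem is an imaginary page-side helper):

  // Hypothetical: tracks as objects rather than index getters, which
  // makes building an activation menu straightforward.
  for (var i = 0; i < video.audioTracks.length; i++) {
    var track = video.audioTracks[i];
    addTrackMenuItem(track.label || track.language, track);
  }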

> (3) looping should be possible on combined multitrack:
>
> In proposal 4 the loop attribute on individual media elements is
> disabled on multitrack created through a controller, because it is not
> clear what looping means for the individual element.
>
> However, looping on a multitrack resource with in-band tracks is well
> defined and goes over the complete resource.
>
> In analogy, it makes sense to interpret loop on a combined multitrack
> resource in the same way. Thus, the controller should also have a
> loop attribute which is activated when a single loop attribute on a
> slave media element is activated, and the effect should be to loop over
> the combined resource, i.e. when the duration of the controller is
> reached, all slave media elements' currentTime values are reset to
> initialPlaybackPosition.

I'd strongly prefer not to have looping multitrack unless there are strong  
use cases. It would be easy to add back in the future should we find that  
not implementing it was a mistake.
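
It also seems simple enough to approximate in script, assuming that
slaves still fire ended when they have a controller (which may itself
need clarification):

  // Script-level looping over a group of slaves, reusing the
  // controller from the sketch near the top of this mail.
  var slaves = [video, audiodesc];
  function maybeRestart() {
    for (var i = 0; i < slaves.length; i++) {
      if (!slaves[i].ended)
        return; // not everything has finished yet
    }
    for (var j = 0; j < slaves.length; j++) {
      slaves[j].currentTime = 0;
    }
    controller.play(); // assuming this resumes the whole group
  }
  for (var k = 0; k < slaves.length; k++) {
    slaves[k].addEventListener('ended', maybeRestart, false);
  }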

> (4) autoplay should be possible on combined multitrack:
>
> Similar to looping, autoplay could also be defined on a combined
> multitrack resource as the union of all the autoplay settings of all
> the slaves: if one of them is on autoplay, the whole combined resource
> is.

I have no strong opinion, but we should have consistency such that  
changing the paused attribute (e.g. by calling play()) has the exact same  
effect. It's not clear to me what the spec thinks should happen when  
play() is called on a media element with a controller.
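
FWIW, if autoplay is left out, the union behavior is also trivial to
get in script (reusing the slaves array from above, and again assuming
controller.play() acts on the whole group):

  // Start the group if any slave asked for autoplay.
  for (var i = 0; i < slaves.length; i++) {
    if (slaves[i].autoplay) {
      controller.play();
      break;
    }
  }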

> (5) more events should be available for combined multitrack:
>
> The following events should be available in the controller:
>
> * onloadedmetadata: is raised when all slave media elements have
> reached at minimum a readyState of HAVE_METADATA
>
> * onloadeddata: is raised when all slave media elements have reached
> at minimum a readyState of HAVE_CURRENT_DATA
>
> * canplaythrough: is raised when all slave media elements have reached
> at minimum a readyState of HAVE_FUTURE_DATA

This should be HAVE_ENOUGH_DATA. (HAVE_FUTURE_DATA is a useless state that  
indicates that two frames (including the current frame) are available. The  
associated event, canplay, doesn't mean that one can actually play in any  
meaningful sense.)

> * onended: is raised when all slave media elements are in ended state
>
> or said differently: these events are raised when the last slave in a
> group reaches that state.
>
> These are convenience events that will for example help write combined
> transport bars. It is easier to attach just a single event handler to
> the controller than to attach one to each individual slave and make
> sure they all fire. Also, they help to maintain the logic of when a
> combined resource is loaded. Since these are very commonly used
> events, their introduction makes sense.
>
> Alternatively or in addition, readyState could be added to the  
> controller.

I would like to see more discussion of the actual use cases. When would it  
be useful to know that all slave media elements have reached  
HAVE_CURRENT_DATA, for example? For HAVE_ENOUGH_DATA, the main use case  
seems like it would be to autoplay, but you also suggested adding an  
explicit autoplay.

My reluctance comes from not-so-happy experience with readyState and the  
related events on HTMLMediaElement. Most of the states and events are  
borderline useless and it's also necessary to lie about them in order to  
avoid exposing race conditions to scripts. Without very compelling use  
cases, I'd prefer pushing the burden onto scripts until we see what this  
is actually going to be used for.
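
To illustrate, a group-level canplaythrough can already be approximated
in script, again assuming that slaves keep firing their normal events
when they have a controller:

  // Run the group-ready logic once every slave has reached
  // HAVE_ENOUGH_DATA at least once.
  function checkGroupReady() {
    for (var i = 0; i < slaves.length; i++) {
      if (slaves[i].readyState < slaves[i].HAVE_ENOUGH_DATA)
        return;
    }
    controller.play(); // or whatever "everything is ready" should do
  }
  for (var i = 0; i < slaves.length; i++) {
    slaves[i].addEventListener('canplaythrough', checkGroupReady, false);
  }
  checkGroupReady(); // in case everything is buffered already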

> (6) controls on slaves control the combined multitrack:
>
> Proposal 4 does not provide any information on what happens with media
> elements when the @controls attribute is specified. Do the controls
> stay in sync with the controls of the other elements? Do they in fact
> represent combined state? Do they represent the state of the slave
> resource? What happens when the user interacts with them? Is the
> information on the interaction - in particular seeking, muting, volume
> change, play/pause change, rate change - handed on to the controller
> and do the others follow?

Right, this isn't clear to me either.

-- 
Philip Jägenstedt
Core Developer
Opera Software
