Re: Coordination between MediaStream and MediaSource

On Fri, Jun 6, 2014 at 1:38 PM, Ian Hickson wrote:

> Hi,
> The MediaStream interface is used to represent streams of media data,
> typically (but not necessarily) of audio and/or video content.
> The MediaSource object represents a source of media data for an
> HTMLMediaElement.
> While these sources have quite different properties -- one is populated by
> the user agent from a user-defined source, e.g. a local camera, a stream
> from a remote computer, or a static video file; the other is populated by
> the script itself, providing data directly in the form of binary data --
> they nonetheless share a lot in common:
>  - they are both sources for use by <video> elements
>  - they both provide APIs for the management of tracks, merging them into
>    a single object for presentation
> From the HTML spec's perspective, my intent is to treat MediaStream and
> MediaSource objects identically, allowing both where either is allowed.

I don't think you should need to do anything significantly different for
these 2 objects. There just need to be well-defined hooks where different
"sourcing objects" can affect HTMLMediaElement state. In the case of MSE, I
just tried to specify the step in each algorithm where I needed to insert
behavior.

In the case of MediaStream, it just locks down which HTMLMediaElement
attributes are allowed. I think of it as essentially turning the
HTMLMediaElement into a read-only canvas with a volume control. It just
displays and plays the "current" contents of the stream and doesn't allow
seeking or anything other than play/pause.

> However, I would like to encourage the working groups to coordinate their
> efforts so that these APIs are intuitive to authors, even in situations
> where the author uses both. For example, it seems like it would make sense
> to allow an audio source from a local microphone to be merged with video
> data from an ArrayBuffer, for output in a single <video> element. Or for
> WebRTC to take data generated from mixing ArrayBuffers and send it to a
> remote host as a stream.

Both situations would be possible if HTMLMediaElement was able to create a
MediaStream for its output.

                       local mic -> MediaStream --+ ---> HTMLMediaElement
                                                  ^ addTrack()
MediaSource -> HTMLMediaElement --> MediaStream --+

MediaSource -> HTMLMediaElement --> MediaStream -> WebRTC
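
A rough sketch of the addTrack() step in the first diagram, for
illustration only. It assumes the HTMLMediaElement already exposes some
output MediaStream, which does not exist today. StreamLike, TrackLike,
FakeStream and mergeVideoInto are hypothetical names; they only mirror the
getAudioTracks()/getVideoTracks()/addTrack() shape of MediaStream so the
sketch runs outside a browser.

```typescript
// Hypothetical stand-ins that mirror a small part of MediaStream's shape.
interface TrackLike {
  kind: "audio" | "video";
  id: string;
}

interface StreamLike {
  getAudioTracks(): TrackLike[];
  getVideoTracks(): TrackLike[];
  addTrack(track: TrackLike): void;
}

// Adds every video track of the element's (assumed) output stream to the
// microphone stream, so one stream carries both. Returns how many tracks
// were offered to addTrack().
function mergeVideoInto(mic: StreamLike, elementOutput: StreamLike): number {
  let added = 0;
  for (const track of elementOutput.getVideoTracks()) {
    mic.addTrack(track);
    added++;
  }
  return added;
}

// In-memory fake, only so the sketch can be exercised without a browser.
class FakeStream implements StreamLike {
  private tracks: TrackLike[];

  constructor(tracks: TrackLike[]) {
    this.tracks = tracks.slice();
  }

  getAudioTracks(): TrackLike[] {
    return this.tracks.filter((t) => t.kind === "audio");
  }

  getVideoTracks(): TrackLike[] {
    return this.tracks.filter((t) => t.kind === "video");
  }

  addTrack(track: TrackLike): void {
    // Adding a track that is already present is a no-op.
    if (this.tracks.indexOf(track) === -1) {
      this.tracks.push(track);
    }
  }
}
```

In a browser the merged stream would then be handed to the second
HTMLMediaElement; how the element would expose its output stream is the
open question here.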

I believe it would be enough for the HTMLMediaElement to simply create a
MediaStream that contains a MediaStreamTrack for video if
videoTracks.length > 0, plus MediaStreamTracks for all the enabled tracks
in audioTracks.
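
A minimal sketch of that rule, not spec text. The names here
(planOutputTracks, OutputTrack, and the simplified track shapes) are
hypothetical, and the sketch only decides which tracks the output stream
would get; it does not touch real DOM objects.

```typescript
// Simplified, hypothetical shapes; not the real AudioTrack/VideoTrack.
interface AudioTrackLike {
  id: string;
  enabled: boolean;
}

interface VideoTrackLike {
  id: string;
}

// Describes one MediaStreamTrack the element would create for its output.
interface OutputTrack {
  kind: "audio" | "video";
  sourceId?: string; // id of the audio track it mirrors, when applicable
}

function planOutputTracks(
  videoTracks: VideoTrackLike[],
  audioTracks: AudioTrackLike[]
): OutputTrack[] {
  const plan: OutputTrack[] = [];

  // One video track if the element has any video at all.
  if (videoTracks.length > 0) {
    plan.push({ kind: "video" });
  }

  // One audio track per enabled entry in audioTracks.
  for (const track of audioTracks) {
    if (track.enabled) {
      plan.push({ kind: "audio", sourceId: track.id });
    }
  }

  return plan;
}
```

For example, an element with two video tracks and two audio tracks, only
one of them enabled, would yield one video and one audio track.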

I believe something like this would be the simplest way to bind the random
access capability of MediaSource and normal ".src=" media playback with the
inherently linear playback of MediaStreams.

> Please let me know if there's anything I can do to help with this.

It would be nice if there were better hooks for algorithms that need to
plug into HTMLMediaElement behavior. You can see several workarounds in
the MediaSource algorithms and HTMLMediaElement Extensions, where I have to
hook into various HTMLMediaElement behaviors or trigger logic based on
specific state transitions. I don't have specific suggestions at the
moment, but it is a point of brittleness in the MSE spec because I am
referencing steps by number or by text copied out of the spec.


> Cheers,
> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
>       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Friday, 6 June 2014 22:03:16 UTC