Re: Rationalizing new/start/end/mute/unmute/enabled/disabled from Harald Alvestrand on 2013-04-09 (public-media-capture@w3.org from April 2013)

From: Harald Alvestrand <harald@alvestrand.no>
Date: Tue, 09 Apr 2013 10:02:44 +0200
To: public-media-capture@w3.org
Message-ID: <5163CB24.6040805@alvestrand.no>

On 04/09/2013 12:16 AM, Robert O'Callahan wrote:
> On Tue, Apr 9, 2013 at 12:43 AM, Stefan Håkansson LK
> <stefan.lk.hakansson@ericsson.com> wrote:
>
>
>         All tracks that we can decode. So e.g. if you play a resource
>         with a
>         video track in an <audio> element and capture that to a
>         MediaStream, the
>         MediaStream contains the video track.
>
>
>     What if there are two video tracks? Only one of them is
>     selected/played naturally, but in principle both could be decoded.
>     (What I am saying is that we need to spec this up).
>
>
> Definitely. Yes, I think we should decode them both.
Not sure I get where this is coming from....

I see absolutely no reason to decode a video stream until we know where 
it's going.
The destination might be another PeerConnection with a compatibly 
negotiated codec, or a hardware device with special codec support, or a 
Recorder willing to store the bytes as previously encoded.

I think we should carefully *avoid* specifying exactly where and when 
decoding takes place. Only the observable result should be in the standard.

>
>
>
>             In principle I agree, being able to switch source of a
>             MediaStream(Track) would be a natural thing to have (and needed for
>             certain legacy interop cases).
>
>
>         We may not need to "switch the source of a MediaStreamTrack".
>         There are
>         a few ways to expose API to effectively switch audio sources.
>         One approach
>         would be to create a MediaStreamTrack from the output of a Web
>         Audio
>         AudioNode. Then Web Audio can be used to switch from one audio
>         source to
>         another. Web Audio already specs this:
>         https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#MediaStreamAudioDestinationNode
>         although no-one's implemented it yet AFAIK. It would be easy
>         for us to
>         implement.
>
>
>     That's right, I did not think about that possibility. What about
>     video?
>
>
> There is no comparable API for video on a standards track, but there 
> should be.
>
> MediaStream Processing defined a ProcessedMediaStream which would take 
> a set of incoming MediaStreams or MediaStreamTracks, mix the audio 
> tracks together with script-defined processing, composite the video 
> tracks together using a fixed compositing model, and give you the 
> output as a single audio and/or video track. It also offered 
> fine-grained scheduling of when inputs would be added/removed to the 
> compositing mix, and had the ability to pause incoming streams, and do 
> some timestamp-based synchronization. I think we should bring back 
> something like that. We can drop the scripted audio processing since 
> Web Audio covers that now.
>
> A simple initial stab at that API would be to define 
> ProcessedMediaStream as a subclass of MediaStream which takes an 
> additional dictionary argument for the track-array constructor:
>   Constructor (MediaStreamTrackArray tracks, 
> ProcessedMediaStreamConfiguration config)
> where ProcessedMediaStreamConfiguration would specify which kinds of 
> tracks should appear in the output, e.g. { video: true, audio: true }. 
> The audio track (if any) would be defined to be the mix of all zero or 
> more input audio tracks. The video track (if any) would be defined to 
> be the composition of zero or more input video tracks (defined to 
> stretch all video frames to the size of the largest video frame, or 
> something like that), in a defined order (e.g. the first track added 
> to the stream is at the bottom). Since most video tracks don't have an 
> alpha channel, that means the last video track added wins. (But we 
> should add the ability to make a VideoStreamTrack from an HTML canvas 
> so we can have real-time compositing of overlays onto video.)
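
For concreteness, here is how I read the composition rule proposed above, as a rough TypeScript sketch. The names InputTrack, ProcessedConfig, OutputPlan and planOutput are made up for illustration and exist in no spec or browser. "Largest video frame" is assumed to mean largest pixel area, which the proposal does not say. The sketch only plans which tracks would be mixed or visible; it does no media processing.

```typescript
// Hypothetical model of the proposed ProcessedMediaStream output rule.
// Nothing here is a real browser API; all names are illustrative.
interface InputTrack {
  id: string;
  kind: "audio" | "video";
  width?: number;     // video only
  height?: number;    // video only
  hasAlpha?: boolean; // video only; most video tracks are opaque
}

interface ProcessedConfig { audio?: boolean; video?: boolean; }

interface Size { width: number; height: number; }

interface OutputPlan {
  audioMix: string[] | null;    // ids mixed into the single audio track
  videoLayers: string[] | null; // ids in stacking order, bottom first
  videoSize: Size | null;       // size every frame is stretched to
  visibleLayers: string[];      // layers not hidden by an opaque layer above
}

function planOutput(tracks: InputTrack[], config: ProcessedConfig): OutputPlan {
  const audio = tracks.filter(t => t.kind === "audio");
  const video = tracks.filter(t => t.kind === "video");

  // Assumption: "largest" means largest pixel area.
  let videoSize: Size | null = null;
  for (const t of video) {
    const w = t.width ?? 0;
    const h = t.height ?? 0;
    if (videoSize === null || w * h > videoSize.width * videoSize.height) {
      videoSize = { width: w, height: h };
    }
  }

  // Every frame is stretched to full size, so the topmost opaque track
  // hides everything added before it ("the last video track added wins").
  let firstVisible = 0;
  for (let i = 0; i < video.length; i++) {
    if (!video[i].hasAlpha) firstVisible = i;
  }

  const wantVideo = config.video === true;
  return {
    audioMix: config.audio === true ? audio.map(t => t.id) : null,
    videoLayers: wantVideo ? video.map(t => t.id) : null,
    videoSize: wantVideo ? videoSize : null,
    visibleLayers: wantVideo ? video.slice(firstVisible).map(t => t.id) : [],
  };
}

const example = planOutput(
  [
    { id: "mic", kind: "audio" },
    { id: "camera", kind: "video", width: 640, height: 480 },
    { id: "screen", kind: "video", width: 1280, height: 720 },
    { id: "overlay", kind: "video", width: 320, height: 240, hasAlpha: true },
  ],
  { audio: true, video: true },
);
console.log(example);
```

In this example the opaque "screen" track completely hides "camera", and only an alpha-carrying track such as "overlay" can usefully sit on top of it.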

I don't want to go there at this time.

We seriously run the risk of defining a toolkit that is able to do a set 
of tasks that nobody wants done (because they're not satisfied with the 
result), placing a heavy burden on browser implementors for very 
marginal benefit.

We already support putting a video on a Canvas (in a somewhat cumbersome 
way, true). Should we focus on getting the video back out of the Canvas, 
and let people play with that toolkit before we start in on defining 
smaller toolkits that might not be able to do what people really want?
Received on Tuesday, 9 April 2013 08:03:14 UTC