Issue 175: Simulcasting to RtpReceiver and switched stream rapid switches and clipping

Issues:
A single RtpReceiver can be configured to receive a simulcast stream (i.e. it potentially receives from multiple stream sources, but only one is received and rendered at a time by the RtpReceiver, which outputs a single Media Stream Track).

When an RtpReceiver is configured for simulcasting, certain configurations can lead to a rapid switch from one RtpSender's stream to another and then back, which can leave the receiver unsure how to demux the RTP packets properly.

Scenario A:
1. RtpReceiver is set to receive a simulcasting stream for SSRC 5 and SSRC 6 (kind might be audio or video)
2. RtpReceiver starts receiving from SSRC 5 and renders the stream.
3. RtpReceiver starts receiving from SSRC 6 (and not SSRC 5 as a switch has occurred) and renders that stream.
4. RtpReceiver starts receiving SSRC 5 again after a short period of time (due to a switch back to SSRC 5).

Scenario B:
1. RtpReceiver is set to receive a simulcasting stream for SSRC 5 and SSRC 6 (kind might be audio or video)
2. RtpReceiver starts receiving from SSRC 5 and renders the stream.
3. RtpReceiver starts receiving from SSRC 6 (and not SSRC 5 as a switch has occurred) and renders that stream.
4. RtpReceiver starts receiving SSRC 5 again after a short period because some backlogged network packets arrived later from SSRC 5.

The problem:
The difference between scenarios A and B is difficult to determine from an engine perspective. Are the packets arriving on the original SSRC 5 because of a switch back to 5, or because they were late to arrive? Also, if SSRC 5 still had some late packets in flight when the switch to SSRC 6 occurred, then those remaining SSRC 5 packets would get clipped (potentially cutting the end of an audio stream slightly short).
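To make the ambiguity concrete, here is a minimal sketch in Python. The Packet type, its arrival_ms field, and the ssrc_runs helper are hypothetical names invented for this illustration and are not part of any API. It shows that, at the moment the first SSRC 5 packet reappears, the order in which SSRCs take over looks the same in both scenarios.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    ssrc: int
    arrival_ms: int  # local arrival time (hypothetical field for this sketch)


def ssrc_runs(packets):
    """Collapse a packet sequence into the order in which SSRCs take over."""
    runs = []
    for p in packets:
        if not runs or runs[-1] != p.ssrc:
            runs.append(p.ssrc)
    return runs


# Scenario A: a real switch back to SSRC 5 at t=80.
scenario_a = [Packet(5, 0), Packet(5, 20), Packet(6, 40),
              Packet(6, 60), Packet(5, 80), Packet(5, 100)]

# Scenario B: one backlogged SSRC 5 packet at t=80, then SSRC 6 continues.
scenario_b = [Packet(5, 0), Packet(5, 20), Packet(6, 40),
              Packet(6, 60), Packet(5, 80), Packet(6, 100)]

# What the receiver has seen when the packet at t=80 arrives.
seen_a = ssrc_runs(scenario_a[:5])
seen_b = ssrc_runs(scenario_b[:5])
print(seen_a == seen_b)
```

Looking only at which SSRC each packet carries, the receiver cannot tell the two cases apart until later packets arrive, by which time it has already had to decide what to render.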

Possible solutions:
(a) Have a set of timing rules that can be applied to determine scenario A versus B to resolve the ambiguity.
(b) Render in two (or more) hidden RtpReceivers, each outputting its own track, where the simulcast RtpReceiver outputs the combined audio (for audio) or the active video (for video).
(c) Do not allow simulcasting and require separate RtpReceivers whose Media Stream Tracks indicate their activity (active/inactive) state, allowing an application to switch between the streams (and to send all audio to render so that it doesn't matter which stream is output).
(d) Do a simple method of "last packet wins" and watch the jitter happen :)
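
As a rough illustration of (d), the sketch below applies "last packet wins" to a list of arriving SSRCs. The function name and the sample arrival order are assumptions made up for this example. It simply records every point at which the rendered source would change.

```python
def last_packet_wins(ssrcs):
    """Return (index, ssrc) for each point where the rendered source changes."""
    active = None
    switches = []
    for i, ssrc in enumerate(ssrcs):
        if ssrc != active:
            switches.append((i, ssrc))
            active = ssrc
    return switches


# SSRC 5, a switch to SSRC 6, then a single backlogged SSRC 5 packet.
arrivals = [5, 5, 6, 6, 5, 6, 6]
switches = last_packet_wins(arrivals)
print(switches)
```

Here a single late SSRC 5 packet causes two extra source changes (to 5 and straight back to 6), which is the kind of flapping that makes (d) unattractive.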

Personally, I think (c) must always be available to programmers who prefer to switch between active Media Stream Tracks manually in the application layer. Because that option exists regardless, I don't see dropping support for simulcast in the RtpReceiver as advantageous, which eliminates (c) as the "solution" for me. I assume that Media Stream Track can already signal an "active / inactive" state to indicate when a stream is actively receiving or has become inactive [sending party has stopped transmitting], but I have not verified that such an event actually exists.

I don't like (d) because I think the user experience will be bad.

I think (b) is a lot of additional work for a marginal improvement in rendering, and (c) could already be used if an application programmer cared about those marginal improvements.

That leaves (a) for me. The question is what set of rules / timings would need to be used?
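
One possible shape for such a rule, offered purely as a sketch and not as a proposal for the actual rule set: compare how far the RTP timestamp has advanced on the old SSRC with how much local time has passed since that SSRC was last heard. If media time lags wall-clock time by more than some tolerance, the packet was probably sent before the switch and is late; otherwise it looks like a switch back. The StreamState type, the classify_return function, the 90000 Hz clock rate, and the 200 ms tolerance are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class StreamState:
    last_rtp_ts: int      # RTP timestamp of the newest packet seen on this SSRC
    last_arrival_ms: int  # local clock (ms) when that packet arrived


def classify_return(state, rtp_ts, arrival_ms, clock_rate=90000, tolerance_ms=200):
    """Classify a packet that reappears on the previously active SSRC.

    Returns "late" or "switch_back". Heuristic only; thresholds are assumed.
    """
    # Signed 32-bit difference so RTP timestamp wraparound is handled.
    diff = (rtp_ts - state.last_rtp_ts) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 0x100000000
    media_ms = diff * 1000 / clock_rate
    elapsed_ms = arrival_ms - state.last_arrival_ms
    # Media time far behind wall-clock time suggests a backlogged packet.
    if elapsed_ms - media_ms > tolerance_ms:
        return "late"
    return "switch_back"


ssrc5 = StreamState(last_rtp_ts=90000, last_arrival_ms=1000)

# 500 ms later, a packet only ~33 ms newer in media time: likely backlog.
print(classify_return(ssrc5, rtp_ts=93000, arrival_ms=1500))

# 500 ms later, a packet 500 ms newer in media time: likely a switch back.
print(classify_return(ssrc5, rtp_ts=135000, arrival_ms=1500))
```

This relies on the sender's RTP timestamps continuing to advance while its stream is not being forwarded, and it will misclassify when the backlog is shorter than the tolerance, so any real rule would need tuning per media kind and network conditions.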


-Robin

Received on Saturday, 7 February 2015 14:45:32 UTC