- From: Bernard Aboba <Bernard.Aboba@microsoft.com>
- Date: Thu, 20 Aug 2015 14:54:01 +0000
- To: Harald Alvestrand <harald@alvestrand.no>
- CC: "public-webrtc@w3.org" <public-webrtc@w3.org>
- Message-ID: <C6B96C46-A408-4B60-9FCD-633DD2B0C660@microsoft.com>
In terms of media, this approach produces simulcast bitstreams differentiated by PT, using the existing API surface. Not heretical, because the same observation has been made on the MMUSIC WG list. The only missing aspect for on-the-wire signaling is to indicate that the streams are simulcasts of the same source as opposed to being different sources - but that is not an API problem.

On Aug 20, 2015, at 03:08, Harald Alvestrand <harald@alvestrand.no> wrote:

Perhaps-heretical thought: Do we need this exposed at the API surface at all?

When an MCU makes an offer to the browser, the browser can look at the offer and at its own capabilities and just respond with the maximum number of streams it's willing to support.

When the browser makes an offer (which may or may not go to the MCU), would it really hurt anything (apart from wasting a few PT numbers) if the browser included as many "send-only imageattr" lines as it was willing to support?

This way, we could get the effects of Adam's proposal without changing a single bit in the API surface, which means that we don't have to reach any agreement on whether it's sufficient or not.

As a minimum change to the API surface, we can't get smaller than no change.....

On 08/14/2015 01:03 AM, Adam Roach wrote:

I've been involved in a number of recent conversations around simulcast for WebRTC, and several implementors have indicated that it's an important feature for the initial release of WebRTC.

As I understand the state of play:

* Chrome has a form of simulcasting implemented using undocumented SDP mangling
* Firefox has no simulcasting implemented, but will soon
* The WebRTC 1.0 API has no simulcast-related controls whatsoever
* The IETF MMUSIC working group is nearing completion on a document (draft-ietf-mmusic-sdp-simulcast-01) that allows negotiation of simulcast in SDP

I also understand and sympathize with the goal to stop adding any non-trivial modifications to the existing WebRTC spec, so that we can finally publish an initial version of the document.

At the same time, the vast majority of the use cases that make sense for simulcast involve browsers talking to an MCU (or similar server), sending multiple encodings per track in the browser-to-MCU direction, but receiving only one encoding per track in the MCU-to-browser direction. This is interesting, because it means that we don't really require any controls that indicate the desire for a browser to receive simulcast -- all we need is the ability to indicate a willingness to send it. At the same time, the MCU will know what resolutions (and other variations) it wants to receive, and can inform the browser of this information via SDP.

Based on the foregoing, then, I propose that we instead add a trivial control to the existing RTCRtpSender objects. My strawman proposal would be something like:

________________________________

    partial interface RTCRtpSender {
        attribute unsigned short maxSimulcastCount;
    };

maxSimulcastCount of type unsigned short
    This attribute controls the number of simulcast streams that will be offered for the specific RTCRtpSender. The actual number of streams used for this sender will depend on the answer that is passed to setRemoteDescription.

________________________________

Here's how that would work (I'm going to use simulcast with two encodings for my examples, but extrapolating use for more streams than that should be obvious).

If the browser is the entity creating the offer, the script driving its side of stuff would (for any streams it wants to support simulcast) set:

    rtpSender.maxSimulcastCount = 2;

The SDP that it gets from a subsequent createOffer would include two simulcast PTs. Both would have identical imageattrs, indicating the range of encodings supported for simulcast. Only one would be supported for recv (this is just the resulting m-line):

    m=video 49300 RTP/AVP 97 98
    a=rtpmap:97 H264/90000
    a=rtpmap:98 H264/90000
    a=fmtp:97 profile-level-id=42c01f; max-fs=3600; max-mbps=108000
    a=fmtp:98 profile-level-id=42c00b; max-fs=3600; max-mbps=108000
    a=imageattr:97 send [x=[128:16:1280],y=[72:9:720]] recv [x=[128:16:1280],y=[72:9:720]]
    a=imageattr:98 send [x=[128:16:1280],y=[72:9:720]]
    a=simulcast send 97;98 recv 97

The MCU would then communicate actual desired resolutions using imageattr "recv" in its answer:

    m=video 49674 RTP/AVP 97 98
    a=rtpmap:97 H264/90000
    a=rtpmap:98 H264/90000
    a=fmtp:97 profile-level-id=42c01f; max-fs=3600; max-mbps=108000
    a=fmtp:98 profile-level-id=42c00b; max-fs=240; max-mbps=3600
    a=imageattr:97 send [x=[320:16:1280],y=[180:9:720]] recv [x=1280,y=720]
    a=imageattr:98 recv [x=320,y=180]
    a=simulcast recv 97;98 send 97

________________________________

Conversely, if the MCU were creating the offer, it would include the simulcast resolutions in the offer:

    m=video 49674 RTP/AVP 97 98
    a=rtpmap:97 H264/90000
    a=rtpmap:98 H264/90000
    a=fmtp:97 profile-level-id=42c01f; max-fs=3600; max-mbps=108000
    a=fmtp:98 profile-level-id=42c00b; max-fs=240; max-mbps=3600
    a=imageattr:97 send [x=[320:16:1280],y=[180:9:720]] recv [x=1280,y=720]
    a=imageattr:98 recv [x=320,y=180]
    a=simulcast recv 97;98 send 97

When the receiving JavaScript calls setRemoteDescription, the maxSimulcastCount on the corresponding sender(s) would be automatically updated according to the number of encodings indicated for each video m-line.
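
To make the counting step concrete, here is a minimal sketch of how the number of requested encodings (and their fixed resolutions) could be read out of the MCU's m-section. It assumes the "a=simulcast" and "a=imageattr" syntax exactly as written in the examples in this thread. The helper names (parseSimulcastLine, parseRecvResolution, requestedEncodings) are hypothetical and are not part of any WebRTC API; a browser would do this internally.

```javascript
// Hypothetical helpers for illustration only; not part of any WebRTC API.

// Parse "a=simulcast send 97;98 recv 97" (syntax as used in this thread)
// into { send: [97, 98], recv: [97] }. Returns null for any other line.
function parseSimulcastLine(line) {
  const prefix = "a=simulcast ";
  if (!line.startsWith(prefix)) return null;
  const tokens = line.slice(prefix.length).trim().split(/\s+/);
  const result = { send: [], recv: [] };
  // Tokens come in pairs: a direction, then a ';'-separated PT list.
  for (let i = 0; i + 1 < tokens.length; i += 2) {
    const direction = tokens[i];
    if (direction === "send" || direction === "recv") {
      result[direction] = tokens[i + 1].split(";").map(Number);
    }
  }
  return result;
}

// Extract a fixed "recv [x=W,y=H]" resolution from an imageattr line.
// Ranges such as x=[320:16:1280] are deliberately not handled in this sketch.
function parseRecvResolution(line) {
  const m = /^a=imageattr:(\d+)\s.*\brecv \[x=(\d+),y=(\d+)\]/.exec(line);
  return m
    ? { pt: Number(m[1]), width: Number(m[2]), height: Number(m[3]) }
    : null;
}

// From the remote (MCU) m-section, list the encodings the local sender is
// being asked to produce: the remote "recv" PTs plus any fixed resolution.
function requestedEncodings(remoteLines) {
  let recvPts = [];
  const sizes = new Map();
  for (const line of remoteLines) {
    const sim = parseSimulcastLine(line);
    if (sim) recvPts = sim.recv;
    const res = parseRecvResolution(line);
    if (res) sizes.set(res.pt, res);
  }
  return recvPts.map((pt) => sizes.get(pt) || { pt });
}

// The imageattr and simulcast lines from the MCU offer above.
const mcuOffer = [
  "m=video 49674 RTP/AVP 97 98",
  "a=imageattr:97 send [x=[320:16:1280],y=[180:9:720]] recv [x=1280,y=720]",
  "a=imageattr:98 recv [x=320,y=180]",
  "a=simulcast recv 97;98 send 97",
];
const encodings = requestedEncodings(mcuOffer);
console.log(encodings.length, encodings);
```

For the MCU offer above, the remote "recv" list names two PTs, so the resulting list has two entries, which is the value maxSimulcastCount would take for that sender under the proposal.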

And, of course, the answer created by createAnswer would similarly contain simulcast information matching the number of desired encodings from the offer:

    m=video 49300 RTP/AVP 97 98
    a=rtpmap:97 H264/90000
    a=rtpmap:98 H264/90000
    a=fmtp:97 profile-level-id=42c01f; max-fs=3600; max-mbps=108000
    a=fmtp:98 profile-level-id=42c00b; max-fs=3600; max-mbps=108000
    a=imageattr:97 send [x=1280,y=720] recv [x=[320:16:1280],y=[180:9:720]]
    a=imageattr:98 send [x=320,y=180]
    a=simulcast send 97;98 recv 97

________________________________

I think this satisfies a broad range of simulcast use cases with very little impact on the 1.0 API. I'll also note that this is intended to be a first-pass of simulcast implementation; if we find that other use cases arise that would benefit from more granular controls, we could easily add them in post-1.0 systems in a way that I believe could easily be backwards compatible with the scheme I describe above.

--
Adam Roach
Principal Platform Engineer
abr@mozilla.com
+1 650 903 0800 x863

Received on Thursday, 20 August 2015 14:54:37 UTC