- From: Erik Lagerway <erik@hookflash.com>
- Date: Thu, 8 May 2014 09:34:57 -0700
- To: Emil Ivov <emcho@jitsi.org>
- Cc: Bernard Aboba <Bernard.Aboba@microsoft.com>, Robin Raymond <robin@hookflash.com>, "public-ortc@w3.org" <public-ortc@w3.org>
- Message-ID: <CAPF_GTb6ftVSGj=hHdng6C5H-zqinDV=2FRog10MDswgH-wzTA@mail.gmail.com>
+1

Erik Lagerway <http://ca.linkedin.com/in/lagerway> | Hookflash <http://hookflash.com/> | 1 (855) Hookflash ext. 2 | Twitter <http://twitter.com/elagerway> | WebRTC.is Blog <http://webrtc.is/>

On Thu, May 8, 2014 at 9:30 AM, Emil Ivov <emcho@jitsi.org> wrote:

> On 08.05.14, 17:10, Bernard Aboba wrote:
>
>> This proposal looks like it represents a big upgrade in SVC and
>> simulcast support:
>>
>> 1. Via the scale, frameRate, and quality attributes in
>> RTCRtpParameterVideoDetails (roughly corresponding to the previous
>> RTCRtpEncodingParameters), it appears possible to support all modes of
>> SVC, whereas only spatial scalability was supported previously.
>>
>> 2. Via RTCRtpParameterSimulcastDetails, it looks like combinations of
>> SVC and simulcast (e.g. Temporal scalability + spatial simulcast) can be
>> supported, which wasn't possible previously.
>>
>> 3. Via RTCRtpScalabilityType it appears possible to support combined SVC
>> modes (e.g. Temporal + Spatial + Quality).
>
> Agree with all of the above!
>
> I like that this new proposal gives layering and simulcast their own
> distinct knobs. Currently all this lives in the same place, so it's kind
> of confusing and would probably be tricky to implement in a reasonable way.
>
> So, while this new proposal looks more detailed and possibly a bit more
> complex, I think it would actually be easier to implement and use in a
> predictable way.
>
> Emil
>
>> On May 7, 2014, at 9:13 PM, "Robin Raymond" <robin@hookflash.com
>> <mailto:robin@hookflash.com>> wrote:
>>
>>> I am contributing a proposal on how to resolve an issue discovered in
>>> the usage of "parameters". While the details can always be tweaked, I
>>> think it successfully resolves much of the concern around the level
>>> of knowledge required to configure a "parameters" object for anything
>>> other than the basic use cases.
>>>
>>> In response to this posting:
>>> http://lists.w3.org/Archives/Public/public-ortc/2014May/0007.html
>>>
>>> This also addresses the issue of exchanging detailed parameters over
>>> the wire by instead basing parameters on capabilities.
>>>
>>> I am going to copy the entire proposal below to officially contribute
>>> it, but for the sake of readability I am also including links to the
>>> Google docs.
>>>
>>> Proposal-ORTC Sender / Receiver Capabilities Based Model
>>> https://docs.google.com/document/d/1htyRaNjXTE_O1GhD8TcLCNXFvVsgszpE8Lqgp3OCHlU/edit?usp=sharing
>>>
>>> Proposal-ORTC Sender / Receiver Use Case [Usage Comparison Analysis]
>>> https://docs.google.com/document/d/1hdhCHj-gpwv06vIbAftxMG3oZtz7A-nuYsuwQEkTat4/edit?usp=sharing
>>>
>>>
>>> Proposal-ORTC Sender / Receiver Capabilities Based Model
>>>
>>> Introduction
>>>
>>> After attempting to write out some use cases using the existing
>>> RTCRtpSender and RTCRtpReceiver objects and parameters for ORTC, some
>>> issues were discovered. Specifically, the application developer would
>>> need to have a fair amount of knowledge on exactly how to tweak low
>>> level parameters for anything beyond very simple use cases. For
>>> example, setting up SVC (Scalable Video Coding) would have required
>>> the application developer to know which codecs support SVC, how the
>>> layering is set up for particular codecs, and finally how to set up
>>> specific geometric (or temporal) attributes and layering relationship
>>> details.
>>>
>>> As a result of the lack of easy configuration of RTP features, the
>>> idea came out to give the application developer "preferences" where
>>> the developer could choose what they desire with high level knobs
>>> and dials and let the engine (which has explicit knowledge of each
>>> codec) configure the low level "parameters" details according to the
>>> developer's wishes. The engine could then return the closest set of
>>> preferences that could be achieved given the capabilities of the
>>> engine, and the developer can then choose whether or not to proceed
>>> with setting up media flows using these preferences and constructed
>>> parameters.
>>>
>>> Another important discovery was made in the process of defining
>>> "preferences". If two ORTC engines were given the same set of
>>> preferences and the capabilities of both sender and receiver, each
>>> engine could be made to construct "compatible" sender and receiver
>>> "parameters" details without ever exchanging the parameter details
>>> over the wire. This small realization about generating "parameters"
>>> from capabilities for local consumption by an engine has a huge
>>> impact. This generation removes the need for an engine to understand
>>> and filter settings that it may not understand, created by another
>>> engine of unknown origin, which may use proprietary and/or custom
>>> settings. A simple "ignore capabilities you don't understand" rule
>>> could replace the complex and cumbersome rules that would otherwise
>>> be required if "parameters" were to be sent over the wire and later
>>> filtered using a set of capabilities.
>>>
>>> Parameters can be generated based on the union of sender and receiver
>>> capabilities, with application developer preferences being used
>>> as a guideline on how to create the parameters. The engine will do
>>> its best to fulfill the preferences and it will return the parameters
>>> that are possible given the union of the capabilities.
>>>
>>> Two different engines must be able to compute compatible parameters
>>> given all the same preferences and capabilities. Fortunately, any two
>>> engines that understand the same capabilities can easily follow the
>>> same rules to generate compatible parameters. While the parameters
>>> created on the sender and receiver are required to be "compatible",
>>> they need not be identical. The application developer should call
>>> "createParameters(...)" on the sender to create parameters suitable
>>> for the sender, and "createParameters(...)" on the receiver to create
>>> parameters suitable for a receiver. The calculated "parameters" for
>>> both sender and receiver have to be compatible only to the extent that
>>> whatever a sender produces, a receiver must be capable of decoding.
>>>
>>> The application developer has the option to tweak the detailed
>>> parameters output by "createParameters(...)" but should only do so
>>> with extreme caution. The resultant parameters output by
>>> "createParameters(...)" are only meant for local consumption by the
>>> local sender / receiver "start" methods. Sending these created
>>> parameters over the wire is discouraged because implementations may
>>> produce objects which may not be entirely understandable by the remote
>>> party, even though the media sent on the wire will be compatible.
>>>
>>>
>>> Differences from Current Sender / Receiver API
>>>
>>> Both models and APIs are more similar than they are different, but the
>>> subtle differences have important implications for behaviour and usage.
>>>
>>> Both models send and receive based upon "parameter" settings. The
>>> difference is in how the "parameters" are generated. The new model
>>> generates the "parameters" based on an exchange of capabilities, and
>>> the application developer is given convenient 'knobs' called
>>> "preferences" to perform most common use cases. The "parameters" in
>>> the new model are intended for local consumption only, and the
>>> application developer is not required to marshal these "parameters"
>>> over the wire (and is actively discouraged from doing so). The new
>>> model proposes marshalling and exchanging "capabilities" and optionally
>>> "preferences" and then generating compatible "parameters" based on
>>> those exchanges.
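
The idea that two engines can derive compatible parameters from the same capabilities, without exchanging the parameters themselves, can be sketched as follows. This is only an illustration under stated assumptions, not the proposal's algorithm: the function name `createCompatibleCodecs`, the `{ codecs: [{ name, preferredPayloadType }] }` shape, and the rule of taking payload types from the receiver's capabilities are all hypothetical, and the "union" of capabilities is read here as the codecs both sides list.

```javascript
// Hypothetical sketch: derive the same codec list on both sides from the same
// inputs. The object shapes and the payload type rule are assumptions.
function createCompatibleCodecs(senderCaps, receiverCaps, prefs = {}) {
  // Index the receiver's codecs by name; codecs only one side knows about are
  // simply ignored ("ignore capabilities you don't understand").
  const receiverTypes = new Map(
    receiverCaps.codecs.map((c) => [c.name, c.preferredPayloadType])
  );

  // Keep the sender's order so every engine produces the same ordering.
  let common = senderCaps.codecs.filter((c) => receiverTypes.has(c.name));

  // Honor a forced codec name only when that codec is actually available.
  if (prefs.codecName && common.some((c) => c.name === prefs.codecName)) {
    common = common.filter((c) => c.name === prefs.codecName);
  }

  // Assumed rule: the receiver's preferred payload type is used on both sides.
  return common.map((c) => ({
    codecName: c.name,
    payloadType: receiverTypes.get(c.name),
  }));
}

const aliceSenderCaps = {
  codecs: [
    { name: "vp8", preferredPayloadType: 100 },
    { name: "x-proprietary", preferredPayloadType: 120 },
    { name: "h264", preferredPayloadType: 101 },
  ],
};
const bobReceiverCaps = {
  codecs: [
    { name: "h264", preferredPayloadType: 96 },
    { name: "vp8", preferredPayloadType: 97 },
  ],
};

// Alice and Bob each run the same computation locally on the same inputs.
const onAlice = createCompatibleCodecs(aliceSenderCaps, bobReceiverCaps);
const onBob = createCompatibleCodecs(aliceSenderCaps, bobReceiverCaps);
console.log(JSON.stringify(onAlice));
```

Because both sides apply the same deterministic rule to the same two capability sets, only the capabilities need to cross the wire; the proprietary codec that Bob does not list never appears in either result.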
>>>
>>> In both models, the application developer may choose to tweak low
>>> level parameters should specific compatibilities be required. But the
>>> "preferences" model allows most application developers to completely
>>> ignore the low level parameters.
>>>
>>>
>>> Advantages of the New Capabilities Model
>>>
>>> Overall, the proposed capabilities based API has strong advantages.
>>> The main advantages are:
>>>
>>> 1. Simplicity in setup based on "preferences" for the application
>>>    developer
>>>
>>> 2. Less brittle designs/implementations since low level parameters
>>>    are not exchanged, filtered, and interpreted by different browser
>>>    engines
>>>
>>> 3. Much less knowledge (and often no pre-knowledge) is required for
>>>    the application developer to take full advantage of a browser's
>>>    capabilities
>>>
>>>
>>> RTCRtpSender / RTCRtpReceiver
>>>
>>> interface RTCRtpSender {
>>>   // ...
>>>
>>>   static RTCRtpParameters createParameters(
>>>     MediaStreamTrack track,
>>>     Capabilities receiverCaps,
>>>     optional (RTCRtpAudioPreferences or
>>>               RTCRtpVideoPreferences or
>>>               RTCRtpSimulcastPreferences) prefs,
>>>     optional Capabilities senderCaps  // optional as system can obtain
>>>                                       // this information
>>>   );
>>>
>>>   void start(RTCRtpParameters params);
>>>
>>>   // ...
>>> };
>>>
>>> interface RTCRtpReceiver {
>>>   // ...
>>>
>>>   static RTCRtpParameters createParameters(
>>>     DOMString kind,
>>>     Capabilities senderCaps,
>>>     optional (RTCRtpAudioPreferences or
>>>               RTCRtpVideoPreferences or
>>>               RTCRtpSimulcastPreferences) prefs,
>>>     optional Capabilities receiverCaps  // optional as system can obtain
>>>                                         // this information
>>>   );
>>>
>>>   void start(RTCRtpParameters params);
>>>
>>>   // ...
>>> };
>>>
>>>
>>> RTCRtpMediaPreferences
>>>
>>> // This is the base dictionary used for both audio and video
>>> // preferences and represents the set of common preferences that are
>>> // available for both media types.
>>> dictionary RTCRtpMediaPreferences {
>>>   // If not specified, the system will choose a value. If specified, this
>>>   // receiverId will be applied to the primary SSRC "as is". If more than
>>>   // one SSRC is needed to encode the stream (e.g. FEC, RTX, MST,
>>>   // simulcast), where the meaning of the RTP packet with that
>>>   // alternative SSRC cannot be determined by the media flow itself, the
>>>   // alternative SSRCs will construct a receiverId value based upon this
>>>   // receiverId value.
>>>   DOMString receiverId;
>>>
>>>   // This is the primary SSRC to use. Should alternative SSRCs be
>>>   // required (e.g. FEC, RTX, MST, simulcast), all other SSRCs should be
>>>   // assigned sequentially starting from the chosen SSRC value.
>>>   unsigned int ssrc;
>>>
>>>   // For a sender, force the chosen codec to be the codec within the
>>>   // RTCRtpCapabilities with this name. If it is possible to choose this
>>>   // codec, the system will confirm by choosing this codec in the result
>>>   // from "createParameters(...)".
>>>   // This value has no meaning for a receiver since a receiver must be
>>>   // capable of receiving any of the compatible codecs within the union
>>>   // RTCRtpCapabilities.
>>>   // An unspecified value indicates the system will choose the preferred
>>>   // sending codec.
>>>   DOMString codecName;
>>>
>>>   // This value indicates the relative importance of the media being sent
>>>   // with a sender versus other media being sent. All sent media with
>>>   // the same priority will be treated as having an equal priority.
>>>   // Those with a greater value will be given a greater priority and those
>>>   // with a lower value will be given a lower priority. The value is
>>>   // relative, meaning a value of 2.0 should be given roughly 2 times the
>>>   // priority vs a 1.0 value and a value of 4.0 should be given roughly
>>>   // 4 times the priority vs a 1.0 value.
>>>   double relativePriority = 1.0;
>>>
>>>   // This value indicates the maximum bit rate the media is allowed to
>>>   // output as a combined whole (including all layers, FEC, RTX, etc).
>>>   // The system will filter out codecs that are not capable of delivering
>>>   // below this bit rate unless no codec is possible, in which case the
>>>   // system will choose the minimal codec bit rate possible and will
>>>   // override with a different maximum bit rate in the result of
>>>   // "createParameters(...)".
>>>   double maxBitrate;               // engine, keep under this rate
>>>
>>>   // These values indicate the preferred treatment of FEC/RTX for the RTP
>>>   // packets. Some audio codecs have built in FEC/RTX mechanisms, in
>>>   // which case, if the codec is capable, the codec should enable its
>>>   // FEC/RTX mode when the value is set to "all" for that codec rather
>>>   // than creating an additional RTP flow.
>>>   RTCRtpRecoveryOptions fec = "none";
>>>   RTCRtpRecoveryOptions rtx = "none";
>>> };
>>>
>>> enum RTCRtpRecoveryOptions {
>>>   "all",     // apply to all layers
>>>   "base",    // only apply for base (audio will treat "base" as
>>>              // equivalent to "all")
>>>   "none"     // do not apply to any layer
>>> };
>>>
>>>
>>> RTCRtpAudioPreferences
>>>
>>> dictionary RTCRtpAudioPreferences : RTCRtpMediaPreferences {
>>>   // If not 0, tells the engine to pick and configure codecs that are
>>>   // capable of the minimum number of channels (if possible).
>>>   // If not possible, the minimum number of channels will be returned in
>>>   // the result of "createParameters(...)".
>>>   unsigned int minChannels = 0;
>>>
>>>   // If not 0, tells the engine to pick and configure codecs which are
>>>   // capable of delivering the minimum Hz rate as indicated. If not
>>>   // possible, the minimum Hz rate will be returned in the result of
>>>   // "createParameters(...)".
>>>   unsigned int minHzRate = 0;
>>>
>>>   // The engine will choose and configure the codecs best able to deliver
>>>   // the level of fidelity requested.
>>>   RTCRtpAudioFidelity fidelity = "speech";
>>> };
>>>
>>> enum RTCRtpAudioFidelity {
>>>   "speech",  // speech only is expected so the Hz range only needs to
>>>              // support the vocal range
>>>   "music",   // music is expected, choose stereo compatible and
>>>              // minimal 32000 Hz
>>>   "movie"    // music / sound effects expected, choose surround and
>>>              // highest Hz available
>>> };
>>>
>>>
>>> RTCRtpVideoPreferences (and related)
>>>
>>> dictionary RTCRtpVideoPreferences : RTCRtpMediaPreferences {
>>>   // minFrameRate, minScale, and minQuality each indicate that the engine
>>>   // must make its best effort to keep the frame rate, scale or quality
>>>   // above a certain minimal level. When using SVC, these values will hint
>>>   // at the requirements typically needed for the base layer.
>>>
>>>   // minFrameRate is specified in frames per second.
>>>   double minFrameRate = 0;         // please engine, keep equal or above this rate
>>>
>>>   // minScale is a relative value from 0.0 to 1.0 where 1.0 means the full
>>>   // input stream width/height is requested and 0.0 means no minimum size
>>>   // is requested. The value of minScale is multiplied by the source video
>>>   // window width and height to calculate a minimal width and height that
>>>   // is relative to source size.
>>>   double minScale = 0;             // please engine, keep equal or above this scale
>>>
>>>   // Alternatively, a specific fixed minimal width and height can be
>>>   // requested.
>>>   double minWidth = 0;             // please engine, keep above X pixels wide
>>>   double minHeight = 0;            // please engine, keep above Y pixels high
>>>
>>>   // minQuality is a relative value from 0.0 to 1.0 where 1.0 means maximum
>>>   // output quality is requested for a given codec and 0.0 means any
>>>   // minimal codec quality output is deemed acceptable.
>>>   double minQuality = 0;           // please engine, keep equal or above this quality
>>>
>>>   // The engine needs values to help decide what to sacrifice when network
>>>   // conditions are not ideal. The frameRatePriority, scalePriority, and
>>>   // qualityPriority indicate the importance of each aspect of the video
>>>   // relative to the others (or 0.0, which means the video aspect has no
>>>   // significance, with the exception of the minimums above). The values
>>>   // are relative to each other, thus a value of 2.0 vs 1.0 has roughly
>>>   // 2 times the importance and a value of 4.0 vs 1.0 has roughly 4 times
>>>   // the importance (relatively speaking).
>>>   double frameRatePriority = 1.0;  // priority of frame rate
>>>   double scalePriority = 1.0;      // priority of scale
>>>   double qualityPriority = 1.0;    // priority of quality
>>>
>>>   // If a type of SVC layering is desired, the frameRateScalabilityOptions,
>>>   // scalingScalabilityOptions, and qualityScalabilityOptions should be set
>>>   // to a non-null value for each SVC type desired. The details of the
>>>   // RTCRtpScalabilityOptions dictionary will indicate the desired details
>>>   // for each individual SVC type requested.
>>>   //
>>>   // Default of null indicates no SVC of the specific type is requested.
>>>   RTCRtpScalabilityOptions? frameRateScalabilityOptions = null;
>>>   RTCRtpScalabilityOptions? scalingScalabilityOptions = null;
>>>   RTCRtpScalabilityOptions? qualityScalabilityOptions = null;
>>> };
>>>
>>> dictionary RTCRtpScalabilityOptions {
>>>   // If a value other than the default of null is specified, this
>>>   // indicates to the engine the precise number of layers desired (if
>>>   // possible for a given codec to deliver these layers). If null, the
>>>   // engine is free to choose the default layering statically or
>>>   // dynamically dependent upon the codec capabilities.
>>>   unsigned int? layers = null;
>>> };
>>>
>>>
>>> RTCRtpSimulcastPreferences
>>>
>>> dictionary RTCRtpSimulcastPreferences {
>>>   // This value indicates the maximum bit rate all media is allowed to
>>>   // output as a combined whole for all simulcast streams.
>>>   double? maxBitrate = null;       // engine, keep under this rate
>>>
>>>   sequence<RTCRtpVideoPreferences> simulcastStreams;
>>> };
>>>
>>>
>>> RTCRtpParameters
>>>
>>> // Typically this object is constructed by the RTCRtpSender for local
>>> // consumption by the RTCRtpSender and by the RTCRtpReceiver for local
>>> // consumption by a RTCRtpReceiver.
>>> // This is a "shotgun" object, meaning the developer is given the power
>>> // of a "shotgun" pointed at their feet and they can mess with this
>>> // object at their own peril should they need to modify it for unusual
>>> // compatibility reasons. Normal use cases should not require modifying
>>> // the values within this structure, and marshalling this structure for
>>> // remote consumption by another browser engine is highly discouraged.
>>> dictionary RTCRtpParameters {
>>>   // When returned as a result, the system will express the actual chosen
>>>   // preferences possible to best fulfill the preferences given the
>>>   // capabilities.
>>>   // In other words, the developer can't always get what they want; but
>>>   // if they try sometimes, they will get what they need.
>>>   (RTCRtpAudioPreferences or
>>>    RTCRtpVideoPreferences or
>>>    RTCRtpSimulcastPreferences) preferences;
>>>
>>>   // The capabilities of both sender and receiver [value "as is" when
>>>   // passed to "createParameters(...)"]
>>>   RTCRtpCapabilities senderCapabilities;
>>>   RTCRtpCapabilities receiverCapabilities;
>>>
>>>   // This value contains all the particularly low level details of how
>>>   // the engine will encode the media on the wire.
>>>   (RTCRtpParameterAudioDetails or
>>>    RTCRtpParameterVideoDetails or
>>>    RTCRtpParameterSimulcastDetails) details;
>>>
>>>   // The chosen RTP features based upon the union of the capabilities.
>>>   Settings rtpFeatures;
>>>
>>>   // The chosen RTP extensions and configurations based upon the union of
>>>   // the capabilities.
>>>   sequence<RTCRtpHeaderExtensionParameters>? headerExtensions = null;
>>> };
>>>
>>>
>>> RTCRtpParameterDetails
>>>
>>> // This is the base dictionary of common parameters needed for both
>>> // audio and video media types. Audio and video will each have their own
>>> // set of specific parameters depending upon the media type.
>>> dictionary RTCRtpParameterDetails {
>>>   DOMString receiverId = "";       // use this receiver ID for RTP stream ("" = N/A)
>>>   unsigned int? ssrc = null;       // using this SSRC for RTP stream
>>>
>>>   DOMString fecReceiverId = "";    // use this receiver ID for FEC RTP ("" = N/A)
>>>   unsigned int? fecSsrc = null;    // using this SSRC for FEC (null = N/A)
>>>   Settings fec;                    // modes of operation related to FEC
>>>
>>>   DOMString rtxReceiverId = "";    // use this receiver ID for RTX RTP ("" = N/A)
>>>   unsigned int? rtxSsrc = null;    // using this SSRC for RTX (null = N/A)
>>>   Settings rtx;                    // modes of operation related to RTX
>>>
>>>   // null for a sender. For a receiver, this must contain the source SSRC
>>>   // to use for RTCP Receiver Reports (RRs).
>>>   unsigned int? rtcpSsrc = null;
>>>
>>>   // If true, the engine will mux RTCP with RTP on the same
>>>   // RTCIceTransport. If false, the engine will send RTCP reports on the
>>>   // associated RTCP RTCIceTransport component.
>>>   boolean rtcpMux = true;
>>> };
>>>
>>>
>>> RTCRtpParameterAudioDetails (and related)
>>>
>>> dictionary RTCRtpParameterAudioDetails : RTCRtpParameterDetails {
>>>   // Contains a list of audio codec options, one per usable codec. The
>>>   // codecs are listed in preferred order.
>>>   sequence<RTCRtpParameterAudioCodecDetails> codecDetails;
>>> };
>>>
>>> dictionary RTCRtpParameterCodecDetails {
>>>   // The name of the codec as related to the codec name(s) contained
>>>   // within the codecs listed within the RTCRtpCapabilities dictionaries.
>>>   DOMString codecName;
>>>
>>>   unsigned byte payloadType;       // actual payload type sent on wire
>>>   Settings formatsParameters;      // detailed settings chosen for related codec
>>> };
>>>
>>> dictionary RTCRtpParameterAudioCodecDetails : RTCRtpParameterCodecDetails {
>>>   // nothing required at this time?
>>> };
>>>
>>>
>>> RTCRtpParameterVideoDetails (and related)
>>>
>>> dictionary RTCRtpParameterVideoDetails : RTCRtpParameterDetails {
>>>   double scale = 1.0;              // 0..1 relative scale from source
>>>   double frameRate = 1.0;          // 0..1 relative frame rate from source
>>>   double quality = 1.0;            // 0..1 relative quality from source
>>>
>>>   // Contains a list of video codec options, one per usable codec. The
>>>   // codecs are listed in preferred order.
>>>   sequence<RTCRtpParameterVideoCodecDetails> codecDetails;
>>> };
>>>
>>> dictionary RTCRtpParameterVideoCodecDetails : RTCRtpParameterCodecDetails {
>>>   // When layering is used, this value contains a sequence containing
>>>   // the layer information as needed for the related codec.
>>>   sequence<RTCRtpParameterVideoLayerDetails>? layers = null;
>>> };
>>>
>>> dictionary RTCRtpParameterVideoLayerDetails {
>>>   // Value is set if required for describing the dependency tree
>>>   // information for the codec's layers.
>>>   DOMString layerId = "";
>>>
>>>   // Value is null for the base layer or if dependencies do not need to
>>>   // be described (as may be the case for dynamic SVC codecs). If set,
>>>   // the value contains a list of layers this layer is dependent upon
>>>   // (thus allowing a dependency tree/graph to be created).
>>>   sequence<DOMString>? layerIdDependencies = null;
>>>
>>>   RTCRtpScalabilityType? layerScalabilityType = null; // null would be for base
>>>
>>>   DOMString receiverId = "";       // use this receiver ID in layer ("" = N/A)
>>>   unsigned int? ssrc = null;       // if layer uses its own SSRC (null = N/A)
>>>
>>>   double? frameRate = null;        // framerate for layer (for temporal SVC)
>>>   double? scale = null;            // scale applied to layer (for spatial SVC)
>>>   double? quality = null;          // quality applied to layer (for quality SVC)
>>>
>>>   DOMString fecReceiverId = "";    // receiver ID for FEC RTP ("" = N/A)
>>>   unsigned int? fecSsrc = null;    // using this SSRC for FEC (null = N/A)
>>>   Settings fec;                    // modes of operation related to FEC
>>>
>>>   DOMString rtxReceiverId = "";    // receiver ID for RTX RTP ("" = N/A)
>>>   unsigned int? rtxSsrc = null;    // using this SSRC for RTX (null = N/A)
>>>   Settings rtx;                    // modes of operation related to RTX
>>> };
>>>
>>> enum RTCRtpScalabilityType {
>>>   "temporal",
>>>   "spatial",
>>>   "quality"
>>> };
>>>
>>>
>>> RTCRtpParameterSimulcastDetails (and related)
>>>
>>> dictionary RTCRtpParameterSimulcastDetails {
>>>   // This sequence contains the details of each simulcast stream when
>>>   // simulcasting is used, or exactly 1 video stream's details when not
>>>   // simulcasting.
>>>   sequence<RTCRtpParameterVideoDetails>? simulcastStreams;
>>> };
>>>
>>>
>>> RTCRtpCodec Dictionary Tweak
>>>
>>> dictionary RTCRtpCodec {
>>>   DOMString name = "";
>>>
>>>   // Added to be able to pick the payload type based upon sender or
>>>   // receiver so they match when creating both the sender and receiver
>>>   // parameters.
>>>   unsigned byte preferredPayloadType;
>>>
>>>   unsigned int? clockRate = null;
>>>   unsigned int? numChannels = 1;
>>>   Capabilities formats;
>>> };
>>>
>>>
>>> Proposal-ORTC Sender / Receiver Use Case [Usage Comparison Analysis]
>>>
>>> Introduction
>>>
>>> After attempting to work through examples of code usage using the
>>> current ORTC sender/receiver API, some issues, concerns and
>>> deficiencies were discovered. A retuning of the current model was made
>>> to attempt to address those findings. The differences are illustrated
>>> below in code examples based on various use cases.
>>>
>>> In the first set of use cases, for simple application usages, there are
>>> no advantages to a capabilities model (aside from the reduction of
>>> complexity an engine might need to implement). As the use cases become
>>> more involved, advantages begin to show. In the final example, which
>>> illustrates using SVC, the clear advantage of capabilities and
>>> preferences can be demonstrated.
>>>
>>>
>>> Use Cases
>>>
>>> Alice wishes to send media to Bob
>>>
>>> Current Parameter Based API
>>>
>>> Step 1: (Alice)
>>>   var track = myObtainMediaTrack();
>>>   var senderCaps = RTCRtpSender.getCapabilities();
>>>   var senderParams = RTCRtpSender.createParameters(track, senderCaps);
>>>
>>>   mysignal(senderParams);
>>>
>>> Step 2: (Bob)
>>>   var senderParams = mysignal();
>>>
>>>   var receiverCaps = RTCRtpReceiver.getCapabilities();
>>>   var receiverParams = RTCRtpReceiver.filterParameters(senderParams,
>>>                                                        receiverCaps);
>>>
>>>   var receiver = new RTCRtpReceiver(...);
>>>   receiver.start(receiverParams);
>>>
>>>   mysignal(receiverParams);
>>>
>>> Step 3: (Alice)
>>>   var receiverParams = mysignal();
>>>
>>>   var senderParams = RTCRtpSender.filterParameters(receiverParams,
>>>                                                    senderCaps);
>>>
>>>   var sender = new RTCRtpSender(...);
>>>   sender.start(senderParams);
>>>
>>> Comments
>>>
>>> Because the sender (i.e. Alice) sent parameters that contained
>>> specific SSRC (and possibly receiver ID) information, the receiver
>>> will latch based upon exact SSRC matching.
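
The two latching behaviours that the use case comments contrast (exact SSRC matching when the receiver parameters carry an SSRC, payload type matching when they do not) can be sketched as below. This is a rough illustration only: the `findReceiver` function and the `{ id, ssrc, payloadTypes }` receiver shape are hypothetical and do not come from the ORTC API or from this proposal.

```javascript
// Hypothetical sketch of how an engine might route an incoming RTP packet.
// A receiver with a non-null ssrc only accepts that exact SSRC; a receiver
// with ssrc === null latches on payload type instead.
function findReceiver(receivers, packet) {
  // Prefer an exact SSRC match when a receiver was configured with one.
  const bySsrc = receivers.find(
    (r) => r.ssrc !== null && r.ssrc === packet.ssrc
  );
  if (bySsrc) {
    return bySsrc;
  }

  // Otherwise fall back to receivers without an SSRC, matched by payload type.
  const byPayloadType = receivers.find(
    (r) => r.ssrc === null && r.payloadTypes.includes(packet.payloadType)
  );
  return byPayloadType || null;
}

const receivers = [
  { id: "ssrc-latched", ssrc: 1111, payloadTypes: [100] },
  { id: "pt-latched", ssrc: null, payloadTypes: [100] },
];

const first = findReceiver(receivers, { ssrc: 1111, payloadType: 100 });
const second = findReceiver(receivers, { ssrc: 2222, payloadType: 100 });
const third = findReceiver(receivers, { ssrc: 2222, payloadType: 96 });
console.log(first && first.id, second && second.id, third);
```

Under these assumptions, a packet whose SSRC differs from the one a receiver was configured with is only picked up if some receiver was set up without an SSRC, which is why the parameter based examples have to blank out `ssrc` before latching by payload type.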
>>>
>>>
>>> Proposed Capabilities Based API
>>>
>>> Step 1: (Alice)
>>>   var senderCaps = RTCRtpSender.getCapabilities();
>>>
>>>   mysignal(senderCaps);
>>>
>>> Step 2: (Bob)
>>>   var senderCaps = mysignal();
>>>   var receiverParams = RTCRtpReceiver.createParameters("video",
>>>                                                        senderCaps);
>>>
>>>   var receiver = new RTCRtpReceiver(...);
>>>   receiver.start(receiverParams);
>>>
>>>   mysignal(receiverParams.receiverCapabilities);
>>>
>>> Step 3: (Alice)
>>>   var track = myObtainMediaTrack();
>>>
>>>   var receiverCaps = mysignal();
>>>   var senderParams = RTCRtpSender.createParameters(track, receiverCaps);
>>>
>>>   var sender = new RTCRtpSender(...);
>>>   sender.start(senderParams);
>>>
>>> Comments
>>>
>>> Receiver (Bob) can match an incoming stream because the payload types
>>> will match, and therefore the incoming stream will latch to the
>>> receiver based on payload type alone.
>>>
>>>
>>> Alice wishes to send media to Bob Using Unhandled Eventing
>>>
>>> Current Parameter Based API
>>>
>>> Step 1: (Alice)
>>>   var track = myObtainMediaTrack();
>>>   var senderCaps = RTCRtpSender.getCapabilities();
>>>   var senderParams = RTCRtpSender.createParameters(track, senderCaps);
>>>
>>>   mysignal(senderParams);
>>>
>>> Step 2: (Bob)
>>>   var senderParams = mysignal();
>>>
>>>   var receiverCaps = RTCRtpReceiver.getCapabilities();
>>>   var templateReceiverParams =
>>>       RTCRtpReceiver.filterParameters(senderParams, receiverCaps);
>>>   templateReceiverParams.encodings[0].receiverId = "";
>>>   templateReceiverParams.encodings[0].ssrc = null;
>>>
>>>   var listener = new RTCRtpListener(...);
>>>   listener.onunhandledrtp = function(event) {
>>>     var receiver = new RTCRtpReceiver(...);
>>>     receiver.start(templateReceiverParams);
>>>   };
>>>
>>>   mysignal(templateReceiverParams);
>>>
>>> Step 3: (Alice)
>>>   var receiverParams = mysignal();
>>>
>>>   var senderParams = RTCRtpSender.filterParameters(receiverParams,
>>>                                                    senderCaps);
>>>
>>>   var sender = new RTCRtpSender(...);
>>>   sender.start(senderParams);
>>>
>>> Comments
>>>
>>> Because the sender (i.e. Alice) sent parameters that contained
>>> specific SSRC (and possibly receiver ID) information, the receiver
>>> must override the template receiver params and remove the exact SSRC
>>> to attach the incoming stream by payload type.
>>>
>>>
>>> Proposed Capabilities Based API
>>>
>>> Step 1: (Alice)
>>>   var senderCaps = RTCRtpSender.getCapabilities();
>>>
>>>   mysignal(senderCaps);
>>>
>>> Step 2: (Bob)
>>>   var senderCaps = mysignal();
>>>
>>>   var listener = new RTCRtpListener(...);
>>>   listener.onunhandledrtp = function(event) {
>>>     var receiverParams = RTCRtpReceiver.createParameters("video",
>>>                                                          senderCaps);
>>>     var receiver = new RTCRtpReceiver(...);
>>>     receiver.start(receiverParams);
>>>   };
>>>
>>>   // receiverParams only exists inside the handler above, so the
>>>   // receiver capabilities are signalled directly.
>>>   mysignal(RTCRtpReceiver.getCapabilities());
>>>
>>> Step 3: (Alice)
>>>   var track = myObtainMediaTrack();
>>>
>>>   var receiverCaps = mysignal();
>>>   var senderParams = RTCRtpSender.createParameters(track, receiverCaps);
>>>
>>>   var sender = new RTCRtpSender(...);
>>>   sender.start(senderParams);
>>>
>>> Comments
>>>
>>> Receiver (Bob) can match an incoming stream because the payload types
>>> will match, and therefore the incoming stream will latch to the
>>> receiver based on payload type alone.
>>>
>>>
>>> Alice / Bob simultaneously exchange information in parallel
>>>
>>> To avoid requiring a sequential offer / answer exchange, Alice and Bob
>>> wish to simultaneously exchange their RTC information to receive
>>> media from the other party.
>>>
>>> Current Parameter Based API
>>>
>>> Step 1: (Alice / Bob)
>>>
>>> // [Alice]
>>> var aliceTrack = myObtainMediaTrack();
>>> var aliceSenderCaps = RTCRtpSender.getCapabilities();
>>> var aliceSenderParams = RTCRtpSender.createParameters(aliceTrack,
>>>     aliceSenderCaps);
>>>
>>> var aliceReceiverCaps = RTCRtpReceiver.getCapabilities();
>>> var aliceReceiverParams = RTCRtpReceiver.createParameters("video",
>>>     aliceReceiverCaps);
>>>
>>> mysignal(aliceSenderParams);
>>> mysignal(aliceReceiverParams);
>>>
>>> // [Bob]
>>> var bobTrack = myObtainMediaTrack();
>>> var bobSenderCaps = RTCRtpSender.getCapabilities();
>>> var bobSenderParams = RTCRtpSender.createParameters(bobTrack,
>>>     bobSenderCaps);
>>>
>>> var bobReceiverCaps = RTCRtpReceiver.getCapabilities();
>>> var bobReceiverParams = RTCRtpReceiver.createParameters("video",
>>>     bobReceiverCaps);
>>>
>>> mysignal(bobSenderParams);
>>> mysignal(bobReceiverParams);
>>>
>>> Step 2: (Alice / Bob)
>>>
>>> // [Alice]
>>> var bobSenderParams = mysignal();
>>> var bobReceiverParams = mysignal();
>>>
>>> bobSenderParams = RTCRtpReceiver.filterParameters(bobSenderParams,
>>>     aliceReceiverCaps);
>>> bobSenderParams.encodings[0].receiverId = "";
>>> bobSenderParams.encodings[0].ssrc = null;
>>> bobSenderParams = myFixPayloadTypes(bobSenderParams,
>>>     aliceReceiverParams);
>>>
>>> var aliceReceiver = new RTCRtpReceiver(...);
>>> aliceReceiver.receive(bobSenderParams);
>>>
>>> bobReceiverParams = RTCRtpSender.filterParameters(bobReceiverParams,
>>>     aliceSenderCaps);
>>> var aliceSender = new RTCRtpSender(...);
>>> aliceSender.send(bobReceiverParams);
>>>
>>> // [Bob]
>>> var aliceSenderParams = mysignal();
>>> var aliceReceiverParams = mysignal();
>>>
>>> aliceSenderParams = RTCRtpReceiver.filterParameters(aliceSenderParams,
>>>     bobReceiverCaps);
>>> aliceSenderParams.encodings[0].receiverId = "";
>>> aliceSenderParams.encodings[0].ssrc = null;
>>> aliceSenderParams = myFixPayloadTypes(aliceSenderParams,
>>>     bobReceiverParams);
>>>
>>> var bobReceiver = new RTCRtpReceiver(...);
>>> bobReceiver.receive(aliceSenderParams);
>>>
>>> // Bob filters Alice's receiver params against his own sender capabilities.
>>> aliceReceiverParams = RTCRtpSender.filterParameters(aliceReceiverParams,
>>>     bobSenderCaps);
>>> var bobSender = new RTCRtpSender(...);
>>> bobSender.send(aliceReceiverParams);
>>>
>>> //---------------------------------
>>> // [Alice and Bob need this method]
>>>
>>> function myFixPayloadTypes(senderParams, originalReceiverParams) {
>>>   // TODO: loop through sender params and then secondarily loop through
>>>   // original receiver params and set the sender payload type based upon
>>>   // what is found in the receiver params.
>>>   // ...
>>>   return myFixedSenderParams;
>>> }
>>>
>>> Comments
>>>
>>> The sender includes exact SSRC information and signals that to the
>>> remote receiver. The issue is that the actual sender is going to base
>>> its sending params upon the receiver params of the remote party, which
>>> do not contain a specific SSRC (or contain a different SSRC). Thus the
>>> SSRC has to be stripped from the received sender params or they will
>>> not match and the receiver won't latch onto the incoming stream, as
>>> the latching must occur by payload type instead.
>>>
>>> The secondary problem is that the sender is actually using the payload
>>> types as defined by the remote party's receiver, but the receiver is
>>> basing its payload types on the remote party's sender. This means the
>>> payload types might mismatch and the latching based on payload types
>>> may not occur. To fix this problem the web developer has to fix either
>>> the sender's payload types or the receiver's payload types.
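The TODO in myFixPayloadTypes could be filled in roughly as follows. This is only a sketch under stated assumptions: the field names `codecs`, `name`, `clockRate` and `payloadType` are hypothetical stand-ins for whatever the parameters object actually exposes, and dropping codecs that the local receiver does not list is a design choice made here, not something the proposal specifies.

```javascript
// Hypothetical sketch: field names (codecs, name, clockRate, payloadType)
// are assumptions for illustration and are not taken from the proposal.
function myFixPayloadTypes(senderParams, originalReceiverParams) {
  // Index the local receiver's payload types by codec identity.
  var localPayloadTypes = {};
  originalReceiverParams.codecs.forEach(function (codec) {
    localPayloadTypes[codec.name + "/" + codec.clockRate] = codec.payloadType;
  });

  // Work on a copy so the signalled object is left untouched.
  var fixed = JSON.parse(JSON.stringify(senderParams));
  fixed.codecs = fixed.codecs.filter(function (codec) {
    var key = codec.name + "/" + codec.clockRate;
    if (!Object.prototype.hasOwnProperty.call(localPayloadTypes, key)) {
      return false; // assumption: drop codecs the local receiver never listed
    }
    // Rewrite the payload type to the one the local receiver expects.
    codec.payloadType = localPayloadTypes[key];
    return true;
  });
  return fixed;
}
```

Even in this reduced form the application has to know how codecs are identified and why the payload types must agree, which is the burden the comments above describe.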
>>>
>>> Proposed Capabilities Based API
>>>
>>> Step 1: (Alice / Bob)
>>>
>>> // [Alice]
>>> var aliceSenderCaps = RTCRtpSender.getCapabilities();
>>> var aliceReceiverCaps = RTCRtpReceiver.getCapabilities();
>>>
>>> mysignal(aliceSenderCaps);
>>> mysignal(aliceReceiverCaps);
>>>
>>> // [Bob]
>>> var bobSenderCaps = RTCRtpSender.getCapabilities();
>>> var bobReceiverCaps = RTCRtpReceiver.getCapabilities();
>>>
>>> mysignal(bobSenderCaps);
>>> mysignal(bobReceiverCaps);
>>>
>>> Step 2: (Alice / Bob)
>>>
>>> // [Alice]
>>> var bobSenderCaps = mysignal();
>>> var bobReceiverCaps = mysignal();
>>>
>>> var aliceTrack = myObtainMediaTrack();
>>>
>>> var aliceReceiverParams = RTCRtpReceiver.createParameters("video",
>>>     bobSenderCaps);
>>> var aliceReceiver = new RTCRtpReceiver(...);
>>> aliceReceiver.receive(aliceReceiverParams);
>>>
>>> var aliceSenderParams = RTCRtpSender.createParameters(aliceTrack,
>>>     bobReceiverCaps);
>>> var aliceSender = new RTCRtpSender(...);
>>> aliceSender.send(aliceSenderParams);
>>>
>>> // [Bob]
>>> var aliceSenderCaps = mysignal();
>>> var aliceReceiverCaps = mysignal();
>>>
>>> var bobTrack = myObtainMediaTrack();
>>>
>>> var bobReceiverParams = RTCRtpReceiver.createParameters("video",
>>>     aliceSenderCaps);
>>> var bobReceiver = new RTCRtpReceiver(...);
>>> bobReceiver.receive(bobReceiverParams);
>>>
>>> var bobSenderParams = RTCRtpSender.createParameters(bobTrack,
>>>     aliceReceiverCaps);
>>> var bobSender = new RTCRtpSender(...);
>>> bobSender.send(bobSenderParams);
>>>
>>> Comments
>>>
>>> The receiver is able to latch onto the sender based on payload type
>>> alone. Unlike the current API, there's no need to strip SSRCs and no
>>> need to fiddle and fix the payload type.
>>> The code is cleaner and clearer about what is going on, and it does
>>> not presume the application level programmer knows why payload types
>>> need to match or why SSRCs need to be stripped.
>>>
>>>
>>> Alice wants to use an SVC (Scalable Video Codec) to send to Bob
>>>
>>> This is for illustration purposes only. Typical benefits of SVC are
>>> greater in conference scenarios than in traditional point to point
>>> scenarios. However, this scenario can presume that an intermediate
>>> conferencing bridge would be between Alice and Bob.
>>>
>>> Current Parameter Based API
>>>
>>> Step 1: (Alice)
>>>
>>> var senderCaps = RTCRtpSender.getCapabilities();
>>>
>>> mysignal(senderCaps);
>>>
>>> Step 2: (Bob)
>>>
>>> var senderCaps = mysignal();
>>>
>>> var receiverCaps = RTCRtpReceiver.getCapabilities();
>>> var receiverParams = RTCRtpReceiver.createParameters("video",
>>>     receiverCaps);
>>> receiverParams = RTCRtpReceiver.filterParameters(receiverParams,
>>>     senderCaps);
>>>
>>> receiverParams = mySetupSVC(receiverParams);
>>>
>>> var receiver = new RTCRtpReceiver(...);
>>> receiver.start(receiverParams);
>>>
>>> mysignal(receiverParams);
>>>
>>> function mySetupSVC(receiverParams) {
>>>   // 1. search the receiver params for a codec capable of SVC based on
>>>   //    pre-knowledge of the codec types
>>>   // 2. setup SVC params based on codec's capabilities
>>>
>>>   // TODO - step 1 - code needs to be added here to do this logic
>>>   var chosenCodec = "h264svc"; // hard code for now
>>>
>>>   // TODO: Not sure this code is even right. How does this layer scale
>>>   // even work?
>>>   // How is temporal and spatial layering defined together?
>>>   // Don't see a knob for setting up temporal SVC…
>>>   receiverParams.receiverId = "foo";
>>>   receiverParams.encodings[0] = {
>>>     "codecName": chosenCodec,
>>>     "scale": 0.125,
>>>     "encodingId": "0"
>>>   };
>>>   receiverParams.encodings[1] = {
>>>     "scale": 0.25,
>>>     "dependencyEncodingIds": ["0"]
>>>   };
>>>   receiverParams.encodings[2] = {
>>>     "scale": 0.5,
>>>     "dependencyEncodingIds": ["0", "1"]
>>>   };
>>>   return receiverParams;
>>> }
>>>
>>> Step 3: (Alice)
>>>
>>> var receiverParams = mysignal();
>>>
>>> var senderParams = RTCRtpSender.filterParameters(receiverParams,
>>>     senderCaps);
>>>
>>> var track = myObtainMediaTrack();
>>>
>>> var sender = new RTCRtpSender(track, ...);
>>> sender.start(senderParams);
>>>
>>> Comments
>>>
>>> The application developer must already know a great deal about the
>>> available codecs and their capabilities, and needs a deep
>>> understanding of how the engine interprets the layering information.
>>> The sender cannot set up the desired SVC parameters because it doesn't
>>> know the receiver capabilities.
>>>
>>> The sample above may not work for SVC codecs which put each layer on a
>>> unique SSRC, because the receiver did not necessarily pre-dictate the
>>> expected SSRC for each layer. The application developer would have to
>>> handle this situation too and assign SSRCs for each layer manually,
>>> based on knowledge that the codec behaves in this manner.
>>>
>>> The method to set up temporal or quality SVC is unclear. Appropriate
>>> parameter knobs for the application developer appear to be missing.
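To illustrate the manual per-layer SSRC assignment mentioned in the comments above, a rough sketch might look like the following. The helper name `myAssignLayerSsrcs` is hypothetical, the `ssrc` and `encodingId` fields simply follow the example above, and picking random 32-bit values is an assumption made here rather than anything the current API prescribes.

```javascript
// Hypothetical helper: assumes each entry in `encodings` is one SVC layer
// and that the engine would honour an explicit `ssrc` on every layer.
function myAssignLayerSsrcs(receiverParams) {
  var used = {};
  receiverParams.encodings.forEach(function (encoding, index) {
    var ssrc;
    do {
      // SSRCs are 32-bit unsigned values; pick one at random.
      ssrc = Math.floor(Math.random() * 0x100000000);
    } while (used[ssrc]);
    used[ssrc] = true;
    encoding.ssrc = ssrc;
    // Give the layer an id if the application has not set one already.
    if (encoding.encodingId === undefined) {
      encoding.encodingId = String(index);
    }
  });
  return receiverParams;
}
```

The application would only know to call something like this if it already knew that the chosen codec places each layer on its own SSRC.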
>>>
>>> Proposed Capabilities Based API
>>>
>>> Step 1: (Alice)
>>>
>>> var senderCaps = RTCRtpSender.getCapabilities();
>>>
>>> var senderPrefs = {
>>>   "receiverId": "foo",
>>>   "frameRateScalabilityOptions": {"layers": 2},
>>>   "scalingScalabilityOptions": {"layers": 2}
>>> };
>>>
>>> mysignal(senderCaps);
>>> mysignal(senderPrefs);
>>>
>>> Step 2: (Bob)
>>>
>>> var senderCaps = mysignal();
>>> var senderPrefs = mysignal();
>>>
>>> var receiverParams = RTCRtpReceiver.createParameters("video",
>>>     senderCaps, senderPrefs);
>>>
>>> var receiver = new RTCRtpReceiver(...);
>>> receiver.start(receiverParams);
>>>
>>> mysignal(receiverParams.receiverCapabilities);
>>>
>>> Step 3: (Alice)
>>>
>>> var track = myObtainMediaTrack();
>>>
>>> var receiverCaps = mysignal();
>>>
>>> var senderParams = RTCRtpSender.createParameters(track, receiverCaps,
>>>     senderPrefs);
>>>
>>> var sender = new RTCRtpSender(track, ...);
>>> sender.start(senderParams);
>>>
>>> Comments
>>>
>>> The application developer doesn't require pre-knowledge of the codecs.
>>> The developer can quickly and easily specify the types of SVC
>>> properties desired with much simpler knobs. The developer doesn't have
>>> to worry about whether a codec assigns each layer a unique SSRC, or
>>> whether the layering ends up being dynamic.
>>>
>>>
>>> Conclusion
>>>
>>> Overall the proposed capabilities based API has strong advantages.
>>> Main advantages are:
>>>
>>> 1.
>>>    Simplicity in setup based on "preferences" for the application
>>>    developer
>>>
>>> 2.
>>>    Less brittle designs/implementations since low level parameters
>>>    are not exchanged, filtered, and interpreted by different browser
>>>    engines
>>>
>>> 3.
>>>    Much less knowledge (and often no pre-knowledge) is required for
>>>    the application developer to take full advantage of a browser's
>>>    capabilities
>>>
>>> There's no strong reason to maintain the current API. The biggest
>>> difference will be that browsers will need to generate compatible
>>> parameters based on capabilities, but that also comes with the big
>>> advantage that browser engines no longer need to interpret and filter
>>> low level parameters from other browser engines. Both the new and the
>>> current API use low level parameters to receive or send information,
>>> so that design aspect remains unchanged.
>>>
>>>
>>> Advantages of Current Parameter Based API
>>>
>>> 1.
>>>    Browser engines do not need to generate parameters from
>>>    capabilities in a "compatible" manner (although low level
>>>    parameters do need to be filtered in a "compatible" manner so this
>>>    is not a strong advantage).
>>>
>>>
>>> Disadvantages of Current Parameter Based API
>>>
>>> 1.
>>>    Application developer needs pre-knowledge of SVC codecs and their
>>>    capabilities to be able to choose a codec and set up its
>>>    properties
>>>
>>> 2.
>>>    Application developer needs deep understanding of how layering
>>>    works to set up the layering properties correctly
>>>
>>> 3.
>>>    Browser engines need to agree on how to filter low level
>>>    parameters based upon capabilities in a consistent manner across
>>>    browsers to ensure compatibility
>>>
>>> 4.
>>>    Browser engines need to agree how to interpret low level parameter
>>>    objects that were generated by other browsers (or other applications)
>>>
>>> 5.
>>>    Low level parameter based exchanges introduce greater brittleness
>>>    between browsers since extending the parameter details could mean
>>>    breaking existing implementations (unlike capabilities, which
>>>    are typically ignored when not understood)
>>>
>>> 6.
>>>    Less innovation / greater brittleness for anything that requires
>>>    parameter object extensions since many browsers as well as
>>>    applications will be fiddling with, exchanging, and filtering these
>>>    low level parameter objects.
>>>
>>> 7.
>>>    Simulcasting with layering doesn't appear to be supported or it's
>>>    not obvious how to set up those scenarios.
>>>
>>> 8.
>>>    Unclear how to mix and match different SVC modes (e.g. temporal,
>>>    spatial, and quality)
>>>
>>> 9.
>>>    The application developer cannot tell from their preferences what
>>>    the browser engine is capable of delivering (without deep
>>>    understanding of all codecs and their properties).
>>>
>>> 10.
>>>    Header extensions will need manual setup by the application
>>>    developer, who may not know which extensions the codecs or the
>>>    engines need in order to take advantage of codec features or
>>>    browser engine features.
>>>
>>>
>>> Advantages of Proposed Capabilities Based API
>>>
>>> 1.
>>>    Application developer can easily set up SVC without needing
>>>    detailed understanding
>>>
>>> 2.
>>>    Typical and even advanced use cases do not require a deep
>>>    understanding of RTC to be able to take advantage of capabilities
>>>
>>> 3.
>>>    Less brittle implementations as low level parameter objects are
>>>    only consumed locally by the browsers that generate them, or only
>>>    in situations where compatibility with legacy systems is required
>>>    and the default generated low level properties would not be
>>>    compatible.
>>>
>>> 4.
>>>    Simulcast with layering is supported
>>>
>>> 5.
>>>    Easy for application developer to mix and match different SVC
>>>    modes (e.g. temporal, spatial, and quality)
>>>
>>> 6.
>>>    Easy to extend support for alternative SVC scalability modes (e.g.
>>>    colour depth, sharpness, ROI)
>>>
>>> 7.
>>>    Application developer knows what the browser engine is capable of
>>>    delivering given a set of preferences (from the resultant
>>>    preferences as returned from "createParameters(...)")
>>>
>>> 8.
>>>    Header extensions can be automatically set up based on needs and
>>>    capabilities of the browser's RTP engines and codecs.
>>>
>>>
>>> Disadvantages of Proposed Capabilities Based API
>>>
>>> 1.
>>>    Browser engines need to agree on how to compute "compatible"
>>>    parameters for a given codec and media preferences. The rules for
>>>    generation of parameters must be clear.
>>>
>>>
>>> Equal Capabilities of Current and Proposed APIs
>>>
>>> 1.
>>>    Application developer can always tweak low level properties on an
>>>    "as needed" basis for compatibility
>>>
>>> 2.
>>>    Both new and current proposals send and receive based on lower
>>>    level parameters (this does not change).
>>>
>
> --
> https://jitsi.org
>
Received on Thursday, 8 May 2014 16:35:28 UTC