- From: Justin Uberti <juberti@google.com>
- Date: Thu, 8 May 2014 12:58:20 -0700
- To: Bernard Aboba <Bernard.Aboba@microsoft.com>
- Cc: Robin Raymond <robin@hookflash.com>, "public-ortc@w3.org" <public-ortc@w3.org>
- Message-ID: <CAOJ7v-1KTrD3TxGpdiv6J4NN8f14VSf6Tt04VJRgDRWGZbnY2A@mail.gmail.com>
I haven't grokked all the details of this proposal yet, but I don't think adding an indirect way of controlling params in the API proper is a good idea. If app developers want to be shielded from complexity, that's what JS abstractions are for, and it's not clear that we're going to save on complexity by adding yet another config surface.

The idea that different implementations will support different parameters is also scary to me. Part of the reason that WebRTC is valuable is that it enforces a shared set of requirements on all applications. The idea that we might mix and match what needs to be supported seems like a dangerous direction.

Lastly, the advantages/disadvantages section is fairly one-sided; I think anything you suggest can be done with the new approach is also possible with the existing approach.

On Thu, May 8, 2014 at 8:10 AM, Bernard Aboba <Bernard.Aboba@microsoft.com> wrote:

> This proposal looks like it represents a big upgrade in SVC and simulcast support:
>
> 1. Via the scale, frameRate, and quality attributes in RTCRtpParameterVideoDetails (roughly corresponding to the previous RTCRtpEncodingParameters), it appears possible to support all modes of SVC, whereas only spatial scalability was supported previously.
>
> 2. Via RTCRtpParameterSimulcastDetails, it looks like combinations of SVC and simulcast (e.g. temporal scalability + spatial simulcast) can be supported, which wasn't possible previously.
>
> 3. Via RTCRtpScalabilityType it appears possible to support combined SVC modes (e.g. temporal + spatial + quality).
>
> On May 7, 2014, at 9:13 PM, "Robin Raymond" <robin@hookflash.com> wrote:
>
> I am contributing a proposal on how to resolve an issue discovered in the usage of "parameters". While the details can always be tweaked, I think it successfully resolves much of the concern around the level of knowledge required to configure a "parameters" object for anything other than the basic use cases.
>
> In response to this posting:
> http://lists.w3.org/Archives/Public/public-ortc/2014May/0007.html
>
> This also addresses the issue of exchanging detailed parameters over the wire; parameters are instead based on capabilities.
>
> I am going to copy the entire proposal below to officially contribute it, but for the sake of readability I am also including links to the Google docs.
>
> Proposal-ORTC Sender / Receiver Capabilities Based Model
> https://docs.google.com/document/d/1htyRaNjXTE_O1GhD8TcLCNXFvVsgszpE8Lqgp3OCHlU/edit?usp=sharing
>
> Proposal-ORTC Sender / Receiver Use Case [Usage Comparison Analysis]
> https://docs.google.com/document/d/1hdhCHj-gpwv06vIbAftxMG3oZtz7A-nuYsuwQEkTat4/edit?usp=sharing
>
> Proposal-ORTC Sender / Receiver Capabilities Based Model
>
> Introduction
>
> After attempting to write out some use cases using the existing RTCRtpSender and RTCRtpReceiver objects and parameters for ORTC, some issues were discovered. Specifically, the application developer would need to have a fair amount of knowledge of exactly how to tweak low level parameters for anything beyond very simple use cases.
> For example, setting up SVC (Scalable Video Coding) would have required the application developer to know which codecs support SVC, how layering is set up for particular codecs, and finally how to configure specific geometric (or temporal) attributes and layering relationship details. Because RTP features were not easy to configure, the idea emerged of giving the application developer "preferences": the developer chooses what they desire with high level knobs and dials and lets the engine (which has explicit knowledge of each codec) configure the low level "parameters" details according to the developer's wishes. The engine then returns the closest set of preferences that can be achieved given the capabilities of the engine, and the developer can choose whether or not to proceed with setting up media flows using these preferences and the constructed parameters.
>
> Another important discovery was made in the process of defining "preferences". If two ORTC engines were given the same set of preferences and the capabilities of both sender and receiver, each engine could be made to construct "compatible" sender and receiver "parameters" details without ever exchanging the parameter details over the wire. This small realization about generating "parameters" from capabilities for local consumption by an engine has a huge impact. It removes the need for an engine to understand and filter settings created by another engine of unknown origin, which may use proprietary and/or custom settings. A simple "ignore capabilities you don't understand" rule could replace the complex and cumbersome rules that would otherwise be required if "parameters" were sent over the wire and later filtered against a set of capabilities.
>
> Parameters can be generated based on the union of sender and receiver capabilities, with the application developer's preferences used as a guideline for how to create the parameters. The engine will do its best to fulfill the preferences and will return the parameters that are possible given the union of the capabilities. Two different engines must be able to compute compatible parameters given the same preferences and capabilities. Fortunately, any two engines that understand the same capabilities can easily follow the same rules to generate compatible parameters. While the parameters created on the sender and receiver are required to be "compatible", they need not be identical. The application developer should call "createParameters(...)" on the sender to create parameters suitable for the sender, and "createParameters(...)" on the receiver to create parameters suitable for the receiver. The calculated "parameters" for sender and receiver have to be compatible only to the extent that whatever the sender produces, the receiver must be capable of decoding.
>
> The application developer has the option to tweak the detailed parameters output by "createParameters(...)" but should only do so with extreme caution. The resultant parameters output by "createParameters(...)" are only meant for local consumption by the local sender / receiver "start" methods. Sending these created parameters over the wire is discouraged because implementations may produce objects which may not be entirely understandable by the remote party, even though the media sent on the wire will be compatible.
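For concreteness, here is a minimal sketch (not part of the original proposal) of the receiver-side flow described above, using the proposed createParameters()/start() methods; mysignal() is the same placeholder signalling helper used in the proposal's own examples, and constructor arguments are elided just as they are there:

    // Receiver side: generate parameters locally from the remote sender's
    // capabilities plus local preferences; only capabilities cross the wire.
    var senderCaps = mysignal();                    // remote sender capabilities
    var prefs = { fidelity: "music" };              // RTCRtpAudioPreferences knobs
    var receiverParams = RTCRtpReceiver.createParameters("audio", senderCaps, prefs);
    var receiver = new RTCRtpReceiver(...);         // transport args elided as in the proposal
    receiver.start(receiverParams);                 // parameters are consumed locally only
    mysignal(receiverParams.receiverCapabilities);  // signal capabilities back, not parameters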
> Differences from Current Sender / Receiver API
>
> Both models and APIs are more similar than they are different, but the subtle differences have important behavioural implications for usage. Both models send and receive based upon "parameter" settings. The difference is in how the "parameters" are generated. The new model generates the "parameters" based on an exchange of capabilities, and the application developer is given convenient "knobs" called "preferences" to cover the most common use cases. The "parameters" in the new model are intended for local consumption only, and the application developer is not required to (and is actively discouraged from) marshalling these "parameters" over the wire. The new model proposes marshalling and exchanging "capabilities" and optionally "preferences", and then generating compatible "parameters" based on those exchanges. In both models, the application developer may choose to tweak low level parameters should specific compatibilities be required. But the "preferences" model allows most application developers to completely ignore the low level parameters.
>
> Advantages of the New Capabilities Model
>
> Overall the proposed capabilities based API has strong advantages. The main advantages are:
>
> 1. Simplicity in setup based on "preferences" for the application developer
> 2. Less brittle designs/implementations since low level parameters are not exchanged, filtered, and interpreted by different browser engines
> 3. Much less knowledge (and often no pre-knowledge) is required for the application developer to take full advantage of a browser's capabilities
>
> RTCRtpSender / RTCRtpReceiver interface
>
>     interface RTCRtpSender {
>       // ...
>       static RTCRtpParameters createParameters(
>           MediaStreamTrack track,
>           Capabilities receiverCaps,
>           optional (RTCRtpAudioPreferences or RTCRtpVideoPreferences or
>                     RTCRtpSimulcastPreferences) prefs,
>           optional Capabilities senderCaps  // optional as system can obtain this information
>       );
>       void start(RTCRtpParameters params);
>       // ...
>     };
>
>     interface RTCRtpReceiver {
>       // ...
>       static RTCRtpParameters createParameters(
>           DOMString kind,
>           Capabilities senderCaps,
>           optional (RTCRtpAudioPreferences or RTCRtpVideoPreferences or
>                     RTCRtpSimulcastPreferences) prefs,
>           optional Capabilities receiverCaps  // optional as system can obtain this information
>       );
>       void start(RTCRtpParameters params);
>       // ...
>     };
>
> RTCRtpMediaPreferences
>
>     // This is the base dictionary used for both audio and video preferences and represents
>     // the set of common preferences that are available for both media types.
>     dictionary RTCRtpMediaPreferences {
>       // If not specified, the system will choose a value. If specified, this receiverId will
>       // be applied to the primary SSRC "as is". If more than one SSRC is needed to encode
>       // the stream (e.g. FEC, RTX, MST, simulcast), where the meaning of the RTP packet
>       // with that alternative SSRC cannot be determined by the media flow itself, the
>       // alternative SSRCs will construct a receiverId value based upon this receiverId value.
>       DOMString receiverId;
>
>       // This is the primary SSRC to use. Should alternative SSRCs be required (e.g. FEC,
>       // RTX, MST, simulcast), all other SSRCs should be assigned sequentially starting
>       // from the chosen SSRC value.
>       unsigned int ssrc;
>
>       // For a sender, force the chosen codec to be the codec within the RTCRtpCapabilities
>       // with this name. If it is possible to choose this codec, the system will confirm by
>       // choosing this codec in the result from "createParameters(...)".
>       // This value has no meaning for a receiver since a receiver must be capable
>       // of receiving any of the compatible codecs within the union RTCRtpCapabilities.
>       // An unspecified value indicates the system will choose the preferred sending codec.
>       DOMString codecName;
>
>       // This value indicates the relative importance of the media being sent by a
>       // sender versus other media being sent. All sent media with the same priority
>       // are treated as having equal priority. Those with a greater value are given a
>       // greater priority and those with a lower value are given a lower priority. The
>       // value is relative, meaning a value of 2.0 should be given roughly 2 times the
>       // priority of a 1.0 value and a value of 4.0 roughly 4 times the priority of a
>       // 1.0 value.
>       double relativePriority = 1.0;
>
>       // This value indicates the maximum bit rate the media is allowed to output as
>       // a combined whole (including all layers, FEC, RTX, etc.). The system will filter
>       // out codecs that are not capable of delivering below this bit rate unless no
>       // codec is possible, in which case the system will choose the minimal codec bit
>       // rate possible and will override with a different maximum bit rate in the result
>       // of "createParameters(...)".
>       double maxBitrate;  // engine, keep under this rate
>
>       // These values indicate the preferred treatment of FEC/RTX for the RTP packets.
>       // For audio, some audio codecs have built-in FEC/RTX mechanisms, in which case,
>       // if the codec is capable, the codec should enable its FEC/RTX mode when the value
>       // is set to "all" for that codec rather than creating an additional RTP flow.
>       RTCRtpRecoveryOptions fec = "none";
>       RTCRtpRecoveryOptions rtx = "none";
>     };
>
>     enum RTCRtpRecoveryOptions {
>       "all",   // apply to all layers
>       "base",  // only apply to the base layer (audio treats "base" as equivalent to "all")
>       "none"   // do not apply to any layer
>     };
>
> RTCRtpAudioPreferences
>
>     dictionary RTCRtpAudioPreferences : RTCRtpMediaPreferences {
>       // If not 0, tells the engine to pick and configure codecs that are capable of
>       // the minimum number of channels (if possible). If not possible, the minimum
>       // number of channels will be returned in the result of "createParameters(...)".
>       unsigned int minChannels = 0;
>
>       // If not 0, tells the engine to pick and configure codecs which are capable of
>       // delivering the minimum Hz rate as indicated. If not possible, the minimum Hz
>       // rate will be returned in the result of "createParameters(...)".
>       unsigned int minHzRate = 0;
>
>       // The engine will choose and configure the codecs best able to deliver the level
>       // of fidelity requested.
>       RTCRtpAudioFidelity fidelity = "speech";
>     };
>
>     enum RTCRtpAudioFidelity {
>       "speech", // speech only is expected, so the Hz range only needs to cover the vocal range
>       "music",  // music is expected; choose stereo compatible and minimally 32000 Hz
>       "movie"   // music / sound effects expected; choose surround and the highest Hz available
>     };
>
> RTCRtpVideoPreferences (and related)
>
>     dictionary RTCRtpVideoPreferences : RTCRtpMediaPreferences {
>       // minFrameRate, minScale, and minQuality each indicate that the engine must make
>       // a best effort to keep the frame rate, scale, or quality above a certain minimal
>       // level. When using SVC, these values hint at the requirements typically needed
>       // for the base layer.
>       //
>       // minFrameRate is specified in frames per second.
>       double minFrameRate = 0;  // please engine, keep equal or above this rate
>
>       // minScale is a relative value from 0.0 to 1.0, where 1.0 means the full input
>       // stream width/height is requested and 0.0 means no minimum size is requested.
>       // The value of minScale is multiplied by the source video window width and height
>       // to calculate a minimal width and height relative to the source size.
>       double minScale = 0;  // please engine, keep equal or above this scale
>
>       // Alternatively, a specific fixed minimal width and height can be requested.
>       double minWidth = 0;   // please engine, keep above X pixels wide
>       double minHeight = 0;  // please engine, keep above Y pixels high
>
>       // minQuality is a relative value from 0.0 to 1.0, where 1.0 means maximum output
>       // quality is requested for a given codec and 0.0 means any minimal codec quality
>       // output is deemed acceptable.
>       double minQuality = 0;  // please engine, keep equal or above this quality
>
>       // The engine needs values to help decide what to sacrifice when network conditions
>       // are not ideal. frameRatePriority, scalePriority, and qualityPriority indicate
>       // the relative importance of each aspect of the video relative to the others (or
>       // 0.0, meaning that aspect has no significance, excluding the minimums above).
>       // The values are relative to each other, thus a value of 2.0 vs 1.0 has roughly
>       // 2 times the importance and a value of 4.0 vs 1.0 has roughly 4 times the
>       // importance (relatively speaking).
>       double frameRatePriority = 1.0;  // priority of frame rate
>       double scalePriority = 1.0;      // priority of scale
>       double qualityPriority = 1.0;    // priority of quality
>
>       // If a type of SVC layering is desired, frameRateScalabilityOptions,
>       // scalingScalabilityOptions, and qualityScalabilityOptions should be set to a
>       // non-null value for each SVC type desired. The details of the
>       // RTCRtpScalabilityOptions dictionary indicate the desired details for each
>       // individual SVC type requested.
>       //
>       // The default of null indicates no SVC of that specific type is requested.
>       RTCRtpScalabilityOptions? frameRateScalabilityOptions = null;
>       RTCRtpScalabilityOptions? scalingScalabilityOptions = null;
>       RTCRtpScalabilityOptions? qualityScalabilityOptions = null;
>     };
>
>     dictionary RTCRtpScalabilityOptions {
>       // If a value other than the default of null is specified, this indicates to the
>       // engine the precise number of layers desired (if it is possible for a given codec
>       // to deliver these layers). If null, the engine is free to choose the default
>       // layering statically or dynamically depending upon the codec capabilities.
>       unsigned int? layers = null;
>     };
>
> RTCRtpSimulcastPreferences
>
>     dictionary RTCRtpSimulcastPreferences {
>       // This value indicates the maximum bit rate all media is allowed to output as a
>       // combined whole across all simulcast streams.
>       double? maxBitrate = null;  // engine, keep under this rate
>       sequence<RTCRtpVideoPreferences> simulcastStreams;
>     };
>
> RTCRtpParameters
>
>     // Typically this object is constructed by the RTCRtpSender for local consumption by
>     // the RTCRtpSender and by the RTCRtpReceiver for local consumption by the RTCRtpReceiver.
>     // This is a "shotgun" object, meaning the developer is given the power of a "shotgun"
>     // pointed at their feet, and they can mess with this object at their own peril should
>     // they need to modify it for unusual compatibility reasons. Normal use cases should not
>     // require modifying the values within this structure, and marshalling this structure for
>     // remote consumption by another browser engine is highly discouraged.
>     dictionary RTCRtpParameters {
>       // When returned as a result, the system will express the actual chosen preferences
>       // possible to best fulfill the preferences given the capabilities. In other words,
>       // the developer can't always get what they want; but if they try sometimes, they will
>       // get what they need.
>       (RTCRtpAudioPreferences or RTCRtpVideoPreferences or
>        RTCRtpSimulcastPreferences) preferences;
>
>       // The capabilities of both sender and receiver [value "as is" when passed to
>       // "createParameters(...)"]
>       RTCRtpCapabilities senderCapabilities;
>       RTCRtpCapabilities receiverCapabilities;
>
>       // This value contains all the particularly low level details of how the engine
>       // will encode the media on the wire.
>       (RTCRtpParameterAudioDetails or RTCRtpParameterVideoDetails or
>        RTCRtpParameterSimulcastDetails) details;
>
>       // The chosen RTP features based upon the union of the capabilities.
>       Settings rtpFeatures;
>
>       // The chosen RTP extensions and configurations based upon the union of the capabilities.
>       sequence<RTCRtpHeaderExtensionParameters>? headerExtensions = null;
>     };
>
> RTCRtpParameterDetails
>
>     // This is the base dictionary of common parameters needed for both audio and video media
>     // types. Audio and video each have their own set of specific parameters depending upon
>     // the media type.
>     dictionary RTCRtpParameterDetails {
>       DOMString receiverId = "";     // use this receiver ID for the RTP stream ("" = N/A)
>       unsigned int ssrc = null;      // using this SSRC for the RTP stream
>       DOMString fecReceiverId = "";  // use this receiver ID for FEC RTP ("" = N/A)
>       unsigned int? fecSsrc = null;  // using this SSRC for FEC (null = N/A)
>       Settings fec;                  // modes of operation related to FEC
>       DOMString rtxReceiverId = "";  // use this receiver ID for RTX RTP ("" = N/A)
>       unsigned int? rtxSsrc = null;  // using this SSRC for RTX (null = N/A)
>       Settings rtx;                  // modes of operation related to RTX
>
>       // null for a sender. For a receiver, this must contain the source SSRC to use for
>       // RTCP Receiver Reports (RRs).
>       unsigned int? rtcpSsrc = null;
>
>       // If true, the engine will mux RTCP with RTP on the same RTCIceTransport. If false,
>       // the engine will send RTCP reports on the associated RTCP RTCIceTransport component.
>       boolean rtcpMux = true;
>     };
>
> RTCRtpParameterAudioDetails (and related)
>
>     dictionary RTCRtpParameterAudioDetails : RTCRtpParameterDetails {
>       // Contains a list of audio codec options for each codec that is possible to use.
>       // The codecs are listed in preferred order.
>       sequence<RTCRtpParameterAudioCodecDetails> codecDetails;
>     };
>
>     dictionary RTCRtpParameterCodecDetails {
>       // The name of the codec as related to the codec name(s) contained within the codecs
>       // listed within the RTCRtpCapabilities dictionaries.
>       DOMString codecName;
>       unsigned byte payloadType;   // actual payload type sent on the wire
>       Settings formatsParameters;  // detailed settings chosen for the related codec
>     };
>
>     dictionary RTCRtpParameterAudioCodecDetails : RTCRtpParameterCodecDetails {
>       // nothing required at this time?
>     };
>
> RTCRtpParameterVideoDetails (and related)
>
>     dictionary RTCRtpParameterVideoDetails : RTCRtpParameterDetails {
>       double scale = 1.0;      // 0..1 relative scale from source
>       double frameRate = 1.0;  // 0..1 relative frame rate from source
>       double quality = 1.0;    // 0..1 relative quality from source
>
>       // Contains a list of video codec options for each codec that is possible to use.
>       // The codecs are listed in preferred order.
>       sequence<RTCRtpParameterVideoCodecDetails> codecDetails;
>     };
>
>     dictionary RTCRtpParameterVideoCodecDetails : RTCRtpParameterCodecDetails {
>       // When layering is used, this value contains a sequence containing the layer
>       // information as needed for the related codec.
>       sequence<RTCRtpParameterVideoLayerDetails>? layers = null;
>     };
>
>     dictionary RTCRtpParameterVideoLayerDetails {
>       // Value is set if required for describing the dependency tree information for the
>       // codec's layers.
>       DOMString layerId = "";
>
>       // Value is null for the base layer or if dependencies do not need to be described
>       // (as may be the case for dynamic SVC codecs). If set, the value contains a list of
>       // layers this layer is dependent upon (thus allowing a dependency tree/graph to be
>       // created).
>       sequence<DOMString>? layerIdDependencies = null;
>
>       RTCRtpScalabilityType? layerScalabilityType = null;  // null would be for the base layer
>       DOMString receiverId = "";     // use this receiver ID in the layer ("" = N/A)
>       unsigned int? ssrc = null;     // if the layer uses its own SSRC (null = N/A)
>       double? frameRate = null;      // frame rate for the layer (for temporal SVC)
>       double? scale = null;          // scale applied to the layer (for spatial SVC)
>       double? quality = null;        // quality applied to the layer (for quality SVC)
>       DOMString fecReceiverId = "";  // receiver ID for FEC RTP ("" = N/A)
>       unsigned int? fecSsrc = null;  // using this SSRC for FEC (null = N/A)
>       Settings fec;                  // modes of operation related to FEC
>       DOMString rtxReceiverId = "";  // receiver ID for RTX RTP ("" = N/A)
>       unsigned int? rtxSsrc = null;  // using this SSRC for RTX (null = N/A)
>       Settings rtx;                  // modes of operation related to RTX
>     };
>
>     enum RTCRtpScalabilityType {
>       "temporal",
>       "spatial",
>       "quality"
>     };
>
> RTCRtpParameterSimulcastDetails (and related)
>
>     dictionary RTCRtpParameterSimulcastDetails {
>       // This sequence contains the details of each simulcast stream when simulcasting is
>       // used, or contains exactly one video stream's details when not simulcasting.
>       sequence<RTCRtpParameterVideoDetails>? simulcastStreams;
>     };
>
> RTCRtpCodec Dictionary Tweak
>
>     dictionary RTCRtpCodec {
>       DOMString name = "";
>       // Added to be able to pick the payload type based upon sender or receiver so that
>       // they match when creating both the sender and receiver parameters.
>       unsigned byte preferredPayloadType;
>       unsigned int? clockRate = null;
>       unsigned int? numChannels = 1;
>       Capabilities formats;
>     };
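To make the relationship between these dictionaries concrete, here is a small illustrative sketch (not part of the proposal text) of preferences combining temporal SVC with simulcast, using only members defined above; the specific numbers are arbitrary and the bit rate unit is assumed to be bits per second:

    // Hypothetical example: two simulcast streams, a high bit rate stream and a low
    // bit rate stream, each also asking for 2 temporal layers.
    var highStream = {                            // RTCRtpVideoPreferences
      maxBitrate: 1500000,
      frameRateScalabilityOptions: { layers: 2 }  // temporal SVC, 2 layers
    };
    var lowStream = {                             // RTCRtpVideoPreferences
      maxBitrate: 300000,
      frameRateScalabilityOptions: { layers: 2 }
    };
    var simulcastPrefs = {                        // RTCRtpSimulcastPreferences
      maxBitrate: 2000000,                        // combined cap across both streams
      simulcastStreams: [highStream, lowStream]
    };
    // simulcastPrefs would be passed as the "prefs" argument of createParameters(...)
    // on the sender or receiver.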
> Proposal-ORTC Sender / Receiver Use Case [Usage Comparison Analysis]
>
> Introduction
>
> After attempting to work through examples of code usage using the current ORTC sender/receiver API, some issues, concerns, and deficiencies were discovered. A retuning of the current model was made to attempt to address those findings. The differences are illustrated below in code examples based on various use cases. In the first set of use cases, covering simple application usages, there are no advantages to a capabilities model (aside from the reduction in complexity an engine might need to implement). As the use cases become more involved, advantages begin to show. In the final example, which illustrates using SVC, the clear advantage of capabilities and preferences can be demonstrated.
>
> Use Cases
>
> Alice wishes to send media to Bob
>
> Current Parameter Based API
>
> Step 1: (Alice)
>
>     var track = myObtainMediaTrack();
>     var senderCaps = RTCRtpSender.getCapabilities();
>     var senderParams = RTCRtpSender.createParameters(track, senderCaps);
>     mysignal(senderParams);
>
> Step 2: (Bob)
>
>     var senderParams = mysignal();
>     var receiverCaps = RTCRtpReceiver.getCapabilities();
>     var receiverParams = RTCRtpReceiver.filterParameters(senderParams, receiverCaps);
>     var receiver = new RTCRtpReceiver(...);
>     receiver.start(receiverParams);
>     mysignal(receiverParams);
>
> Step 3: (Alice)
>
>     var receiverParams = mysignal();
>     var senderParams = RTCRtpSender.filterParameters(receiverParams, senderCaps);
>     var sender = new RTCRtpSender(...);
>     sender.start(senderParams);
>
> Comments
>
> Because the sender (i.e. Alice) sent parameters that contained specific SSRC (and possibly receiver ID) information, the receiver will latch based upon exact SSRC matching.
>
> Proposed Capabilities Based API
>
> Step 1: (Alice)
>
>     var senderCaps = RTCRtpSender.getCapabilities();
>     mysignal(senderCaps);
>
> Step 2: (Bob)
>
>     var senderCaps = mysignal();
>     var receiverParams = RTCRtpReceiver.createParameters("video", senderCaps);
>     var receiver = new RTCRtpReceiver(...);
>     receiver.start(receiverParams);
>     mysignal(receiverParams.receiverCapabilities);
>
> Step 3: (Alice)
>
>     var track = myObtainMediaTrack();
>     var receiverCaps = mysignal();
>     var senderParams = RTCRtpSender.createParameters(track, receiverCaps);
>     var sender = new RTCRtpSender(...);
>     sender.start(senderParams);
>
> Comments
>
> The receiver (Bob) can match an incoming stream because the payload types will match, and therefore the incoming stream will latch to the receiver based on payload type alone.
>
> Alice wishes to send media to Bob Using Unhandled Eventing
>
> Current Parameter Based API
>
> Step 1: (Alice)
>
>     var track = myObtainMediaTrack();
>     var senderCaps = RTCRtpSender.getCapabilities();
>     var senderParams = RTCRtpSender.createParameters(track, senderCaps);
>     mysignal(senderParams);
>
> Step 2: (Bob)
>
>     var senderParams = mysignal();
>     var receiverCaps = RTCRtpReceiver.getCapabilities();
>     var templateReceiverParams = RTCRtpReceiver.filterParameters(senderParams, receiverCaps);
>     templateReceiverParams.encodings[0].receiverId = "";
>     templateReceiverParams.encodings[0].ssrc = null;
>     var listener = new RTCRtpListener(...);
>     listener.onunhandledrtp = function(event) {
>       var receiver = new RTCRtpReceiver(...);
>       receiver.start(templateReceiverParams);
>     }
>     mysignal(templateReceiverParams);
>
> Step 3: (Alice)
>
>     var receiverParams = mysignal();
>     var senderParams = RTCRtpSender.filterParameters(receiverParams, senderCaps);
>     var sender = new RTCRtpSender(...);
>     sender.start(senderParams);
>
> Comments
>
> Because the sender (i.e. Alice) sent parameters that contained specific SSRC (and possibly receiver ID) information, the receiver must override the template receiver params and remove the exact SSRC in order to attach the incoming stream by payload type.
> Proposed Capabilities Based API
>
> Step 1: (Alice)
>
>     var senderCaps = RTCRtpSender.getCapabilities();
>     mysignal(senderCaps);
>
> Step 2: (Bob)
>
>     var senderCaps = mysignal();
>     var listener = new RTCRtpListener(...);
>     listener.onunhandledrtp = function(event) {
>       var receiverParams = RTCRtpReceiver.createParameters("video", senderCaps);
>       var receiver = new RTCRtpReceiver(...);
>       receiver.start(receiverParams);
>     }
>     mysignal(receiverParams.receiverCapabilities);
>
> Step 3: (Alice)
>
>     var track = myObtainMediaTrack();
>     var receiverCaps = mysignal();
>     var senderParams = RTCRtpSender.createParameters(track, receiverCaps);
>     var sender = new RTCRtpSender(...);
>     sender.start(senderParams);
>
> Comments
>
> The receiver (Bob) can match an incoming stream because the payload types will match, and therefore the incoming stream will latch to the receiver based on payload type alone.
>
> Alice / Bob simultaneously exchange information in parallel
>
> To avoid requiring a sequential offer / answer exchange, Alice and Bob wish to simultaneously exchange their RTC information to receive media from the other party.
>
> Current Parameter Based API
>
> Step 1: (Alice / Bob)
>
>     // [Alice]
>     var aliceTrack = myObtainMediaTrack();
>     var aliceSenderCaps = RTCRtpSender.getCapabilities();
>     var aliceSenderParams = RTCRtpSender.createParameters(aliceTrack, aliceSenderCaps);
>     var aliceReceiverCaps = RTCRtpReceiver.getCapabilities();
>     var aliceReceiverParams = RTCRtpReceiver.createParameters("video", aliceReceiverCaps);
>     mysignal(aliceSenderParams);
>     mysignal(aliceReceiverParams);
>
>     // [Bob]
>     var bobTrack = myObtainMediaTrack();
>     var bobSenderCaps = RTCRtpSender.getCapabilities();
>     var bobSenderParams = RTCRtpSender.createParameters(bobTrack, bobSenderCaps);
>     var bobReceiverCaps = RTCRtpReceiver.getCapabilities();
>     var bobReceiverParams = RTCRtpReceiver.createParameters("video", bobReceiverCaps);
>     mysignal(bobSenderParams);
>     mysignal(bobReceiverParams);
>
> Step 2: (Alice / Bob)
>
>     // [Alice]
>     var bobSenderParams = mysignal();
>     var bobReceiverParams = mysignal();
>     bobSenderParams = RTCRtpReceiver.filterParameters(bobSenderParams, aliceReceiverCaps);
>     bobSenderParams.encodings[0].receiverId = "";
>     bobSenderParams.encodings[0].ssrc = null;
>     bobSenderParams = myFixPayloadTypes(bobSenderParams, aliceReceiverParams);
>     var aliceReceiver = new RTCRtpReceiver(...);
>     aliceReceiver.receive(bobSenderParams);
>     bobReceiverParams = RTCRtpSender.filterParameters(bobReceiverParams, aliceSenderCaps);
>     var aliceSender = new RTCRtpSender(...);
>     aliceSender.send(bobReceiverParams);
>
>     // [Bob]
>     var aliceSenderParams = mysignal();
>     var aliceReceiverParams = mysignal();
>     aliceSenderParams = RTCRtpReceiver.filterParameters(aliceSenderParams, bobReceiverCaps);
>     aliceSenderParams.encodings[0].receiverId = "";
>     aliceSenderParams.encodings[0].ssrc = null;
>     aliceSenderParams = myFixPayloadTypes(aliceSenderParams, bobReceiverParams);
>     var bobReceiver = new RTCRtpReceiver(...);
>     bobReceiver.receive(aliceSenderParams);
>     aliceReceiverParams = RTCRtpSender.filterParameters(aliceReceiverParams, bobSenderCaps);
>     var bobSender = new RTCRtpSender(...);
>     bobSender.send(aliceReceiverParams);
>
>     //---------------------------------
>     // [Alice and Bob need this method]
>     function myFixPayloadTypes(senderParams, originalReceiverParams) {
>       // TODO: loop through the sender params and then secondarily loop through the
>       // original receiver params and set the sender payload type based upon what is
>       // found in the receiver params.
>       // ...
>       return myFixedSenderParams;
>     }
>
> Comments
>
> The sender includes exact SSRC information and signals that to the remote receiver. The issue is that the actual sender is going to base its sending params upon the receiver params of the remote party, which do not contain a specific SSRC (or contain a different SSRC). Thus the SSRC has to be stripped from the received sender params, or they will not match, the receiver won't latch onto the incoming stream, and the latching must occur by payload type instead. The secondary problem is that the sender is actually using the payload types as defined by the remote party's receiver, but the receiver is basing its payload types on the remote party's sender. This means the payload types might mismatch and the latching based on payload types may not occur. To fix this problem the web developer has to fix either the sender's payload types or the receiver's payload types.
>
> Proposed Capabilities Based API
>
> Step 1: (Alice / Bob)
>
>     // [Alice]
>     var aliceSenderCaps = RTCRtpSender.getCapabilities();
>     var aliceReceiverCaps = RTCRtpReceiver.getCapabilities();
>     mysignal(aliceSenderCaps);
>     mysignal(aliceReceiverCaps);
>
>     // [Bob]
>     var bobSenderCaps = RTCRtpSender.getCapabilities();
>     var bobReceiverCaps = RTCRtpReceiver.getCapabilities();
>     mysignal(bobSenderCaps);
>     mysignal(bobReceiverCaps);
>
> Step 2: (Alice / Bob)
>
>     // [Alice]
>     var bobSenderCaps = mysignal();
>     var bobReceiverCaps = mysignal();
>     var aliceTrack = myObtainMediaTrack();
>     var aliceReceiverParams = RTCRtpReceiver.createParameters("video", bobSenderCaps);
>     var aliceReceiver = new RTCRtpReceiver(...);
>     aliceReceiver.receive(aliceReceiverParams);
>     var aliceSenderParams = RTCRtpSender.createParameters(aliceTrack, bobReceiverCaps);
>     var aliceSender = new RTCRtpSender(...);
>     aliceSender.send(aliceSenderParams);
>
>     // [Bob]
>     var aliceSenderCaps = mysignal();
>     var aliceReceiverCaps = mysignal();
>     var bobTrack = myObtainMediaTrack();
>     var bobReceiverParams = RTCRtpReceiver.createParameters("video", aliceSenderCaps);
>     var bobReceiver = new RTCRtpReceiver(...);
>     bobReceiver.receive(bobReceiverParams);
>     var bobSenderParams = RTCRtpSender.createParameters(bobTrack, aliceReceiverCaps);
>     var bobSender = new RTCRtpSender(...);
>     bobSender.send(bobSenderParams);
>
> Comments
>
> The receiver is able to latch onto the sender based on payload type alone. Unlike the current API, there's no need to strip SSRCs and no need to fiddle with and fix the payload types. The code is cleaner and clearer as to what's going on, and it does not presume the application level programmer knows why payload types need to match or why SSRCs need to be stripped.
>
> Alice wants to use SVC (Scalable Video Coding) to send to Bob
>
> This is for illustration purposes only. The typical benefits of SVC are greater in conference scenarios than in traditional point to point scenarios. However, this scenario can presume that an intermediary conferencing bridge sits between Alice and Bob.
> Current Parameter Based API
>
> Step 1: (Alice)
>
>     var senderCaps = RTCRtpSender.getCapabilities();
>     mysignal(senderCaps);
>
> Step 2: (Bob)
>
>     var senderCaps = mysignal();
>     var receiverCaps = RTCRtpReceiver.getCapabilities();
>     var receiverParams = RTCRtpReceiver.createParameters("video", receiverCaps);
>     receiverParams = RTCRtpReceiver.filterParameters(receiverParams, senderCaps);
>     receiverParams = mySetupSVC(receiverParams);
>     var receiver = new RTCRtpReceiver(...);
>     receiver.start(receiverParams);
>     mysignal(receiverParams);
>
>     function mySetupSVC(receiverParams) {
>       // 1. search the receiver params for a codec capable of SVC based on pre-knowledge
>       //    of the codec types
>       // 2. set up SVC params based on the codec's capabilities
>
>       // TODO - step 1 - code needs to be added here to do this logic
>       var chosenCodec = "h264svc";  // hard code for now
>
>       // TODO: Not sure this code is even right. How does this layer scale even work?
>       // How is temporal and spatial layering defined together? Don't see a knob for
>       // setting up temporal SVC...
>       receiverParams.receiverId = "foo";
>       receiverParams.encodings[0] = { "codecName": chosenCodec, "scale": 0.125, "encodingId": "0" };
>       receiverParams.encodings[1] = { "scale": 0.25, "dependencyEncodingIds": ["0"] };
>       receiverParams.encodings[2] = { "scale": 0.5, "dependencyEncodingIds": ["0", "1"] };
>       return receiverParams;
>     }
>
> Step 3: (Alice)
>
>     var receiverParams = mysignal();
>     var senderParams = RTCRtpSender.filterParameters(receiverParams, senderCaps);
>     var track = myObtainMediaTrack();
>     var sender = new RTCRtpSender(track, ...);
>     sender.start(senderParams);
>
> Comments
>
> The application developer has to have a ton of presumed knowledge about available codecs and codec capabilities, and needs a deep understanding of how the engine interprets the layering information. The sender cannot set up the SVC parameters desired because it doesn't know the receiver capabilities. The sample above may not work for SVC codecs which put each layer on a unique SSRC, because the receiver did not necessarily pre-dictate the expected SSRCs for each layer, so the application developer would have to handle this situation too and assign SSRCs for each layer manually based on knowledge that the codec behaves in this manner. The method to set up temporal or quality SVC is unclear. Appropriate parameter knobs for the application developer appear to be missing.
>
> Proposed Capabilities Based API
>
> Step 1: (Alice)
>
>     var senderCaps = RTCRtpSender.getCapabilities();
>     var senderPrefs = {
>       "receiverId": "foo",
>       "frameRateScalabilityOptions": {"layers": 2},
>       "scalingScalabilityOptions": {"layers": 2}
>     };
>     mysignal(senderCaps);
>     mysignal(senderPrefs);
>
> Step 2: (Bob)
>
>     var senderCaps = mysignal();
>     var senderPrefs = mysignal();
>     var receiverParams = RTCRtpReceiver.createParameters("video", senderCaps, senderPrefs);
>     var receiver = new RTCRtpReceiver(...);
>     receiver.start(receiverParams);
>     mysignal(receiverParams.receiverCapabilities);
>
> Step 3: (Alice)
>
>     var track = myObtainMediaTrack();
>     var receiverCaps = mysignal();
>     var senderParams = RTCRtpSender.createParameters(track, receiverCaps, senderPrefs);
>     var sender = new RTCRtpSender(track, ...);
>     sender.start(senderParams);
>
> Comments
>
> The application developer doesn't require pre-knowledge of the codecs. The developer can quickly and easily specify the types of SVC properties desired with much simpler knobs. The developer doesn't have to worry about whether a codec assigns each layer a unique SSRC, or whether the layering ends up being dynamic.
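As an aside (not part of the original proposal), one possible shape for the myFixPayloadTypes() helper left as a TODO in the parallel-exchange example above might be the following sketch. It assumes the "current API" parameter objects carry a codecs[] list with name/payloadType members, which is an assumption for illustration rather than something the proposal defines:

    // Hypothetical sketch: align the sender's payload types with the payload types the
    // remote receiver chose for the same codec names, so latching by payload type can work.
    function myFixPayloadTypes(senderParams, originalReceiverParams) {
      var byName = {};
      originalReceiverParams.codecs.forEach(function(codec) {
        byName[codec.name] = codec.payloadType;    // remember the receiver's choice per codec
      });
      senderParams.codecs.forEach(function(codec) {
        if (byName[codec.name] !== undefined) {
          codec.payloadType = byName[codec.name];  // overwrite to match the receiver
        }
      });
      return senderParams;
    }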
> Conclusion
>
> Overall the proposed capabilities based API has strong advantages. The main advantages are:
>
> 1. Simplicity in setup based on "preferences" for the application developer
> 2. Less brittle designs/implementations since low level parameters are not exchanged, filtered, and interpreted by different browser engines
> 3. Much less knowledge (and often no pre-knowledge) is required for the application developer to take full advantage of a browser's capabilities
>
> There's no strong reason to maintain the current API. The biggest difference is that browsers will need to generate compatible parameters based on capabilities, but that also comes with the big advantage that browser engines do not need to interpret and filter low level parameters produced by other browser engines. Both the new and the current model use low level parameters to send or receive, so that design aspect remains unchanged.
>
> Advantages of Current Parameter Based API
>
> 1. Browser engines do not need to generate parameters from capabilities in a "compatible" manner (although low level parameters do need to be filtered in a "compatible" manner, so this is not a strong advantage).
>
> Disadvantages of Current Parameter Based API
>
> 1. The application developer needs pre-knowledge of SVC codecs to be able to choose them and set up their properties based upon pre-knowledge of codec capabilities.
> 2. The application developer needs a deep understanding of how layering works to set up the layering properties correctly.
> 3. Browser engines need to agree on how to filter low level parameters based upon capabilities in a consistent manner across browsers to ensure compatibility.
> 4. Browser engines need to agree on how to interpret low level parameter objects that were generated by other browsers (or other applications).
> 5. Low level parameter based exchanges introduce greater brittleness between browsers, since extending the parameter details could mean breaking existing implementations (unlike capabilities, which are typically ignored when not understood).
> 6. Less innovation / greater brittleness for anything that requires parameter object extensions, since many browsers as well as applications will be fiddling with, exchanging, and filtering these low level parameter objects.
> 7. Simulcasting with layering doesn't appear to be supported, or it's not obvious how to set up those scenarios.
> 8. It is unclear how to mix and match different SVC modes (e.g. temporal, spatial, and quality).
> 9. The application developer is uncertain, based upon their preferences, what the browser engine is capable of delivering (without a deep understanding of all codecs and their properties).
> 10. Header extensions need manual setup by the application developer, even though the developer may not know that codecs or engines might need certain extensions to take advantage of codec or browser engine features.
>
> Advantages of Proposed Capabilities Based API
>
> 1. The application developer can easily set up SVC without needing a detailed understanding.
> 2. Typical and even advanced use cases do not require a deep understanding of RTC to be able to take advantage of capabilities.
> 3. Less brittle implementations, as low level parameter objects are only consumed locally by the browsers that generate them, or only in situations where specific compatibility with legacy systems is required and the default generated low level parameters would not be compatible.
> 4. Simulcast with layering is supported.
> 5. It is easy for the application developer to mix and match different SVC modes (e.g. temporal, spatial, and quality).
> 6. It is easy to extend support for alternative SVC scalability modes (e.g. colour depth, sharpness, ROI).
> 7. The application developer knows what the browser engine is capable of delivering given a set of preferences (from the resultant preferences returned by "createParameters(...)").
> 8. Header extensions can be automatically set up based on the needs and capabilities of the browser's RTP engine and codecs.
>
> Disadvantages of Proposed Capabilities Based API
>
> 1. Browser engines need to agree on how to compute "compatible" parameters for a given codec and media preferences. The rules for generation of parameters must be clear.
>
> Equal Capabilities of Current and Proposed API
>
> 1. The application developer can always tweak low level properties on an "as needed" basis for compatibility.
> 2. Both the new and current proposals send and receive based on lower level parameters (this does not change).
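To illustrate the first "equal capability" point, a small hypothetical sketch of what such an "as needed" tweak might look like under the proposed model, using only the details/codecDetails/payloadType members defined in the proposal (the codec name and payload type value here are arbitrary):

    // Generate parameters from capabilities, then override one low level detail (the
    // payload type of the preferred codec) for interop with a legacy system that insists
    // on a fixed payload type, before starting the sender.
    var senderParams = RTCRtpSender.createParameters(track, receiverCaps);
    var codecs = senderParams.details.codecDetails;  // see RTCRtpParameterVideoDetails above
    if (codecs.length > 0 && codecs[0].codecName === "VP8") {
      codecs[0].payloadType = 100;                   // "as needed" tweak; use with caution
    }
    var sender = new RTCRtpSender(...);
    sender.start(senderParams);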
Received on Thursday, 8 May 2014 19:59:12 UTC