Proposal-ORTC Sender / Receiver Capabilities Based Model from Robin Raymond on 2014-05-08 (public-ortc@w3.org from May 2014)

From: Robin Raymond <robin@hookflash.com>
Date: Thu, 08 May 2014 00:13:06 -0400
To: "public-ortc@w3.org" <public-ortc@w3.org>
Message-ID: <536B0452.9080505@hookflash.com>
I am contributing a proposal on how to resolve an issue discovered in 
the usage of "parameters". While the details can always be tweaked, I 
think it successfully resolves much of the concern around the level and 
knowledge required to configure a "parameters" object for anything other 
than the basic use cases.

In response to this posting:
http://lists.w3.org/Archives/Public/public-ortc/2014May/0007.html

This also addresses the issue of exchanging detailed parameters over the
wire by instead deriving parameters from capabilities.

I am going to copy the entire proposal below to officially contribute it,
but for the sake of readability I am also including links to
the Google doc(s).

Proposal-ORTC Sender / Receiver Capabilities Based Model
https://docs.google.com/document/d/1htyRaNjXTE_O1GhD8TcLCNXFvVsgszpE8Lqgp3OCHlU/edit?usp=sharing

Proposal-ORTC Sender / Receiver Use Case [Usage Comparison Analysis]
https://docs.google.com/document/d/1hdhCHj-gpwv06vIbAftxMG3oZtz7A-nuYsuwQEkTat4/edit?usp=sharing




Proposal-ORTC Sender / Receiver Capabilities Based Model


  Introduction


After attempting to write out some use cases using the existing
RTCRtpSender and RTCRtpReceiver objects and parameters for ORTC, some
issues were discovered. Specifically, the application developer would
need a fair amount of knowledge about exactly how to tweak low
level parameters for anything beyond very simple use cases. For example,
setting up SVC (Scalable Video Codec) would have required the application
developer to know which codecs support SVC, how the layering is set up for
particular codecs, and how to set specific geometric (or temporal)
attributes and layering relationship details.


As a result of the lack of easy configuration of RTP features, the
idea arose to give the application developer "preferences", where the
developer could choose what they desire with high level knobs and
dials and let the engine (which has explicit knowledge of each codec)
configure the low level "parameters" details according to the developer's
wishes. The engine could then return the closest set of preferences that
could be achieved given the capabilities of the engine, and the developer
can then choose whether or not to proceed with setting up media flows
using these preferences and constructed parameters.


Another important discovery was made in the process of defining 
"preferences". If two ORTC engines were given the same set of 
preferences and the capabilities of both sender and receiver, each 
engine could be made to construct "compatible" sender and receiver 
"parameters" details without ever exchanging the parameter details over 
the wire. This small realization about generating "parameters" from 
capabilities for local consumption by an engine has a huge impact. This 
generation removes the need for an engine to understand and filter 
settings that it may not understand created by another engine of unknown 
origin, which may use proprietary and/or custom settings. A simple 
"ignore capabilities you don't understand" rule could replace complex 
and cumbersome rules that would be otherwise required if "parameters" 
were to be sent over the wire and later filtered using a set of 
capabilities.
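
As a rough illustration of the "ignore capabilities you don't understand"
rule, the sketch below drops any remote codec entry the local engine does
not recognise. The capability shape (a "codecs" array of entries with a
"name") and the helper name ignoreUnknownCodecs are assumptions made for
this example only; they are not defined by the proposal.

```javascript
// Hypothetical helper: keep only the remote codecs the local engine knows about.
function ignoreUnknownCodecs(remoteCaps, knownCodecNames) {
  var known = Object.create(null);
  knownCodecNames.forEach(function (name) {
    known[name.toLowerCase()] = true;
  });
  return {
    codecs: (remoteCaps.codecs || []).filter(function (codec) {
      // Entries without a usable name, or with an unknown name, are silently skipped.
      return typeof codec.name === "string" &&
             known[codec.name.toLowerCase()] === true;
    })
  };
}

var remoteCaps = {
  codecs: [{ name: "opus" }, { name: "x-vendor-codec" }, { name: "VP8" }]
};
var usable = ignoreUnknownCodecs(remoteCaps, ["opus", "vp8", "h264"]);
console.log(usable.codecs.map(function (c) { return c.name; }));
```

No filtering rules beyond "skip what is unknown" are needed, which is the
point of the rule.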


Parameters can be generated based on the union of sender and receiver
capabilities, with the application developer's preferences used as
a guideline on how to create the parameters. The engine will do its
best to fulfill the preferences and it will return the parameters that
are possible given the union of the capabilities.
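
A minimal sketch of this idea, assuming a simplified capability object
with a "codecs" array of { name } entries (the real capability layout is
not defined in this message). The helper name chooseCodec is hypothetical.
It shows how two engines given the same capabilities and the same
"codecName" preference could arrive at the same sending codec without
exchanging parameters.

```javascript
// Hypothetical sketch: choose a sending codec from the codecs both sides list.
function chooseCodec(senderCaps, receiverCaps, prefs) {
  var receiverNames = receiverCaps.codecs.map(function (c) { return c.name; });
  // Codecs present in both capability sets, kept in the sender's order.
  var common = senderCaps.codecs.filter(function (c) {
    return receiverNames.indexOf(c.name) !== -1;
  });
  if (common.length === 0) {
    return null;
  }
  // Honour the preferred codec when it is usable, otherwise fall back
  // to the first common codec.
  var preferred = common.filter(function (c) {
    return Boolean(prefs) && c.name === prefs.codecName;
  });
  return (preferred[0] || common[0]).name;
}

var senderCaps = { codecs: [{ name: "VP8" }, { name: "H264" }] };
var receiverCaps = { codecs: [{ name: "H264" }, { name: "VP8" }, { name: "VP9" }] };

console.log(chooseCodec(senderCaps, receiverCaps, { codecName: "H264" })); // preference honoured
console.log(chooseCodec(senderCaps, receiverCaps, { codecName: "VP9" }));  // not common, falls back
```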


Two different engines must be able to compute compatible parameters 
given all the same preferences and capabilities. Fortunately, any two 
engines that understand the same capabilities can easily follow the same 
rules to generate compatible parameters. While the parameters created on 
the sender and receiver are required to be "compatible", they need not 
be identical. The application developer should call 
"createParameters(...)" on sender to create parameters suitable for the 
sender. The application developer should call "createParameters(...)" on 
the receiver to create params suitable for a receiver. The calculated 
"parameters" for both sender and receiver have to be compatible only to 
the extent that whatever a sender produces a receiver must be capable of 
decoding.


The application developer has the option to tweak the detailed 
parameters output by "createParameters(...)" but should only do so with 
extreme caution. The resultant parameters output by 
"createParameters(...)" are only meant for local consumption by the 
local sender / receiver "start" methods.  Sending these created 
parameters over the wire is discouraged because implementations may
produce objects which may not be entirely understandable by the remote
party, even though the media sent on the wire will be compatible.


  Differences from Current Sender / Receiver API


The two models and APIs are more similar than they are different, but the
subtle differences have important implications for behaviour and usage.


Both models send and receive based upon "parameter" settings. The 
difference is in how the "parameters" are generated. The new model 
generates the "parameters" based on an exchange of capabilities and the 
application developer is given convenient 'knobs' called "preferences" 
to perform most common use cases. The "parameters" in the new model are
intended for local consumption only, and the application developer is not
required to marshal these "parameters" over the wire (and is actively
discouraged from doing so). The new model proposes marshalling and
exchanging "capabilities" and optionally "preferences" and then generating
compatible "parameters" based on those exchanges.


In both models, the application developer may choose to tweak low level 
parameters should specific compatibilities be required. But the 
"preferences" model allows most application developers to completely 
ignore the low level parameters.


  Advantages of the New Capabilities Model


Overall, the proposed capabilities based API has strong advantages. The
main advantages are:

 1. Simplicity in setup based on "preferences" for the application developer

 2. Less brittle designs/implementations since low level parameters are
    not exchanged, filtered, and interpreted by different browser engines

 3. Much less knowledge (and often no pre-knowledge) is required for the
    application developer to take full advantage of a browser's capabilities





  RTCRtpSender / RTCRtpReceiver


interface RTCRtpSender {
  // ...

  static RTCRtpParameters createParameters(
    MediaStreamTrack track,
    Capabilities receiverCaps,
    optional (RTCRtpAudioPreferences or
              RTCRtpVideoPreferences or
              RTCRtpSimulcastPreferences) prefs,
    optional Capabilities senderCaps   // optional as system can obtain this information
  );

  void start(RTCRtpParameters params);

  // ...
};


interface RTCRtpReceiver {
  // ...

  static RTCRtpParameters createParameters(
    DOMString kind,
    Capabilities senderCaps,
    optional (RTCRtpAudioPreferences or
              RTCRtpVideoPreferences or
              RTCRtpSimulcastPreferences) prefs,
    optional Capabilities receiverCaps // optional as system can obtain this information
  );

  void start(RTCRtpParameters params);

  // ...
};


  RTCRtpMediaPreferences


// This is the base dictionary used for both audio and video preferences and represents
// the set of common preferences that are available for both media types.
dictionary RTCRtpMediaPreferences {
    // If not specified, system will choose value. If specified, this receiverId will
    // be applied to primary SSRC "as is". If more than one SSRC is needed to encode
    // the stream (e.g. FEC, RTX, MST, simulcast), where the meaning of the RTP packet
    // with that alternative SSRC cannot be determined by the media flow itself, the
    // alternative SSRCs will construct a receiverId value based upon this receiverId
    // value.
    DOMString            receiverId;

    // This is the primary SSRC to use. Should alternative SSRCs be required (e.g. FEC,
    // RTX, MST, simulcast), all other SSRCs should be assigned sequentially starting
    // from the chosen SSRC value.
    unsigned int         ssrc;

    // For a sender, force the chosen codec to be the codec within the RTCRtpCapabilities
    // with this name. If possible to choose this codec, the system will confirm by
    // choosing this codec in the result from "createParameters(...)".
    // This value has no meaning for a receiver since a receiver must be capable
    // of receiving any of the compatible codecs within the union RTCRtpCapabilities.
    // A non specified value indicates the system will choose the preferred sending
    // codec.
    DOMString            codecName;

    // This value indicates the relative importance of the media being sent with a
    // sender versus other media being sent. The logic is that all sent media with
    // the same priority will be treated as having an equal priority. Those with
    // a greater value will be given a greater priority and those with a lower value
    // will be given a lower priority. The value is relative, meaning a value of 2.0
    // should be given roughly 2 times the priority vs a 1.0 value and a value of 4.0
    // should be given roughly 4 times the priority vs a 1.0 value.
    double               relativePriority = 1.0;

    // This value indicates the maximum bit rate the media is allowed to output as
    // a combined whole (including all layers, FEC, RTX, etc). The system will filter
    // out codecs that are not capable of delivering below this bit rate unless no
    // codec is possible, in which case the system will choose the minimal codec bit rate
    // possible and will override with a different maximum bit rate in the result of
    // "createParameters(...)".
    double               maxBitrate;              // engine, keep under this rate

    // These values indicate the preferred treatment of FEC/RTX for the RTP packets. For
    // audio, some audio codecs have built in FEC/RTX mechanisms, in which case if the
    // codec is capable, the codec should enable its FEC/RTX mode if value is set to all
    // for that codec rather than creating an additional RTP flow.
    RTCRtpRecoveryOptions fec = "none";
    RTCRtpRecoveryOptions rtx = "none";
};
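
To illustrate the "assigned sequentially starting from the chosen SSRC
value" rule for the ssrc preference, here is a small sketch. The helper
name assignSsrcs and the list of roles are hypothetical, and wrapping
around at 2^32 is an assumption of this example (SSRCs are 32-bit values),
not something the proposal states.

```javascript
// Hypothetical sketch: derive alternative SSRCs (e.g. FEC, RTX) from the primary SSRC.
function assignSsrcs(primarySsrc, roles) {
  var result = { primary: primarySsrc >>> 0 };
  roles.forEach(function (role, index) {
    // ">>> 0" keeps the value in the unsigned 32-bit range (wraps at 2^32).
    result[role] = (primarySsrc + index + 1) >>> 0;
  });
  return result;
}

var ssrcs = assignSsrcs(1000, ["fec", "rtx"]);
console.log(ssrcs); // primary 1000, fec 1001, rtx 1002
```

Because both engines apply the same rule to the same primary SSRC, neither
side needs to signal the alternative SSRCs.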


enum RTCRtpRecoveryOptions {
    "all",     // apply to all layers
    "base",    // only apply for base (audio will treat "base" as equivalent to "all")
    "none"     // do not apply to any layer
};
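
One possible reading of relativePriority, sketched below, is a
proportional split of the available bit rate between senders. This is
only an illustration: the proposal leaves the exact meaning of "priority"
to the engine, and the helper name shareBitrate and the stream objects are
hypothetical.

```javascript
// Hypothetical sketch: split a total bit rate in proportion to relativePriority.
function shareBitrate(totalBitrate, streams) {
  var sum = streams.reduce(function (acc, s) {
    return acc + s.relativePriority;
  }, 0);
  return streams.map(function (s) {
    return {
      id: s.id,
      bitrate: sum > 0 ? (totalBitrate * s.relativePriority) / sum : 0
    };
  });
}

var shares = shareBitrate(700000, [
  { id: "audio", relativePriority: 4.0 },
  { id: "video", relativePriority: 2.0 },
  { id: "screen", relativePriority: 1.0 }
]);
console.log(shares); // audio gets 4x and video 2x the share of screen
```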



  RTCRtpAudioPreferences


dictionary RTCRtpAudioPreferences : RTCRtpMediaPreferences {
    // If not 0, tells the engine to pick and configure codecs that are capable of
    // the minimum number of channels (if possible). If not possible, the minimum number
    // of channels will be returned in the result of "createParameters(...)".
    unsigned int         minChannels = 0;

    // If not 0, tells the engine to pick and configure codecs which are
    // capable of delivering the minimum Hz rate as indicated. If not possible, the
    // minimum Hz rate will be returned in the result of "createParameters(...)".
    unsigned int         minHzRate = 0;

    // The engine will choose and configure the codecs best able to deliver the level
    // of fidelity requested.
    RTCRtpAudioFidelity  fidelity = "speech";
};


enum RTCRtpAudioFidelity {
    "speech",   // speech only is expected so Hz range only needs to support the vocal range
    "music",    // music is expected, choose stereo compatible and minimal 32000 Hz
    "movie"     // music / sound effects expected, choose surround and highest Hz available
};
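
As a sketch of how an engine might act on the fidelity hint, the example
below picks a codec from a list of RTCRtpCodec-like entries (name,
clockRate, numChannels). The helper name pickAudioCodec, the sample codec
list (including the made-up "hypothetical-surround" entry), and the exact
selection rules are assumptions for illustration; a real engine would
apply its own codec knowledge.

```javascript
// Hypothetical sketch: map an RTCRtpAudioFidelity value onto a codec choice.
function pickAudioCodec(codecs, fidelity) {
  if (fidelity === "movie") {
    // Prefer the most channels, then the highest clock rate.
    var sorted = codecs.slice().sort(function (a, b) {
      return (b.numChannels - a.numChannels) || (b.clockRate - a.clockRate);
    });
    return sorted[0] || null;
  }
  // "music" asks for stereo and at least 32000 Hz; "speech" accepts anything.
  var min = fidelity === "music" ? { channels: 2, hz: 32000 }
                                 : { channels: 1, hz: 0 };
  var matches = codecs.filter(function (c) {
    return c.numChannels >= min.channels && c.clockRate >= min.hz;
  });
  return matches[0] || null;
}

var audioCodecs = [
  { name: "PCMU", clockRate: 8000, numChannels: 1 },
  { name: "opus", clockRate: 48000, numChannels: 2 },
  { name: "hypothetical-surround", clockRate: 48000, numChannels: 6 }
];
console.log(pickAudioCodec(audioCodecs, "music").name);
```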



  RTCRtpVideoPreferences (and related)


dictionary RTCRtpVideoPreferences : RTCRtpMediaPreferences {

    // minFrameRate, minScale, and minQuality each indicate that the engine must make
    // its best effort to keep the frame rate, scale or quality above a certain minimal
    // level. When using SVC, these values will hint at the requirements typically needed
    // for the base layer.
    //

    // minFrameRate is specified in frames per second.
    double     minFrameRate = 0;    // please engine, keep equal or above this rate
    // minScale is a relative value from 0.0 to 1.0 where 1.0 means the full input stream
    // width/height is requested and 0.0 means no minimum size is requested.
    // The value of minScale is multiplied by the source video window width and height
    // to calculate a minimal width and height that is relative to source size.
    double     minScale = 0;        // please engine, keep equal or above this scale
    // Alternatively, a specific fixed minimal width and height can be requested.
    double     minWidth = 0;        // please engine, keep above X pixels wide
    double     minHeight = 0;       // please engine, keep above Y pixels high
    // minQuality is a relative value from 0.0 to 1.0 where 1.0 means maximum output
    // quality is requested for a given codec and 0.0 means any minimal codec quality
    // output is deemed acceptable.
    double     minQuality = 0;      // please engine, keep equal or above this quality

    // The engine needs values to help decide what to sacrifice when network conditions
    // are not ideal. The frameRatePriority, scalePriority, and qualityPriority indicate
    // the relative importance of each aspect of the video relative to the others (or
    // 0.0, which means the video aspect has no significance, excluding the minimums
    // above). The values are relative to each other, thus a value of 2.0 vs 1.0 has
    // roughly 2 times the importance and a value of 4.0 vs 1.0 has roughly 4 times the
    // importance (relatively speaking).
    double    frameRatePriority = 1.0; // priority of frame rate
    double    scalePriority = 1.0;     // priority of scale
    double    qualityPriority = 1.0;   // priority of quality

    // If a type of SVC layering is desired, the frameRateScalabilityOptions,
    // scalingScalabilityOptions, and qualityScalabilityOptions should be set to a
    // non-null value for each SVC type desired. The details of the
    // RTCRtpScalabilityOptions dictionary will indicate the desired details for
    // each individual SVC type requested.
    //
    // Default of null indicates no SVC of specific type is requested.
    RTCRtpScalabilityOptions? frameRateScalabilityOptions = null;
    RTCRtpScalabilityOptions? scalingScalabilityOptions = null;
    RTCRtpScalabilityOptions? qualityScalabilityOptions = null;
};


dictionary RTCRtpScalabilityOptions {
    // If a value other than the default value of null is specified, this
    // indicates to the engine the precise number of layers desired (if possible for a
    // given codec to deliver these layers). If null, the engine is free to choose
    // the default layering statically or dynamically dependent upon the codec
    // capabilities.
    unsigned int?     layers = null;
};
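
The minScale calculation described in RTCRtpVideoPreferences can be
sketched as follows. The helper name minimumSize is hypothetical, and
taking the larger of the scaled size and the fixed minWidth / minHeight
when both are given is an assumption of this example; the proposal only
says the fixed values are an alternative.

```javascript
// Hypothetical sketch: turn minScale / minWidth / minHeight into a minimum frame size.
function minimumSize(prefs, sourceWidth, sourceHeight) {
  var scale = prefs.minScale || 0;
  return {
    // minScale is multiplied by the source width and height ...
    width: Math.max(Math.round(sourceWidth * scale), prefs.minWidth || 0),
    // ... and a fixed minimum, when given, acts as a floor (assumption).
    height: Math.max(Math.round(sourceHeight * scale), prefs.minHeight || 0)
  };
}

var halfSize = minimumSize({ minScale: 0.5 }, 1280, 720);
console.log(halfSize.width + "x" + halfSize.height); // 640x360
```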


  RTCRtpSimulcastPreferences


dictionary RTCRtpSimulcastPreferences {
    // This value indicates the maximum bit rate all media is allowed to output as
    // a combined total for all simulcast streams.
    double?               maxBitrate = null;           // engine, keep under this rate

    sequence<RTCRtpVideoPreferences> simulcastStreams;
};



  RTCRtpParameters


// Typically this object is constructed by the RTCRtpSender for local consumption by
// the RTCRtpSender and by the RTCRtpReceiver for local consumption by a RTCRtpReceiver.
// This is a "shotgun" object, meaning the developer is given the power of a "shotgun"
// pointed at their feet and they can mess with this object at their own peril should
// they need to modify it for unusual compatibility reasons. Normal use cases should not
// require modifying the values within this structure and marshalling this structure for
// remote consumption by another browser engine is highly discouraged.
dictionary RTCRtpParameters {
    // When returned as a result, the system will express the actual chosen preferences
    // possible to best fulfill the preferences given the capabilities. In other words,
    // the developer can't always get what they want; but if they try sometimes, they will
    // get what they need.
    (RTCRtpAudioPreferences or
     RTCRtpVideoPreferences or
     RTCRtpSimulcastPreferences) preferences;

    // the capabilities of both sender and receiver [value "as is" when passed to
    // "createParameters(...)"]
    RTCRtpCapabilities senderCapabilities;
    RTCRtpCapabilities receiverCapabilities;

    // This value contains all the particularly low level details of how the engine
    // will encode the media on the wire.
    (RTCRtpParameterAudioDetails or
     RTCRtpParameterVideoDetails or
     RTCRtpParameterSimulcastDetails) details;

    // The chosen RTP features based upon the union of the capabilities.
    Settings rtpFeatures;

    // The chosen RTP extensions and configurations based upon the union of
    // the capabilities.
    sequence<RTCRtpHeaderExtensionParameters>? headerExtensions = null;
};



  RTCRtpParameterDetails


// This is the base dictionary of common parameters needed for both audio and video media
// types. Audio and video will each have their own set of specific parameters depending
// upon the media type.
dictionary RTCRtpParameterDetails {
    DOMString        receiverId = "";    // use this receiver ID for RTP stream ("" = N/A)
    unsigned int?    ssrc = null;        // using this SSRC for RTP stream

    DOMString        fecReceiverId = ""; // use this receiver ID for FEC RTP ("" = N/A)
    unsigned int?    fecSsrc = null;     // using this SSRC for FEC (null = N/A)
    Settings         fec;                // modes of operation related to FEC

    DOMString        rtxReceiverId = ""; // use this receiver ID for RTX RTP ("" = N/A)
    unsigned int?    rtxSsrc = null;     // using this SSRC for RTX (null = N/A)
    Settings         rtx;                // modes of operation related to RTX

    // null for a sender. For a receiver, this must contain the source SSRC to
    // use for RTCP Receiver Reports (RRs).
    unsigned int?    rtcpSsrc = null;

    // If true, the engine will mux RTCP with RTP on the same RTCIceTransport. If false,
    // the engine will send RTCP reports on the associated RTCP RTCIceTransport component.
    boolean          rtcpMux = true;
};



  RTCRtpParameterAudioDetails (and related)


dictionary RTCRtpParameterAudioDetails : RTCRtpParameterDetails {

    // Contains a list of audio codec options, one per usable codec. The order
    // of the codecs is in preferred order.
    sequence<RTCRtpParameterAudioCodecDetails> codecDetails;
};


dictionary RTCRtpParameterCodecDetails {
    // The name of the codec as related to the codec name(s) contained within the codecs
    // listed within the RTCRtpCapabilities dictionaries.
    DOMString       codecName;

    unsigned byte   payloadType;       // actual payload type sent on wire
    Settings        formatsParameters; // detailed settings chosen for related codec
};


dictionary RTCRtpParameterAudioCodecDetails : RTCRtpParameterCodecDetails {
    // nothing required at this time?
};


  RTCRtpParameterVideoDetails (and related)


dictionary RTCRtpParameterVideoDetails : RTCRtpParameterDetails {

    double           scale = 1.0;        // 0..1 relative scale from source
    double           frameRate = 1.0;    // 0..1 relative frame rate from source
    double           quality = 1.0;      // 0..1 relative quality from source

    // Contains a list of video codec options, one per usable codec. The order
    // of the codecs is in preferred order.
    sequence<RTCRtpParameterVideoCodecDetails> codecDetails;
};


dictionary RTCRtpParameterVideoCodecDetails : RTCRtpParameterCodecDetails {

    // When layering is used, this value contains a sequence containing the layer
    // information as needed for the related codec.
    sequence<RTCRtpParameterVideoLayerDetails>? layers = null;
};


dictionary RTCRtpParameterVideoLayerDetails {
    // Value is set if required for describing the dependency tree information for
    // the codec's layers.
    DOMString              layerId = "";

    // Value is null for the base layer or if dependencies do not need to be
    // described (as may be the case for dynamic SVC codecs). If set, the value
    // contains a list of layers this layer is dependent upon (thus allowing a
    // dependency tree/graph to be created).
    sequence<DOMString>?   layerIdDependencies = null;

    RTCRtpScalabilityType? layerScalabilityType = null;  // null would be for base

    DOMString              receiverId = "";  // use this receiver ID in layer ("" = N/A)
    unsigned int?          ssrc = null;      // if layer uses its own SSRC (null = N/A)

    double?                frameRate = null; // framerate for layer (for temporal SVC)
    double?                scale = null;     // scale applied to layer (for spatial SVC)
    double?                quality = null;   // quality applied to layer (for quality SVC)

    DOMString              fecReceiverId = ""; // receiver ID for FEC RTP ("" = N/A)
    unsigned int?          fecSsrc = null;     // using this SSRC for FEC (null = N/A)
    Settings               fec;                // modes of operation related to FEC

    DOMString              rtxReceiverId = ""; // receiver ID for RTX RTP ("" = N/A)
    unsigned int?          rtxSsrc = null;     // using this SSRC for RTX (null = N/A)
    Settings               rtx;                // modes of operation related to RTX
};


enum RTCRtpScalabilityType {
    "temporal",
    "spatial",
    "quality"
};
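
The layerId / layerIdDependencies pair describes a dependency graph. As a
sketch of how that graph could be consumed, the hypothetical helper
layersNeededFor below lists every layer that must be received to decode a
target layer, dependencies first. The layer names in the example are made
up for illustration.

```javascript
// Hypothetical sketch: walk layerIdDependencies to find all layers a target needs.
function layersNeededFor(layers, targetLayerId) {
  var byId = Object.create(null);
  layers.forEach(function (layer) { byId[layer.layerId] = layer; });

  var seen = Object.create(null);
  var needed = [];
  function visit(id) {
    if (seen[id] || !byId[id]) {
      return; // already handled (also guards against cycles) or unknown layer
    }
    seen[id] = true;
    (byId[id].layerIdDependencies || []).forEach(function (dep) { visit(dep); });
    needed.push(id); // dependencies are pushed before the layer itself
  }
  visit(targetLayerId);
  return needed;
}

var svcLayers = [
  { layerId: "base", layerIdDependencies: null },
  { layerId: "t1", layerIdDependencies: ["base"] },
  { layerId: "s1", layerIdDependencies: ["base"] },
  { layerId: "s1t1", layerIdDependencies: ["s1", "t1"] }
];
console.log(layersNeededFor(svcLayers, "s1t1")); // base, s1, t1, s1t1
```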



  RTCRtpParameterSimulcastDetails (and related)


dictionary RTCRtpParameterSimulcastDetails {
    // This sequence contains the details of each simulcasted stream when simulcasting
    // is used or will contain exactly 1 video stream details when not simulcasting.
    sequence<RTCRtpParameterVideoDetails>? simulcastStreams;
};



  RTCRtpCodec Dictionary Tweak


dictionary RTCRtpCodec {
    DOMString     name = "";

    // Added to be able to pick payload type based upon sender or receiver so they match
    // when creating both the sender and receiver parameters.
    unsigned byte preferredPayloadType;

    unsigned int? clockRate = null;
    unsigned int? numChannels = 1;
    Capabilities  formats;
};





Proposal-ORTC Sender / Receiver Use Case [Usage Comparison Analysis]


  Introduction

After attempting to work through examples of code usage using the 
current ORTC sender/receiver API, some issues, concerns and deficiencies 
were discovered. A retuning of the current model was made to attempt to 
address those findings. The differences are illustrated below in code 
examples based on various use cases.


In the first set of use cases for simple application usages, there are 
no advantages to a capabilities model (aside from the reduction of 
complexity an engine might need to implement). As the use cases become 
more involved, advantages begin to show. In the final example which 
illustrates using SVC, the clear advantage of capabilities and 
preferences can be demonstrated.


  Use Cases


    Alice wishes to send media to Bob


      Current Parameter Based API


        Step 1: (Alice)

var track = myObtainMediaTrack();

var senderCaps = RTCRtpSender.getCapabilities();

var senderParams = RTCRtpSender.createParameters(track, senderCaps);


mysignal(senderParams);


        Step 2: (Bob)

var senderParams = mysignal();


var receiverCaps = RTCRtpReceiver.getCapabilities();

var receiverParams = RTCRtpReceiver.filterParameters(senderParams, receiverCaps);


var receiver = new RTCRtpReceiver(...);

receiver.start(receiverParams);


mysignal(receiverParams);


        Step 3: (Alice)

var receiverParams = mysignal();


var senderParams = RTCRtpSender.filterParameters(receiverParams, senderCaps);


var sender = new RTCRtpSender(...);

sender.start(senderParams);


        Comments

Because the sender (i.e. Alice) sent parameters that contained specific
SSRC (and possibly receiver ID) information, the receiver will latch
based upon exact SSRC matching.


      Proposed Capabilities Based API


        Step 1: (Alice)

var senderCaps = RTCRtpSender.getCapabilities();


mysignal(senderCaps);


        Step 2: (Bob)

var senderCaps = mysignal();

var receiverParams = RTCRtpReceiver.createParameters("video", senderCaps);


var receiver = new RTCRtpReceiver(...);

receiver.start(receiverParams);


mysignal(receiverParams.receiverCapabilities);


        Step 3: (Alice)

var track = myObtainMediaTrack();


var receiverCaps = mysignal();

var senderParams = RTCRtpSender.createParameters(track, receiverCaps);


var sender = new RTCRtpSender(...);

sender.start(senderParams);


        Comments

Receiver (Bob) can match an incoming stream because the payload types 
will match and therefore the incoming stream will latch to the receiver 
based on payload type alone.
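
To make "latch based on payload type" concrete, here is a small sketch of
the routing decision. The helper createPayloadTypeDemuxer, the receiver
name, and the packet objects are all hypothetical and stand in for logic
that would live inside the engine.

```javascript
// Hypothetical sketch: route incoming RTP packets to a receiver by payload type only.
function createPayloadTypeDemuxer() {
  var receiverByPayloadType = Object.create(null);
  return {
    // Register the payload types a receiver expects (taken from its parameters).
    register: function (receiverName, payloadTypes) {
      payloadTypes.forEach(function (pt) {
        receiverByPayloadType[pt] = receiverName;
      });
    },
    // Return the matching receiver, or null when nothing matches
    // (the situation an "unhandled RTP" event would report).
    route: function (packet) {
      var name = receiverByPayloadType[packet.payloadType];
      return name === undefined ? null : name;
    }
  };
}

var demuxer = createPayloadTypeDemuxer();
demuxer.register("bobVideoReceiver", [100, 101]);
var routed = demuxer.route({ ssrc: 12345, payloadType: 100 });
var unrouted = demuxer.route({ ssrc: 999, payloadType: 111 });
console.log(routed, unrouted);
```

Note that the SSRC of the packet plays no part in the decision, which is
why Alice never has to signal it.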



    Alice wishes to send media to Bob Using Unhandled Eventing


      Current Parameter Based API


        Step 1: (Alice)

var track = myObtainMediaTrack();

var senderCaps = RTCRtpSender.getCapabilities();

var senderParams = RTCRtpSender.createParameters(track, senderCaps);


mysignal(senderParams);


        Step 2: (Bob)

var senderParams = mysignal();


var receiverCaps = RTCRtpReceiver.getCapabilities();

var templateReceiverParams = RTCRtpReceiver.filterParameters(senderParams, receiverCaps);

templateReceiverParams.encodings[0].receiverId = "";

templateReceiverParams.encodings[0].ssrc = null;


var listener = new RTCRtpListener(...);

listener.onunhandledrtp = function(event) {

  var receiver = new RTCRtpReceiver(...);

  receiver.start(templateReceiverParams);

}


mysignal(templateReceiverParams);


        Step 3: (Alice)

var receiverParams = mysignal();


var senderParams = RTCRtpSender.filterParameters(receiverParams, senderCaps);


var sender = new RTCRtpSender(...);

sender.start(senderParams);


        Comments

Because the sender (i.e. Alice) sent parameters that contained specific
SSRC (and possibly receiver ID) information, the receiver must override
the template receiver params and remove the exact SSRC to attach the
incoming stream by payload type.


      Proposed Capabilities Based API


        Step 1: (Alice)

var senderCaps = RTCRtpSender.getCapabilities();


mysignal(senderCaps);


        Step 2: (Bob)

var senderCaps = mysignal();


var listener = new RTCRtpListener(...);

listener.onunhandledrtp = function(event) {

  var receiverParams = RTCRtpReceiver.createParameters("video", senderCaps);

  var receiver = new RTCRtpReceiver(...);

  receiver.start(receiverParams);

}


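// receiverParams only exists inside the handler above, so signal the
// receiver capabilities directly.
mysignal(RTCRtpReceiver.getCapabilities());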


        Step 3: (Alice)

var track = myObtainMediaTrack();


var receiverCaps = mysignal();

var senderParams = RTCRtpSender.createParameters(track, receiverCaps);


var sender = new RTCRtpSender(...);

sender.start(senderParams);


        Comments

Receiver (Bob) can match an incoming stream because the payload types 
will match and therefore the incoming stream will latch to the receiver 
based on payload type alone.



    Alice / Bob simultaneously exchange information in parallel

To avoid requiring a sequential offer / answer exchange, Alice and Bob
wish to simultaneously exchange their RTC information to receive media
from the other party.


      Current Parameter Based API


        Step 1: (Alice / Bob)

// [Alice]

var aliceTrack = myObtainMediaTrack();

var aliceSenderCaps = RTCRtpSender.getCapabilities();

var aliceSenderParams = RTCRtpSender.createParameters(aliceTrack, 
aliceSenderCaps);


var aliceReceiverCaps = RTCRtpReceiver.getCapabilities();

var aliceReceiverParams = RTCRtpReceiver.createParameters("video", 
aliceReceiverCaps);


mysignal(aliceSenderParams);

mysignal(aliceReceiverParams);


// [Bob]

var bobTrack = myObtainMediaTrack();

var bobSenderCaps = RTCRtpSender.getCapabilities();

var bobSenderParams = RTCRtpSender.createParameters(bobTrack, 
bobSenderCaps);


var bobReceiverCaps = RTCRtpReceiver.getCapabilities();

var bobReceiverParams = RTCRtpReceiver.createParameters("video", 
bobReceiverCaps);


mysignal(bobSenderParams);

mysignal(bobReceiverParams);


        Step 2: (Alice / Bob)

// [Alice]

var bobSenderParams = mysignal();

var bobReceiverParams = mysignal();


bobSenderParams = RTCRtpReceiver.filterParameters(bobSenderParams, aliceReceiverCaps);

bobSenderParams.encodings[0].receiverId = "";

bobSenderParams.encodings[0].ssrc = null;

bobSenderParams = myFixPayloadTypes(bobSenderParams, aliceReceiverParams);


var aliceReceiver = new RTCRtpReceiver(...);

aliceReceiver.receive(bobSenderParams);


bobReceiverParams = RTCRtpSender.filterParameters(bobReceiverParams, aliceSenderCaps);

var aliceSender = new RTCRtpSender(...);

aliceSender.send(bobReceiverParams);


// [Bob]

var aliceSenderParams = mysignal();

var aliceReceiverParams = mysignal();


aliceSenderParams = RTCRtpReceiver.filterParameters(aliceSenderParams, bobReceiverCaps);

aliceSenderParams.encodings[0].receiverId = "";

aliceSenderParams.encodings[0].ssrc = null;

aliceSenderParams = myFixPayloadTypes(aliceSenderParams, bobReceiverParams);


var bobReceiver = new RTCRtpReceiver(...);

bobReceiver.receive(aliceSenderParams);


aliceReceiverParams = RTCRtpSender.filterParameters(aliceReceiverParams, bobSenderCaps);

var bobSender = new RTCRtpSender(...);

bobSender.send(aliceReceiverParams);


//---------------------------------

// [Alice and Bob need this method]


function myFixPayloadTypes(senderParams, originalReceiverParams) {

   // TODO: loop through sender params and then secondarily loop through

   // original receiver params and set the sender payload type based upon

   // what is found in the receiver params.

   // ...

   return myFixedSenderParams;

}


        Comments

The sender includes exact SSRC information and signals that to the
remote receiver. The issue is that the actual sender is going to base its
sending params upon the receiver params of the remote party, which do not
contain a specific SSRC (or contain a different SSRC). Thus the SSRC
has to be stripped from the received sender params or they will not
match and the receiver won't latch onto the incoming stream, as the
latching must occur by payload type instead.


The secondary problem is that the sender is actually using the payload
types as defined by the remote party's receiver but the receiver is
basing its payload types upon the remote party's sender. This
means the payload types might mismatch and the latching based on payload
types may not occur. To fix this problem the web developer has to fix
either the sender's payload types or the receiver's payload types.
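
For illustration, the payload type fix-up left as a TODO in
myFixPayloadTypes could look roughly like the sketch below. It assumes
each params object carries a "codecs" array of { name, payloadType }
entries; that layout and the name fixPayloadTypesSketch are assumptions of
this example, not a statement of the current API's structure.

```javascript
// Hypothetical sketch: copy payload types from the original receiver params
// onto the matching codecs of the received sender params.
function fixPayloadTypesSketch(senderParams, originalReceiverParams) {
  var payloadTypeByName = Object.create(null);
  originalReceiverParams.codecs.forEach(function (codec) {
    payloadTypeByName[codec.name] = codec.payloadType;
  });
  return {
    codecs: senderParams.codecs.map(function (codec) {
      var pt = payloadTypeByName[codec.name];
      // Keep the sender's value when the receiver params do not list the codec.
      return {
        name: codec.name,
        payloadType: pt !== undefined ? pt : codec.payloadType
      };
    })
  };
}

var remoteSenderParams = { codecs: [{ name: "VP8", payloadType: 100 },
                                    { name: "H264", payloadType: 101 }] };
var localReceiverParams = { codecs: [{ name: "VP8", payloadType: 96 }] };
var fixedParams = fixPayloadTypesSketch(remoteSenderParams, localReceiverParams);
console.log(fixedParams.codecs);
```

Every application would need some version of this boilerplate under the
parameter based model.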


      Proposed Capabilities Based API


        Step 1: (Alice / Bob)

// [Alice]

var aliceSenderCaps = RTCRtpSender.getCapabilities();

var aliceReceiverCaps = RTCRtpReceiver.getCapabilities();


mysignal(aliceSenderCaps);

mysignal(aliceReceiverCaps);


// [Bob]

var bobSenderCaps = RTCRtpSender.getCapabilities();

var bobReceiverCaps = RTCRtpReceiver.getCapabilities();


mysignal(bobSenderCaps);

mysignal(bobReceiverCaps);


        Step 2: (Alice / Bob)

// [Alice]

var bobSenderCaps = mysignal();

var bobReceiverCaps = mysignal();


var aliceTrack = myObtainMediaTrack();


var aliceReceiverParams = RTCRtpReceiver.createParameters("video", 
bobSenderCaps);

var aliceReceiver = new RTCRtpReceiver(...);

aliceReceiver.receive(aliceReceiverParams);


var aliceSenderParams = RTCRtpSender.createParameters(aliceTrack, 
bobReceiverCaps);

var aliceSender = new RTCRtpSender(...);

aliceSender.send(aliceSenderParams);


// [Bob]

var aliceSenderCaps = mysignal();

var aliceReceiverCaps = mysignal();


var bobTrack = myObtainMediaTrack();


var bobReceiverParams = RTCRtpReceiver.createParameters("video", 
aliceSenderCaps);

var bobReceiver = new RTCRtpReceiver(...);

bobReceiver.receive(bobReceiverParams);


var bobSenderParams = RTCRtpSender.createParameters(bobTrack, 
aliceReceiverCaps);

var bobSender = new RTCRtpSender(...);

bobSender.send(bobSenderParams);


        Comments

The receiver is able to latch onto the sender based on payload type 
alone. Unlike the current API, there's no need to strip SSRCs and no 
need to fiddle and fix the payload type. The code is cleaner and clearer 
as to what's going on and does not presume the application level 
programmer has to know why payload types need to match or why SSRCs need 
to be stripped.
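
To make the matching concrete, here is a minimal sketch of how both sides 
could derive the same payload types from the same pair of capability 
objects. The capability shape (a "codecs" array with "name" and 
"preferredPayloadType") and the selection rule (sender's codec order, 
receiver's payload types) are assumptions for illustration; the proposal 
leaves the exact rules to be agreed between browser engines.

```javascript
// Sketch only. Assumed capability shape:
//   { codecs: [{ name, preferredPayloadType }, ...] }
// Assumed rule: keep codecs both sides support, in the sender's order,
// numbered with the receiver's preferred payload types.
function mySketchCreateParameters(senderCaps, receiverCaps) {
  const receiverByName = new Map(
    receiverCaps.codecs.map((codec) => [codec.name, codec])
  );
  const codecs = senderCaps.codecs
    .filter((codec) => receiverByName.has(codec.name))
    .map((codec) => ({
      name: codec.name,
      payloadType: receiverByName.get(codec.name).preferredPayloadType,
    }));
  return { codecs };
}
```

Because the function is deterministic and each side holds both capability 
objects, Alice's sender and Bob's receiver arrive at identical payload 
types without exchanging the params themselves.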


    Alice wants to use an SVC (Scalable Video Codec) to send to Bob

This is for illustration purposes only. The benefits of SVC are typically 
greater in conference scenarios than in traditional point-to-point 
scenarios. However, this scenario can presume that an intermediate 
conferencing bridge sits between Alice and Bob.


      Current Parameter Based API


        Step 1: (Alice)


var senderCaps = RTCRtpSender.getCapabilities();


mysignal(senderCaps);


        Step 2: (Bob)

var senderCaps = mysignal();


var receiverCaps = RTCRtpReceiver.getCapabilities();

var receiverParams = RTCRtpReceiver.createParameters("video", receiverCaps);

receiverParams = RTCRtpReceiver.filterParams(receiverParams, senderCaps);


receiverParams = mySetupSVC(receiverParams);


var receiver = new RTCRtpReceiver(...);

receiver.start(receiverParams);


mysignal(receiverParams);


function mySetupSVC(receiverParams) {

  // 1. search the receiver params for a codec capable of SVC based on

  //    pre-knowledge of the codec types

  // 2. setup SVC params based on codec's capabilities

  // TODO - step 1 - code needs to be added here to do this logic

  var chosenCodec = "h264svc"; // hard code for now


  // TODO: Not sure this code is even right. How does this layer scale

  // even work? How is temporal and spatial layering defined together?

  // Don't see a knob for setting up temporal SVC...

  receiverParams.receiverId = "foo";

  receiverParams.encodings[0] = {

    "codecName": chosenCodec,

    "scale": 0.125,

    "encodingId": "0"

  };

  receiverParams.encodings[1] = {

    "scale": 0.25,

    "encodingId": "1",

    "dependencyEncodingIds": ["0"]

  };

  receiverParams.encodings[2] = {

    "scale": 0.5,

    "dependencyEncodingIds": ["0", "1"]

  };

  return receiverParams;

}


        Step 3: (Alice)

var receiverParams = mysignal();


var senderParams = RTCRtpSender.filterParameters(receiverParams, 
senderCaps);


var track = myObtainMediaTrack();


var sender = new RTCRtpSender(track, ...);

sender.start(senderParams);


        Comments

The application developer has to have a ton of presumed knowledge about 
available codecs and codec capabilities, and needs to have a deep 
understanding of how the engine interprets the layering information. The 
sender cannot set up the SVC parameters desired because it doesn't know 
the receiver capabilities.
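
The "step 1" TODO in mySetupSVC shows what this pre-knowledge looks like in 
practice. A minimal sketch follows, assuming the receiver params expose a 
"codecs" array with a "name" field (an assumed shape) and that the 
application ships its own hard-coded list of SVC-capable codec names. Only 
"h264svc" comes from the sample above; the helper names are hypothetical.

```javascript
// Hard-coded application knowledge; "h264svc" is the name used in the
// sample above. The application would have to maintain this list itself.
const MY_KNOWN_SVC_CODECS = ["h264svc"];

// Hypothetical helper: returns the first codec name the application knows
// to be SVC-capable, or null when none is offered.
// Assumed shape: receiverParams.codecs = [{ name, ... }, ...].
function myFindSvcCodec(receiverParams) {
  const match = (receiverParams.codecs || []).find((codec) =>
    MY_KNOWN_SVC_CODECS.includes(codec.name)
  );
  return match ? match.name : null;
}
```

Any codec missing from the hard-coded list is invisible to the 
application, even if the engine supports SVC with it.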


The sample above may not work for SVC codecs which put each layer on a 
unique SSRC, because the receiver did not necessarily pre-dictate the 
expected SSRCs on each layer. The application developer would have to 
handle this situation too and assign SSRCs for each layer manually, based 
on knowledge that the codec behaves in this manner.


The method to set up temporal or quality SVC is unclear. Appropriate 
parameter knobs for the application developer appear to be missing.


      Proposed Capabilities Based API


        Step 1: (Alice)

var senderCaps = RTCRtpSender.getCapabilities();


var senderPrefs = {

  "receiverId": "foo",

  "frameRateScalabilityOptions": {"layers": 2},

  "scalingScalabilityOptions": {"layers": 2},

};


mysignal(senderCaps);

mysignal(senderPrefs);


        Step 2: (Bob)

var senderCaps = mysignal();

var senderPrefs = mysignal();


var receiverParams = RTCRtpReceiver.createParameters("video", 
senderCaps, senderPrefs);


var receiver = new RTCRtpReceiver(...);

receiver.start(receiverParams);


mysignal(receiverParams.receiverCapabilities);


        Step 3: (Alice)

var track = myObtainMediaTrack();


var receiverCaps = mysignal();


var senderParams = RTCRtpSender.createParameters(track, receiverCaps, 
senderPrefs);


var sender = new RTCRtpSender(track, ...);

sender.start(senderParams);


        Comments

The application developer doesn't require pre-knowledge of the codecs. 
The developer can quickly and easily specify the types of SVC properties 
desired with much simpler knobs. The developer doesn't have to worry 
whether a codec assigns each layer a unique SSRC or whether the layering 
ends up being dynamic.
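
For illustration, here is a minimal sketch of how an engine might expand 
the two "layers" preferences into a low level encoding list. The expansion 
rule (halving per layer, each layer depending on its lower spatial and 
temporal neighbours), the "frameRateScale" field, and the function name are 
assumptions and not part of the proposal; "scale", "encodingId" and 
"dependencyEncodingIds" are the fields used in the sample above.

```javascript
// Sketch only: the expansion rule here is an assumption for illustration.
function mySketchExpandLayers(prefs) {
  const spatial = (prefs.scalingScalabilityOptions || {}).layers || 1;
  const temporal = (prefs.frameRateScalabilityOptions || {}).layers || 1;
  const encodings = [];
  for (let s = 0; s < spatial; s++) {
    for (let t = 0; t < temporal; t++) {
      // Each layer depends on its lower spatial and temporal neighbours.
      const dependencyEncodingIds = [];
      if (s > 0) dependencyEncodingIds.push(`s${s - 1}t${t}`);
      if (t > 0) dependencyEncodingIds.push(`s${s}t${t - 1}`);
      encodings.push({
        encodingId: `s${s}t${t}`,
        // The top layer runs at full size / full rate; each lower layer halves it.
        scale: 1 / 2 ** (spatial - 1 - s),
        frameRateScale: 1 / 2 ** (temporal - 1 - t), // hypothetical field
        dependencyEncodingIds,
      });
    }
  }
  return encodings;
}
```

With the senderPrefs from Step 1 (two layers of each kind) this yields four 
encodings, none of which the application developer had to write by hand.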


  Conclusion

Overall the proposed capabilities based API has strong advantages. The 
main advantages are:

 1.

    Simplicity in setup based on "preferences" for the application developer

 2.

    Less brittle designs/implementations since low level parameters are
    not exchanged, filtered, and interpreted by different browser engines

 3.

    Much less knowledge (and often no pre-knowledge) is required for the
    application developer to take full advantage of a browser's capabilities


There's no strong reason to maintain the current API. The biggest 
difference will be that browsers will need to generate compatible 
parameters based on capabilities, but that comes with the big advantage 
that browser engines do not need to interpret and filter low 
level parameters from other browser engines. Both the new and current 
APIs use low level parameters to receive or send information, so that 
design aspect remains unchanged.


      Advantages of Current Parameter Based API

 1.

    Browser engines do not need to generate parameters from capabilities
    in a "compatible" manner (although low level parameters do need to
    be filtered in a "compatible" manner so this is not a strong advantage).


      Disadvantages of Current Parameter Based API

 1.

    Application developer needs pre-knowledge of SVC codecs and their
    capabilities to be able to choose a codec and set up its properties

 2.

    Application developer needs deep understanding of how layering works
    to set up the layering properties correctly

 3.

    Browser engines need to agree on how to filter low level parameters
    based upon capabilities in a consistent manner across browsers to
    ensure compatibility

 4.

    Browser engines need to agree how to interpret low level parameter
    objects that were generated by other browsers (or other applications)

 5.

    Low level parameter based exchanges introduce greater brittleness
    between browsers since extending the parameter details could mean
    breaking existing implementations (whereas capabilities are
    typically ignored when not understood)

 6.

    Less innovation / greater brittleness for anything that requires
    parameter object extensions since many browsers as well as
    applications will be fiddling, exchanging, and filtering these low
    level parameter objects.

 7.

    Simulcasting with layering doesn't appear to be supported or it's
    not obvious how to set up those scenarios.

 8.

    Unclear how to mix and match different SVC modes (e.g. temporal,
    spatial, and quality)

 9.

    The application developer cannot tell, given their preferences,
    what the browser engine is capable of delivering (without deep
    understanding of all codecs and their properties).

10.

    Header extensions will need manual setup by the application
    developer despite not knowing that codecs or the engines might need
    certain extensions to take advantage of codec features or browser
    engine features.


      Advantages of Proposed Capabilities Based API

 1.

    Application developer can easily set up SVC without needing detailed
    understanding

 2.

    Typical and even advanced use cases do not require a deep understanding
    of RTC to be able to take advantage of capabilities

 3.

    Less brittle implementations as low level parameter objects are only
    consumed locally by the browsers that generate them, or only in
    situations where specific compatibility with legacy systems is
    required and the default generated low level parameters would
    not be compatible.

 4.

    Simulcast with layering is supported

 5.

    Easy for application developer to mix and match different SVC modes
    (e.g. temporal, spatial, and quality)

 6.

    Easy to extend support for alternative SVC scalability modes (e.g.
    colour depth, sharpness, ROI)

 7.

    Application developer knows what the browser engine is capable of
    delivering given a set of preferences (from the resultant preferences
    as returned from "createParameters(...)")

 8.

    Header extensions can be automatically set up based on needs and
    capabilities of the browser's RTP engines and codecs.


      Disadvantages of Proposed Capabilities Based API

 1.

    Browser engines need to agree on how to compute "compatible"
    parameters for a given codec and media preferences. The rules for
    generation of parameters must be clear.


      Equal Capabilities of Current and Proposed APIs

 1.

    Application developer can always tweak low level properties on an
    "as needed" basis for compatibility

 2.

    Both new and current proposals send and receive based on lower level
    parameters (this does not change).


Received on Thursday, 8 May 2014 04:13:48 UTC