Re: Proposal-ORTC Sender / Receiver Capabilities Based Model from Peter Thatcher on 2014-05-09 (public-ortc@w3.org from May 2014)

From: Peter Thatcher <pthatcher@google.com>
Date: Fri, 9 May 2014 13:59:21 -0700
To: cowwoc <cowwoc@bbs.darktech.org>
Cc: public-ortc@w3.org
Message-ID: <CAJrXDUHszDkH7EGZrxikt5JQ41KEihFci4zrxvRM5p+6SV9+EQ@mail.gmail.com>
We're not making two APIs.  We're making one API that JS libraries can
build on, and they can provide whatever API they want.  But there is a
small amount of convenience in the API that isn't strictly necessary, such
as createParameters and filterParameters.  They make the API usable for
simple cases without a JS library.

So, in your terms, this is the low-level API.  The high-level APIs will
be handled by JS libraries.

On May 9, 2014 1:39 PM, "cowwoc" <cowwoc@bbs.darktech.org> wrote:
>
> I think an API's "usability" is quite important. At the same time, I
agree that we should isolate the API into a low-level and higher-level
component and that this part probably belongs in the upper layer.
>
> Does it make sense to split the API along a low-level (network) and
high-level (use-cases)?
>
> Gili
>
>
> On 09/05/2014 4:08 PM, Peter Thatcher wrote:
>>
>> There's a difference between power and convenience.  We should provide
the power, but don't need to make it overly complex to provide
convenience.  JS libraries on top can provide convenience,  and different
libraries will have different ideas of what is convenient.
>>
>> On May 9, 2014 12:49 PM, "Erik Lagerway" <erik@hookflash.com> wrote:
>>>
>>>
>>> On Fri, May 9, 2014 at 11:14 AM, Peter Thatcher <pthatcher@google.com>
wrote:
>>>>
>>>> OK, I've had to read through the recent threads and this proposal
fairly thoroughly.  My thoughts are:
>>>>
>>>> 1.  Most, maybe all, of this, like createParameters and
filterParameters in the first place, is convenience and could be
implemented in a JS library.  For such things, we should set a high bar for
value vs. cost.
>>>>
>>>> 2.  This adds a huge amount of extra complexity (cost) to the API
surface.  We'd go from having Parameters and Capabilities to
having Capabilities, Parameters, Preferences, Options, Settings, and
Details.  Your first advantage listed is "simplicity".  I must humbly
disagree.  I think this is all very complex.  However, if this is
simplicity from your perspective, then I have good news: you can implement
most (all?) of it in JS as a library :).
>>>>
>>>> 3.  Most of what you are trying to do seems to be to make it easy to
create an SVC setup.  In other words, SVC/simulcast without touching
RtpParameters.  As far as I know, the only use case for SVC/simulcast is
for multiway video.  Multiway video *is not a simple use case* and I don't
think we should be adding lots of convenience stuff to the API to cover
that use case.  We should certainly make it possible to do multiway video
(provide the power), and certainly JS libraries on top can make it easier
(provide the convenience), but the convenience functions built into the
API, such as createParameters and filterParameters, can only cover a
certain set of simple use cases.  Multiway video is beyond what those can
provide, unless there's a really simple way of doing it (so that the value
outweighs the cost).
>>>
>>>
>>> Hmm, my thinking was that one of the big benefits to ORTC over 1.0 was
the SVC/Simulcast bits, if that is the case then we need to clearly define
how one can make use of ORTC to implement support for those types of
features. I'm not saying whether we must do that inside the API or not, but
not defining it somewhere (in detail) would not be helpful either.
>>>
>>>>
>>>>
>>>>
>>>> 4.  Having a simplified "here's the kind of thing I'm looking for" as
an optional parameter to createParameters might be worth it (more value
than cost), but I think it would need to be very simple.  I think that
might be worth exploring in a much more limited capacity.
>>>>
>>>>
>>>> tl;dr: I see a lot of complexity and not a lot of benefit.   The idea
of adding an extra parameter to createParameters might be worth it in a
much simpler version.
>>>>
>>>>
>>>> On Wed, May 7, 2014 at 9:13 PM, Robin Raymond <robin@hookflash.com>
wrote:
>>>>>
>>>>>
>>>>> I am contributing a proposal on how to resolve an issue discovered in
the usage of "parameters". While the details can always be tweaked, I think
it successfully resolves much of the concern around the level and knowledge
required to configure a "parameters" object for anything other than the
basic use cases.
>>>>>
>>>>> In response to this posting:
>>>>> http://lists.w3.org/Archives/Public/public-ortc/2014May/0007.html
>>>>>
>>>>> This also addresses the issue of exchanging detailed parameters over
the wire and instead bases parameters on capabilities.
>>>>>
>>>>> I am going to copy the entire proposal below to officially contribute
the proposal, but for the sake of readability I am also including a link to
the google doc(s).
>>>>>
>>>>> Proposal-ORTC Sender / Receiver Capabilities Based Model
>>>>>
https://docs.google.com/document/d/1htyRaNjXTE_O1GhD8TcLCNXFvVsgszpE8Lqgp3OCHlU/edit?usp=sharing
>>>>>
>>>>> Proposal-ORTC Sender / Receiver Use Case [Usage Comparison Analysis]
>>>>>
https://docs.google.com/document/d/1hdhCHj-gpwv06vIbAftxMG3oZtz7A-nuYsuwQEkTat4/edit?usp=sharing
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Proposal-ORTC Sender / Receiver Capabilities Based Model
>>>>>
>>>>> Introduction
>>>>>
>>>>> After attempting to write out some use cases using the existing
RTCRtpSender and RTCRtpReceiver objects and parameters for ORTC, some
issues were discovered. Specifically, the application developer would need
to have a fair amount of knowledge on exactly how to tweak low level
parameters for anything beyond very simple use cases. For example, setting
up an SVC (Scalable Video Codec) would have required knowing about what
codecs support SVC, how the layering is setup for particular codecs, and
finally setting up specific geometric (or temporal) attributes and layering
relationship details by an application developer.
>>>>>
>>>>>
>>>>> As a result of the lack of easy configuration of RTP features, the
idea arose to give the application developer "preferences" where the
developer could choose what they desire with high level knobs and
dials and let the engine (which has explicit knowledge of each codec)
configure the low level "parameters" details according to a developer's
wishes. The engine could then return the closest set of preferences that
could be achieved given the capabilities of the engine and the developer
can then choose to proceed or not setting up media flows using these
preferences and constructed parameters.
>>>>>
>>>>>
>>>>> Another important discovery was made in the process of defining
"preferences". If two ORTC engines were given the same set of preferences
and the capabilities of both sender and receiver, each engine could be made
to construct "compatible" sender and receiver "parameters" details without
ever exchanging the parameter details over the wire. This small realization
about generating "parameters" from capabilities for local consumption by an
engine has a huge impact. This generation removes the need for an engine to
understand and filter settings that it may not understand created by
another engine of unknown origin, which may use proprietary and/or custom
settings. A simple "ignore capabilities you don't understand" rule could
replace complex and cumbersome rules that would be otherwise required if
"parameters" were to be sent over the wire and later filtered using a set
of capabilities.
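
A minimal TypeScript sketch of the "ignore capabilities you don't understand" rule described above. The wire shape (a JSON object with a "codecs" array), the local codec list, and the function name are all assumptions made for illustration; none of them come from the proposal.

```typescript
// Hypothetical wire shape: a remote engine may attach proprietary fields
// or list codecs this engine has never heard of.
interface WireCodec {
  name: string;
  [vendorField: string]: unknown;
}

// Codec names this (hypothetical) local engine implements.
const locallyKnownCodecs = ["opus", "pcmu", "vp8"];

// Keep the remote codecs the local engine understands and silently drop
// everything else, instead of trying to interpret or filter foreign settings.
function understoodCodecs(remoteJson: string): string[] {
  const parsed = JSON.parse(remoteJson) as { codecs?: WireCodec[] };
  const remote = parsed.codecs ?? [];
  return remote
    .map((c) => c.name.toLowerCase())
    .filter((n) => locallyKnownCodecs.indexOf(n) !== -1);
}

const remoteCapsJson = JSON.stringify({
  codecs: [
    { name: "opus" },
    { name: "x-vendor-codec", secretKnob: 7 }, // unknown here, so ignored
    { name: "VP8" },
  ],
});
const usableRemoteCodecs = understoodCodecs(remoteCapsJson); // ["opus", "vp8"]
```

The unknown entry never reaches the local engine's configuration, so no rule is needed for what its extra fields might mean.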
>>>>>
>>>>>
>>>>> Parameters can be generated based on the union of sender and receiver
capabilities along with application developer preferences being used as a
guideline on how to create the parameters. The engine will do its best to
fulfill the preferences and it will return the parameters that are possible
given the union of the capabilities.
>>>>>
>>>>>
>>>>> Two different engines must be able to compute compatible parameters
given all the same preferences and capabilities. Fortunately, any two
engines that understand the same capabilities can easily follow the same
rules to generate compatible parameters. While the parameters created on
the sender and receiver are required to be "compatible", they need not be
identical. The application developer should call "createParameters(...)" on
sender to create parameters suitable for the sender. The application
developer should call "createParameters(...)" on the receiver to create
params suitable for a receiver. The calculated “parameters” for both sender
and receiver have to be compatible only to the extent that whatever a
sender produces a receiver must be capable of decoding.
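
One way to picture "compatible but not identical" parameters is the TypeScript sketch below. It assumes that the "union" of capabilities means the codecs both sides list, and it reduces capabilities, preferences and parameters to bare codec names. The type and function names are hypothetical and are not the proposed createParameters signature.

```typescript
// Hypothetical, simplified types; the proposal's dictionaries carry far more detail.
interface SketchCaps { codecs: string[] }
interface SketchPrefs { codecName?: string }
interface SketchParams { codecs: string[] }

// Codecs both sides list, in a canonical (sorted) order so the result does not
// depend on how either engine happened to order its capabilities.
function sharedCodecs(senderCaps: SketchCaps, receiverCaps: SketchCaps): string[] {
  return senderCaps.codecs
    .filter((c) => receiverCaps.codecs.indexOf(c) !== -1)
    .sort();
}

// Sender side: commit to one codec, honouring the preference when it is shared.
function createSenderParams(
  senderCaps: SketchCaps,
  receiverCaps: SketchCaps,
  prefs: SketchPrefs = {},
): SketchParams {
  const shared = sharedCodecs(senderCaps, receiverCaps);
  const wanted = prefs.codecName;
  const chosen = wanted !== undefined && shared.indexOf(wanted) !== -1 ? wanted : shared[0];
  return { codecs: chosen === undefined ? [] : [chosen] };
}

// Receiver side: be ready to decode every shared codec, whatever the sender picks.
function createReceiverParams(senderCaps: SketchCaps, receiverCaps: SketchCaps): SketchParams {
  return { codecs: sharedCodecs(senderCaps, receiverCaps) };
}

const senderSideCaps: SketchCaps = { codecs: ["vp8", "h264", "vp9"] };
const receiverSideCaps: SketchCaps = { codecs: ["h264", "vp8"] };

const sendParams = createSenderParams(senderSideCaps, receiverSideCaps, { codecName: "vp8" });
const recvParams = createReceiverParams(senderSideCaps, receiverSideCaps);
// sendParams.codecs is ["vp8"]; recvParams.codecs is ["h264", "vp8"]
```

Each side runs its own function on the same two capability sets, so only capabilities (and optionally preferences) would need to cross the wire in this sketch.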
>>>>>
>>>>>
>>>>> The application developer has the option to tweak the detailed
parameters output by "createParameters(...)" but should only do so with
extreme caution. The resultant parameters output by "createParameters(...)"
are only meant for local consumption by the local sender / receiver “start”
methods.  Sending these created parameters over the wire is discouraged
because implementations may produce objects which may not be entirely
understandable by the remote party, even though the media sent on the wire
will be compatible.
>>>>>
>>>>>
>>>>> Differences from Current Sender / Receiver API
>>>>>
>>>>> Both models and APIs are more similar than they are different. The
subtle differences have important implications for behaviour and usage.
>>>>>
>>>>>
>>>>> Both models send and receive based upon "parameter" settings. The
difference is in how the "parameters" are generated. The new model
generates the "parameters" based on an exchange of capabilities and the
application developer is given convenient 'knobs' called "preferences" to
perform most common use cases. The "parameters" in the new model are
intended for local consumption only and the application developer is not
required to marshal these "parameters" over the wire (and is actively
discouraged from doing so). The new model proposes marshalling and exchanging
"capabilities" and optionally "preferences" and then generating compatible
"parameters" based on those exchanges.
>>>>>
>>>>>
>>>>> In both models, the application developer may choose to tweak low
level parameters should specific compatibilities be required. But the
"preferences" model allows most application developers to completely ignore
the low level parameters.
>>>>>
>>>>>
>>>>> Advantages of the New Capabilities Model
>>>>>
>>>>> Overall the proposed capabilities based API has strong advantages.
Main advantages are:
>>>>>
>>>>> Simplicity in setup based on "preferences" for the application
developer
>>>>>
>>>>> Less brittle designs/implementations since low level parameters are
not exchanged, filtered, and interpreted by different browser engines
>>>>>
>>>>> Much less knowledge (and often no pre-knowledge) is required for the
application developer to take full advantage of a browser's capabilities
>>>>>
>>>>>
>>>>> RTCRtpSender / RTCRtpReceiver
>>>>>
>>>>> interface RTCRtpSender {
>>>>>
>>>>>  // ...
>>>>>
>>>>>
>>>>>  static RTCRtpParameters createParameters(
>>>>>
>>>>>    MediaStreamTrack track,
>>>>>
>>>>>    Capabilities receiverCaps,
>>>>>
>>>>>    optional (RTCRtpAudioPreferences or
>>>>>
>>>>>              RTCRtpVideoPreferences or
>>>>>
>>>>>              RTCRtpSimulcastPreferences) prefs,
>>>>>
>>>>>    optional Capabilities senderCaps   // optional as system can
obtain this information
>>>>>
>>>>>  );
>>>>>
>>>>>
>>>>>  void start(RTCRtpParameters params);
>>>>>
>>>>>
>>>>>  // ...
>>>>>
>>>>> };
>>>>>
>>>>>
>>>>> interface RTCRtpReceiver {
>>>>>
>>>>>  // ...
>>>>>
>>>>>
>>>>>  static RTCRtpParameters createParameters(
>>>>>
>>>>>    DOMString kind,
>>>>>
>>>>>    Capabilities senderCaps,
>>>>>
>>>>>    optional (RTCRtpAudioPreferences or
>>>>>
>>>>>              RTCRtpVideoPreferences or
>>>>>
>>>>>              RTCRtpSimulcastPreferences) prefs,
>>>>>
>>>>>    optional Capabilities receiverCaps // optional as system can
obtain this information
>>>>>
>>>>>  );
>>>>>
>>>>>
>>>>>  void start(RTCRtpParameters params);
>>>>>
>>>>>
>>>>>  //...
>>>>>
>>>>> };
>>>>>
>>>>>
>>>>> RTCRtpMediaPreferences
>>>>>
>>>>> // This is the base dictionary used for both audio and video
preferences and represents
>>>>>
>>>>> // the set of common preferences that are available for both media
types.
>>>>>
>>>>> dictionary RTCRtpMediaPreferences {
>>>>>
>>>>>    // If not specified, system will choose value. If specified, this
receiverId will
>>>>>
>>>>>    // be applied to primary SSRC “as is”. If more than one SSRC is
needed to encode
>>>>>
>>>>>    // the stream (e.g. FEC, RTX, MST, simulcast), where the meaning
of the RTP packet
>>>>>
>>>>>    // with that alternative SSRC cannot be determined by the media
flow itself, the
>>>>>
>>>>>    // alternative SSRCs will construct a receiverId value based upon
this receiverId
>>>>>
>>>>>    // value.
>>>>>
>>>>>    DOMString            receiverId;
>>>>>
>>>>>
>>>>>    // This is the primary SSRC to use. Should alternative SSRCs be
required (e.g. FEC,
>>>>>
>>>>>    // RTX, MST, simulcast), all other SSRCs should be assigned
sequentially starting
>>>>>
>>>>>    // from the chosen SSRC value.
>>>>>
>>>>>    unsigned int         ssrc;
>>>>>
>>>>>
>>>>>    // For a sender, force the chosen codec to be the codec within the
RTCRtpCapabilities
>>>>>
>>>>>    // with this name. If possible to choose this codec, the system
will confirm by
>>>>>
>>>>>    // choosing this codec in the result from "createParameters(...)".
>>>>>
>>>>>    // This value has no meaning for a receiver since a receiver must
be capable
>>>>>
>>>>>    // of receiving any of the compatible codecs within the union
RTCRtpCapabilities.
>>>>>
>>>>>    // A non specified value indicates the system will choose the
preferred sending
>>>>>
>>>>>    // codec.
>>>>>
>>>>>    DOMString            codecName;
>>>>>
>>>>>
>>>>>   // This value indicates the relative importance of the media being
sent with a
>>>>>
>>>>>   // sender versus other media being sent. The logic is that all sent
media with
>>>>>
>>>>>   // the same priority will be treated as having an equal priority.
Those with
>>>>>
>>>>>   // a greater value will be given a greater priority and those with
a lower value
>>>>>
>>>>>   // will be given a lower priority. The value is relative meaning a
value of 2.0
>>>>>
>>>>>   // should be given roughly 2 times the priority vs a 1.0 value and
a value of 4.0
>>>>>
>>>>>   // should be given roughly 4 times the priority vs a 1.0 value.
>>>>>
>>>>>   double                relativePriority = 1.0;
>>>>>
>>>>>
>>>>>   // This value indicates the maximum bit rate the media is allowed
to output as
>>>>>
>>>>>   // a combined whole (including all layers, FEC, RTX, etc). The
system will filter
>>>>>
>>>>>   // out codecs that are not capable of delivering below this bit
rate unless no
>>>>>
>>>>>   // codec is possible in which case the system will choose the
minimal codec bit rate
>>>>>
>>>>>   // possible and will override with a different maximum bit rate in
the result of
>>>>>
>>>>>   // "createParameters(...)".
>>>>>
>>>>>   double                maxBitrate;              // engine, keep
under this rate
>>>>>
>>>>>
>>>>>   // These values indicate the preferred treatment of FEC/RTX for
the RTP packets. For
>>>>>
>>>>>   // audio, some audio codecs have built in FEC/RTX mechanisms in
which case if the
>>>>>
>>>>>   // codec is capable, the codec should enable its FEC/RTX mode if
value is set to all
>>>>>
>>>>>   // for that codec rather than creating an additional RTP flow.
>>>>>
>>>>>   RTCRtpRecoveryOptions fec = "none";
>>>>>
>>>>>   RTCRtpRecoveryOptions rtx = "none";
>>>>>
>>>>> };
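
The relativePriority comment above does not say how an engine turns priority into behaviour. The TypeScript sketch below shows one possible reading, assuming priority is applied as a proportional split of a send-side bitrate budget. The flow labels, the budget figure, and the function name are hypothetical.

```typescript
interface PrioritisedFlow {
  label: string;            // hypothetical identifier, not part of the proposal
  relativePriority: number; // 1.0 is the default in the proposed dictionary
}

// Split a bitrate budget (bits per second) in proportion to relativePriority,
// so a 2.0 flow receives twice the share of a 1.0 flow.
function shareBudget(flows: PrioritisedFlow[], budgetBps: number): { [label: string]: number } {
  const total = flows.reduce((sum, f) => sum + f.relativePriority, 0);
  const shares: { [label: string]: number } = {};
  for (const f of flows) {
    shares[f.label] = total > 0 ? (budgetBps * f.relativePriority) / total : 0;
  }
  return shares;
}

const budgetShares = shareBudget(
  [
    { label: "camera", relativePriority: 2.0 },
    { label: "screen", relativePriority: 1.0 },
    { label: "audio", relativePriority: 1.0 },
  ],
  1000000,
);
// camera gets 500000; screen and audio get 250000 each
```

A real engine would also have to respect each flow's maxBitrate and codec minimums, which this sketch leaves out.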
>>>>>
>>>>>
>>>>> enum RTCRtpRecoveryOptions {
>>>>>
>>>>>   "all",     // apply to all layers
>>>>>
>>>>>   "base",    // only apply for base (audio will treat "base" as
equivalent to "all")
>>>>>
>>>>>   "none"     // do not apply to any layer
>>>>>
>>>>> };
>>>>>
>>>>>
>>>>>
>>>>> RTCRtpAudioPreferences
>>>>>
>>>>> dictionary RTCRtpAudioPreferences : RTCRtpMediaPreferences {
>>>>>
>>>>>   // If not 0, tells the engine to pick and configure codecs that are
capable of
>>>>>
>>>>>   // the minimum number of channels (if possible). If not possible, the
minimum number of
>>>>>
>>>>>   // channels will be returned in the result of
"createParameters(...)".
>>>>>
>>>>>   unsigned int         minChannels = 0;
>>>>>
>>>>>
>>>>>   // If not 0, tells the engine to pick a codec and configure codecs
which are
>>>>>
>>>>>   // capable of delivering the minimum Hz rate as indicated. If not
possible, the
>>>>>
>>>>>   // minimum Hz rate will be returned in the result of
"createParameters(...)"
>>>>>
>>>>>   unsigned int         minHzRate = 0;
>>>>>
>>>>>
>>>>>   // The engine will choose and configure the codecs best able to
deliver the level
>>>>>
>>>>>   // of fidelity requested.
>>>>>
>>>>>   RTCRtpAudioFidelity  fidelity = "speech";
>>>>>
>>>>> };
>>>>>
>>>>>
>>>>> enum RTCRtpAudioFidelity {
>>>>>
>>>>>  "speech",   // speech only is expected so Hz range only needs to
support the vocal range
>>>>>
>>>>>  "music",    // music is expected, choose stereo compatible and
a minimum of 32000 Hz
>>>>>
>>>>>  "movie"     // music / sound effects expected, choose surround and
highest Hz available
>>>>> };
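
The fidelity values above could drive codec selection roughly as in the following TypeScript sketch. The selection rules are an interpretation of the enum comments (lowest rate for "speech", at least stereo and 32000 Hz for "music", most channels then highest rate for "movie"), and the option list and names are hypothetical.

```typescript
type AudioFidelity = "speech" | "music" | "movie";

interface AudioOption {
  codecName: string; // hypothetical labels, not real codec identifiers
  channels: number;
  hzRate: number;
}

// Pick one option for the requested fidelity, or undefined if nothing qualifies.
function pickForFidelity(options: AudioOption[], fidelity: AudioFidelity): AudioOption | undefined {
  let candidates = options.slice();
  if (fidelity === "music") {
    // "choose stereo compatible and a minimum of 32000 Hz"
    candidates = candidates.filter((o) => o.channels >= 2 && o.hzRate >= 32000);
  }
  if (candidates.length === 0) {
    return undefined;
  }
  if (fidelity === "movie") {
    // "choose surround and highest Hz available": most channels, then highest rate
    return candidates.reduce((best, o) =>
      o.channels > best.channels || (o.channels === best.channels && o.hzRate > best.hzRate)
        ? o
        : best,
    );
  }
  // "speech" and "music": assume the lowest adequate rate is sufficient
  return candidates.reduce((best, o) => (o.hzRate < best.hzRate ? o : best));
}

const audioOptions: AudioOption[] = [
  { codecName: "mono-8k", channels: 1, hzRate: 8000 },
  { codecName: "stereo-48k", channels: 2, hzRate: 48000 },
  { codecName: "surround-48k", channels: 6, hzRate: 48000 },
];

const speechPick = pickForFidelity(audioOptions, "speech"); // mono-8k
const musicPick = pickForFidelity(audioOptions, "music");   // stereo-48k
const moviePick = pickForFidelity(audioOptions, "movie");   // surround-48k
```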
>>>>>
>>>>>
>>>>>
>>>>> RTCRtpVideoPreferences (and related)
>>>>>
>>>>> dictionary RTCRtpVideoPreferences : RTCRtpMediaPreferences {
>>>>>
>>>>>
>>>>>   // minFrameRate, minScale, and minQuality each indicate that the
engine must do
>>>>>
>>>>>   // its best effort to keep the frame rate, scale or quality above
a certain minimal
>>>>>
>>>>>   // level. When using SVC, these values will hint at the
requirements typically needed
>>>>>
>>>>>   // for the base layer.
>>>>>
>>>>>   //
>>>>>
>>>>>
>>>>>   // minFrameRate is specified in frames per second.
>>>>>
>>>>>   double     minFrameRate = 0;    // please engine, keep equal or
above this rate
>>>>>
>>>>>   // minScale is a relative value from 0.0 to 1.0 where 1.0
represents full input stream
>>>>>
>>>>>   // width/height is requested and 0.0 represents no minimum size is
requested.
>>>>>
>>>>>   // The value of minScale is multiplied by the source video window
width and height
>
> ...
Received on Friday, 9 May 2014 20:59:50 UTC