- From: Sergio Garcia Murillo <sergio.garcia.murillo@gmail.com>
- Date: Mon, 4 Apr 2016 17:26:28 +0200
- To: public-ortc@w3.org
- Message-ID: <570287A4.2020800@gmail.com>
Hello all,
We have been working on a new proposal to improve both the RTCRtpSender and
RTCRtpReceiver objects in order to make the ORTC spec cleaner and simpler. I
was not sure what the preferred method of making these proposals is
(either a GitHub issue or the mailing list), so I created a gist for it:
https://gist.github.com/murillo128/d9da72ef76df26d2fde848a265c46fc7
I am also copying the full proposal below.
Best regards
Sergio
Rationale
IMHO it is quite difficult to understand what the RTCRtpSender and
RTCRtpReceiver objects actually are. Quoting the spec:
1. Overview
In the figure above, the RTCRtpSender (Section 5) encodes the track
provided as input, which is transported over an RTCDtlsTransport
5.1 Overview
The RTCRtpSender includes information relating to the RTP sender.
5.1 Overview
An RTCRtpSender instance is associated to a sending MediaStreamTrack
and provides RTC related methods to it.
So the sender, for example, is a generic object that takes a media
track and generates all sorts of RTP packets that you send to another
peer. It can host one or several encoders and send one or more SSRCs,
supporting simulcast and SVC.
In that regard, it is quite similar to an RTCPeerConnection: instead of
using an SDP blob, you pass a kind of JSON version of an m-line, and the
RTCRtpSender will do its best to send everything you want to send
(provided you have correctly discovered all the restrictions so as not
to cause an InvalidParameters exception).
So instead of matching an RTP-world object, as the DTLS and ICE
transports do, it is a kind of catch-all black-box object.
The RTCRtpReceiver shares the same complexity, supporting reception of
multiple SSRC streams and payload types, since sending and receiving
share the same RTCRtpParameters dictionary. It was recently suggested
(and I agree) that ORTC start off by supporting the WebRTC 1.0 simulcast
model, which involves sending multiple streams but receiving only one.
That implies that an RTCRtpReceiver will only receive one RTP packet
stream (one SSRC) with one or more payloads (Opus+DTMF, for example).
With this change, we can narrow down the definition of an RTCRtpReceiver
and describe it as the object that handles the reception of a single RTP
packet stream. Again, IMHO, that makes much more sense and maps to the
concepts in draft-ietf-rtcweb-rtp-usage.
This proposal takes that idea further and applies the same concept to
the RTCRtpSender: instead of allowing multiple RTP packet streams to be
handled by a single RTCRtpSender, we only allow one RTCRtpSender to
produce a single RTP packet stream (one SSRC).
Now we have a one-to-one relationship between an RTCRtpSender, an
RTCRtpReceiver and a media RTP packet stream:

MediaTrack ===> RTCRtpSender ===(single RTP packet stream - SSRC)===> RTCRtpReceiver ===> MediaTrack

In that regard, following the m-line analogy, each sender/receiver pair
would represent one ssrc-group.
Simulcast and SVC are also supported (see below).
Benefits
* Improved RTCRtpSender/RTCRtpReceiver definitions
* Cleaner and simpler APIs
* Makes it harder to have parameter inconsistencies
* Provides a single and straightforward way of using the API: given a
  DTLS/ICE/RTP stream architecture, there is only one way of
  implementing it in ORTC.
Proposal
In order to be as unobtrusive as possible, we have made only the
following changes to the current API:

* The main change is to move the ssrc, fec and rtx definitions from the
  encodings to the RTP parameters.
* Add an RTCRtpCodecRTXParameters dictionary associated to each
  RTCRtpCodecParameters to address the RTX apt issue (this change could
  also be implemented standalone, without the rest of the changes).
* Remove the codec sequence from the parameters and move it to the
  encodings. This change could be dropped, although we believe it is
  important for the sake of clarity (more on that later).
    //New dictionary
    dictionary RTCRtpCodecRTXParameters {
        payloadtype   payloadType;
        unsigned long rtxtime;
    };

    dictionary RTCRtpCodecParameters {
        DOMString                 name;
        payloadtype               payloadType;
        unsigned long             clockRate;
        unsigned long             maxptime;
        unsigned long             ptime;
        unsigned long             numChannels;
        sequence<RTCRtcpFeedback> rtcpFeedback;
        Dictionary                parameters;
        RTCRtpCodecRTXParameters  rtx; // NEW: rtx.payloadType
    };

    //Not changed, just added here for completeness
    dictionary RTCRtpRtxParameters {
        unsigned long ssrc;
        payloadtype   payloadType;
    };

    //Not changed, just added here for completeness
    dictionary RTCRtpFecParameters {
        unsigned long ssrc;
        DOMString     mechanism;
    };

    dictionary RTCRtpParameters {
        DOMString                                 muxId = "";
        unsigned long                             ssrc; // media ssrc - moved from encodings
        RTCRtpFecParameters                       fec;  // includes fec.ssrc - moved from encodings
        RTCRtpRtxParameters                       rtx;  // includes rtx.ssrc - moved from encodings
        sequence<RTCRtpHeaderExtensionParameters> headerExtensions;
        sequence<RTCRtpEncodingParameters>        encodings;
        RTCRtcpParameters                         rtcp;
        RTCDegradationPreference                  degradationPreference = "balanced";
        // Removed: codecs sequence
    };

    dictionary RTCRtpEncodingParameters {
        RTCRtpCodecParameters codec; // Moved from parameters
        RTCPriorityType       priority;
        unsigned long         maxBitrate;
        double                minQuality = 0;
        double                resolutionScale;
        double                framerateScale;
        unsigned long         maxFramerate;
        boolean               active = true;
        DOMString             encodingId;
        sequence<DOMString>   dependencyEncodingIds;
        // Removed: ssrc, fec, rtx
    };
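To make the new shape concrete, here is an illustrative plain-object
example of sender parameters under the proposed dictionaries (all the
values, and the Opus/RTX pairing, are made up for illustration; this is
a sketch, not spec text):

```javascript
// Illustrative example of the proposed RTCRtpParameters shape:
// ssrc/rtx/fec live at the parameters level, and the codec is part of
// the encoding. Plain objects with made-up values.
const sendParameters = {
  muxId: "",
  ssrc: 12345678,                            // media SSRC - moved from encodings
  rtx: { ssrc: 12345679, payloadType: 112 }, // rtx.ssrc - moved from encodings
  headerExtensions: [],
  encodings: [{
    codec: {                                 // moved from parameters.codecs
      name: "opus",
      payloadType: 111,
      clockRate: 48000,
      rtx: { payloadType: 112 },             // NEW: per-codec RTX apt
    },
    active: true,
  }],
  rtcp: { cname: "user@example.org", mux: true },
  degradationPreference: "balanced",
};
```

Note how there is little room for inconsistency: the single media SSRC
and its RTX companion live at the parameters level, while each encoding
only selects a codec.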
Impact analysis

Normal use case (1 sender, 1 receiver, 1 media codec)
As we have removed the sequence of RTCRtpCodecParameters from the
parameters, it is now required to pass that information in the encodings
attribute. So the automatic process that is performed internally by the
RTCRtpSender in the current version for this case is no longer possible:
the browser behaves as though a single encodings[0] entry was
provided, with encodings[0].ssrc set to a browser-determined value,
encodings[0].active set to "true", encodings[0].codecPayloadType set
to codecs[j].payloadType where j is the index of the first codec
that is not "cn", "dtmf", "red", "rtx", or a forward error
correction codec, and all the other parameters.encodings[0]
attributes unset.
However, note that in the specification all the examples use the
following helper function, which performs the required steps:
    RTCRtpParameters function myCapsToSendParams(RTCRtpCapabilities sendCaps,
                                                 RTCRtpCapabilities remoteRecvCaps) {
        // Function returning the sender RTCRtpParameters, based on the
        // local sender and remote receiver capabilities.
        // The goal is to enable a single stream audio and video call with
        // minimum fuss.
        //
        // Steps to be followed:
        // 1. Determine the RTP features that the receiver and sender have
        //    in common.
        // 2. Determine the codecs that the sender and receiver have in common.
        // 3. Within each common codec, determine the common formats, header
        //    extensions and rtcpFeedback mechanisms.
        // 4. Determine the payloadType to be used, based on the receiver
        //    preferredPayloadType.
        // 5. Set RTCRtcpParameters such as mux to their default values.
        // 6. Return RTCRtpParameters enabling the jointly supported
        //    features and codecs.
    }
Note that while filling the encoding with the first supported media
codec could be automated, it is still necessary to process the RTP
features (mux, feedback and header extensions) in order to create
compatible encoding parameters.
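As an illustration of those steps, here is a minimal sketch of such a
helper operating on plain objects in place of real RTCRtpCapabilities
(the matching logic and field choices are assumptions for illustration,
not spec text):

```javascript
// Sketch of the capability-matching steps above. Names follow ORTC,
// but the helper itself and its matching rules are hypothetical.
function myCapsToSendParams(sendCaps, remoteRecvCaps) {
  // 2. Codecs the sender and receiver have in common (matched by name).
  const codecs = sendCaps.codecs.filter((c) =>
    remoteRecvCaps.codecs.some((r) => r.name === c.name));
  // 1./3. Header extensions supported by both sides.
  const headerExtensions = sendCaps.headerExtensions.filter((e) =>
    remoteRecvCaps.headerExtensions.some((r) => r.uri === e.uri));
  // 4. Pick the first non-auxiliary codec and use the receiver's
  //    preferred payload type for it.
  const aux = ["cn", "dtmf", "red", "rtx", "ulpfec", "flexfec"];
  const media = codecs.find((c) => !aux.includes(c.name.toLowerCase()));
  const recv = remoteRecvCaps.codecs.find((r) => r.name === media.name);
  // 5./6. Return parameters in the proposed shape (codec in the encoding).
  return {
    muxId: "",
    headerExtensions,
    encodings: [{
      codec: { name: media.name, payloadType: recv.preferredPayloadType },
      active: true,
    }],
    rtcp: { mux: true },
  };
}
```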
Simulcast
From RFC 7656
    3.6. Simulcast

    A media source represented as multiple independent encoded streams
    constitutes a simulcast [SDP-SIMULCAST] or Modification Detection Code
    (MDC) of that media source. Figure 8 shows an example of a media
    source that is encoded into three separate simulcast streams, that are
    in turn sent on the same media transport flow. When using simulcast,
    the RTP streams may be sharing an RTP session and media transport, or
    be separated on different RTP sessions and media transports, or be any
    combination of these two. One major reason to use separate media
    transports is to make use of different quality of service (QoS) for
    the different source RTP streams. Some considerations on separating
    related RTP streams are discussed in Section 3.12.

    [Figure 8: Example of Media Source Simulcast. The ASCII diagram shows
    a single Media Source whose source stream feeds three parallel Media
    Encoder / Media Packetizer chains, producing three source RTP streams
    sent over the same Media Transport.]

    The simulcast relation between the RTP streams is the common media
    source. In addition, to be able to identify the common media source, a
    receiver of the RTP stream may need to know which configuration or
    encoding goals lay behind the produced encoded stream and its
    properties. This enables selection of the stream that is most useful
    in the application at that moment.
The main point to take into consideration is that each layer is provided
by an independent encoder. So, performance-wise, it is irrelevant
whether one RTCRtpSender provides two encodings, or two RTCRtpSenders
provide one encoding each.
So it is possible to cover all the use cases provided by the current
spec. For example:
    RTCRtpSender (track0)
     |
     +----- encoding[0] = {ssrc1, vp8, pt=96}
     +----- encoding[1] = {ssrc1, vp8, pt=97}
     +----- encoding[2] = {ssrc2, vp8, pt=98}
This will be equivalent to two RTCRtpSenders attached to the same media
track, each one with the encodings for a single SSRC:
    RTCRtpSender (track0, ssrc1)
     |
     +----- encoding[0] = {vp8, pt=96}
     +----- encoding[1] = {vp8, pt=97}

    RTCRtpSender (track0, ssrc2)
     |
     +----- encoding[0] = {vp8, pt=98}
Note that in the first case the payloads, even though on different
SSRCs, were required to have different payload types.
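The equivalence above can be sketched as a small helper that splits the
old multi-SSRC encoding list into one parameter set per SSRC (plain
objects; the helper name and shapes are hypothetical illustrations):

```javascript
// Hypothetical sketch of the simulcast split described above: under the
// proposal each RTCRtpSender carries one SSRC, so a multi-SSRC encoding
// list becomes one parameter set per SSRC.
function splitSimulcast(encodings) {
  // Group encodings by SSRC; each group becomes one sender's parameters.
  const bySsrc = new Map();
  for (const e of encodings) {
    if (!bySsrc.has(e.ssrc)) bySsrc.set(e.ssrc, []);
    // The SSRC moves up to the parameters level, so drop it here.
    const { ssrc, ...rest } = e;
    bySsrc.get(e.ssrc).push(rest);
  }
  return [...bySsrc].map(([ssrc, encs]) => ({ ssrc, encodings: encs }));
}
```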
SVC
Also from RFC 7656
    3.7. Layered Multi-Stream

    Layered Multi-Stream (LMS) is a mechanism by which different portions
    of a layered or scalable encoding of a source stream are sent using
    separate RTP streams (sometimes in separate RTP sessions). LMSs are
    useful for receiver control of layered media.

    A media source represented as an encoded stream and multiple dependent
    streams constitutes a media source that has layered dependencies.
    Figure 9 represents an example of a media source that is encoded into
    three dependent layers, where two layers are sent on the same media
    transport using different RTP streams, i.e., SSRCs, and the third
    layer is sent on a separate media transport.

    [Figure 9: Example of Media Source Layered Dependency. The ASCII
    diagram shows a single Media Encoder producing one encoded stream and
    two dependent streams, each packetized into its own RTP stream; two of
    the RTP streams share one Media Transport while the third uses a
    separate Media Transport.]

    It is sometimes useful to make a distinction between using a single
    media transport or multiple separate media transports when (in both
    cases) using multiple RTP streams to carry encoded streams and
    dependent streams for a media source. Therefore, the following new
    terminology is defined here:

    SRST: Single RTP stream on a Single media Transport
    MRST: Multiple RTP streams on a Single media Transport
    MRMT: Multiple RTP streams on Multiple media Transports

    MRST and MRMT relations need to identify the common media encoder
    origin for the encoded and dependent streams. When using different RTP
    sessions (MRMT), a single RTP stream per media encoder, and a single
    media source in each RTP session, common SSRCs and CNAMEs can be used
    to identify the common media source. When multiple RTP streams are
    sent from one media encoder in the same RTP session (MRST), then CNAME
    is the only currently specified RTP identifier that can be used. In
    cases where multiple media encoders use multiple media sources sharing
    a synchronization context, and thus have a common CNAME, additional
    heuristics or identification need to be applied to create the MRST or
    MRMT relationships between the RTP streams.
The main advantage over simulcast is that here a single encoder instance
is able to serve multiple layers, improving performance compared to
having several independent encoders.
This is supported in the current spec by using dependencyEncodingIds,
which allow the browser to correlate SVC layers so they can be provided
by the same encoder:
    dependencyEncodingIds of type sequence<DOMString>
        The encodingIds on which this layer depends. Within this
        specification encodingIds are permitted only within the same
        RTCRtpEncodingParameters sequence. In the future if MST were to be
        supported, then if searching within an RTCRtpEncodingParameters
        sequence did not produce a match, then a global search would be
        carried out.
Note that currently MST (MRMT in RFC 7656 terms) is not supported,
because the dependency search is only done inside the encodings of an
RTCRtpSender, and as an RTCRtpSender is attached to a single transport,
it is not possible to send layers over different transports.
So in the current version of the ORTC spec, SRST and MRST are supported,
but not MRMT. In the new version, only SRST would be supported.
This limitation is artificial: if encodingIds were globally unique, the
dependency search could be done across RTCRtpSenders. That would mean
that SRST, MRST *and MRMT* would be supported with this proposal.
    RTCRtpSender (track0)
     |
     +----- encoding[0] = {ssrc1, vp9, pt=96, encodingId="track0-0"}
     +----- encoding[1] = {ssrc1, vp9, pt=97, encodingId="track0-1", dependencyEncodingIds=["track0-0"]}
     +----- encoding[2] = {ssrc2, vp9, pt=98, encodingId="track0-2", dependencyEncodingIds=["track0-0"]}
This will be equivalent to two RTCRtpSenders attached to the same media
track, each one with the encodings for a single SSRC:
    RTCRtpSender (track0, ssrc1)
     |
     +----- encoding[0] = {vp9, pt=96, encodingId="track0-0"}
     +----- encoding[1] = {vp9, pt=97, encodingId="track0-1", dependencyEncodingIds=["track0-0"]}

    RTCRtpSender (track0, ssrc2)
     |
     +----- encoding[0] = {vp9, pt=98, encodingId="track0-2", dependencyEncodingIds=["track0-0"]}
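The cross-sender dependency search this relies on can be sketched as
follows (plain objects, hypothetical helper name): with globally unique
encodingIds, a dependencyEncodingId can be resolved even when the layers
live in different RTCRtpSenders, i.e. on different transports, which is
what makes MRMT possible.

```javascript
// Hypothetical sketch of a global dependency search across senders.
// Returns a map from "layer->dependency" to the index of the sender
// that owns the dependency.
function resolveDependencies(senders) {
  // Index every encoding by its globally unique encodingId.
  const index = new Map();
  senders.forEach((sender, i) =>
    sender.encodings.forEach((e) => index.set(e.encodingId, i)));
  // Resolve each dependency to the owning sender's index.
  const deps = new Map();
  for (const sender of senders)
    for (const e of sender.encodings)
      for (const id of e.dependencyEncodingIds || [])
        deps.set(`${e.encodingId}->${id}`, index.get(id));
  return deps;
}
```

In the example above, track0-2 (second sender) resolves its dependency
to the base layer owned by the first sender, even though the two layers
could be sent over different transports.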
Received on Monday, 4 April 2016 15:27:02 UTC