A Big Proposal: A way to control quality/resolution/framerate/simulcast/layering with RtpSender

From what I have seen so far, the hardest part of a good RTC API to get right
is how to control what quality/resolution/framerate/simulcast/layering to
send.  There are lots of use cases to cover, things interact in complex
ways, and there is a never-ending list of edge cases.  I've seen lots of
failed attempts at solving this.

However, after spending a long time thinking about it and talking to lots
of people smarter than me, I think we have something that covers all the
major use cases while being "as simple as possible, but no simpler", which
I wish to propose now.

This is going to end up being a long email, and will probably lead to a
very long discussion.  But hopefully it will lead to a good end and give
applications control that they've never had before and have long been
asking for.  I look forward to discussion because this is by no
means the final word.  It's just the best we've been able to come up with
so far.

So.... here we go....


*What we have so far in the API (proposed, at least):*

partial interface RTCRtpReceiver {
  Promise<void> receive(RTCRtpParameters parameters);
}

partial interface RTCRtpSender {
  Promise<void> send(RTCRtpParameters parameters);
}

dictionary RTCRtpParameters {
  DOMString?                                receiverId;
  sequence<RTCRtpCodecParameters>           codecs;
  sequence<RTCRtpHeaderExtensionParameters> headerExtensions;
  sequence<RTCRtpEncodingParameters>        encodings;
}

dictionary RTCRtpEncodingParameters {
  unsigned int ssrc;
  DOMString? codec;
  RTCRtpFecParameters? fec;
  RTCRtpRtxParameters? rtx;

  *// MISSING STUFF HERE*
}


*What we lack:*

- A way to control per-encoding:
  - Resolution
  - Framerate
  - Bitrate
  - Quality
  - On/Off
- A way to control inter-encoding priority
- A way to control things as more or less bandwidth is available
- A way to control things as the input resolution/aspect ratio changes
(rotation)
- A way to express inter-layer dependencies.


*General Proposed Solution:*

1. Provide control of per-encoding scale, bitrate, quality, priority, and
active/inactive.
2. Provide control of what to bias toward as more bandwidth is available
(more resolution or more framerate).
3. Provide a way to specify limited inter-layer dependencies (more complex
relationships are TBD).
4. Let input resolutions and framerates be controlled by the input
MediaStreamTrack, not by the encoding.
5. Let the JS change what is currently being sent on the fly based on
feedback on what is currently being sent (feedback mechanism is TBD).


*Specifics of RTCRtpEncodingParameters:*

dictionary RTCRtpEncodingParameters {
  // Existing Fields
  unsigned int ssrc;
  DOMString? codec;
  RTCRtpFecParameters? fec;
  RTCRtpRtxParameters? rtx;

  *// New Fields*
  // The higher the value, the more bits will be given to this encoding
  // as available bandwidth goes up.  Default is 1.0.
  double priority;

  // Do this scale of the input resolution, or die trying.
  // 1.0 = full resolution.  Default is unconstrained.
  double scale;

  // Ramp up resolution/quality/framerate until this bitrate.
  // Summed when using dependent layers.
  double maxBitrate;

  // Ramp up resolution/quality/framerate until this quality.
  double maxQuality;

  // Never send less than this quality.
  double minQuality;

  // What to give more bits to, if available.
  // Perhaps make it an enum.
  DOMString bias; // "resolution" or "framerate"

  // If false, don't send any media right now.
  // Disabling is different from omitting the encoding; it can keep
  // resources available to re-enable more quickly than re-adding.
  // Plus, it still sends RTCP.
  // Default is active.
  bool active;

  // These are to setup layer dependencies.
  int layerId;
  sequence<int> layerDependencies;  // Just the IDs
}
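
The layerId and layerDependencies fields can be sanity-checked before they are handed to a sender. The following is only a sketch: checkLayerDependencies is a hypothetical helper, not part of the proposal, and it assumes that every dependency must name a layerId present in the same encodings array, which the proposal does not state.

```javascript
// Hypothetical helper (not part of the proposed API).
// Returns a list of human-readable problems found in the layerId and
// layerDependencies fields of an encodings array; an empty list means OK.
// Assumption: dependencies must refer to layers in the same array.
function checkLayerDependencies(encodings) {
  const problems = [];
  const seen = new Set();

  // First pass: collect layer IDs and flag duplicates.
  for (const enc of encodings) {
    if (enc.layerId === undefined) continue;
    if (seen.has(enc.layerId)) {
      problems.push(`duplicate layerId ${enc.layerId}`);
    }
    seen.add(enc.layerId);
  }

  // Second pass: every dependency must point at a known, different layer.
  for (const enc of encodings) {
    for (const dep of enc.layerDependencies ?? []) {
      if (dep === enc.layerId) {
        problems.push(`layer ${enc.layerId} depends on itself`);
      } else if (!seen.has(dep)) {
        problems.push(`layer ${enc.layerId} depends on unknown layer ${dep}`);
      }
    }
  }
  return problems;
}
```

An application could run a check like this before calling send() and refuse to proceed if the returned list is not empty.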

*Examples:*

// Normal 1:1 video with resolution feedback from the receiver
var encodings = [{
  ssrc: 1,
  scale: .5
}];

// Crank up the quality to "11"
var encodings = [{
  ssrc: 1,
  maxQuality: 11.0  // TODO: Figure out the scale.
}];

// Send a thumbnail along with regular size
var encodings1 = [{
  ssrc: 1,
  priority: 1.0
}]
// Control the resolution and framerate
// with a different track and RtpSender.
var encodings2 = [{
  ssrc: 2,
  // Prioritize the thumbnail over the main video.
  priority: 10.0
}];

// Sign Language
// (need high framerate, but don't get too bad of quality)
var encodings = [{
  minQuality: 0.2,
  bias: "framerate"
}];

// SVC which handles camera rotation
var encodings =[{
  layerId: 0,
  scale: 0.25,
  priority: 3.0
}, {
  layerId: 1,
  layerDependencies: [0],
  scale: 0.5,
  priority: 2.0
}, {
  layerId: 2,
  layerDependencies: [0, 1],
  scale: 1.0,
  priority: 1.0
}]

// SVC w/thumbnail:
var encodings1 =[{
  layerId: 0,
  scale: 0.25,
  priority: 3.0
}, {
  layerId: 1,
  layerDependencies: [0],
  scale: 0.5,
  priority: 2.0
}, {
  layerId: 2,
  layerDependencies: [0, 1],
  scale: 1.0,
  priority: 1.0
}];
// Control the resolution and framerate
// with a different track and RtpSender.
var encodings2 =[{
  layerId: 3,
  priority: 10.0
}]

// SVC w/thumbnail temporarily disabled:
var encodings1 =[{
  layerId: 0,
  scale: 0.25,
  priority: 3.0
}, {
  layerId: 1,
  layerDependencies: [0],
  scale: 0.5,
  priority: 2.0
}, {
  layerId: 2,
  layerDependencies: [0, 1],
  scale: 1.0,
  priority: 1.0
}];
// Control the resolution and framerate
// with a different track and RtpSender.
var encodings2 =[{
  layerId: 3,
  priority: 10.0,
  active: false
}]
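
Switching the thumbnail off and back on amounts to flipping one flag and re-sending the parameters. As a sketch only: setLayerActive is a hypothetical helper that returns a new encodings array with the active flag changed on one layer; how and when the result is passed back to the sender is left to the application.

```javascript
// Hypothetical helper (not part of the proposed API).
// Returns a copy of `encodings` in which the encoding with the given
// layerId has its `active` flag set; the input array is left untouched.
function setLayerActive(encodings, layerId, active) {
  return encodings.map((enc) =>
    enc.layerId === layerId ? { ...enc, active } : enc
  );
}

// Usage: disable the thumbnail, then re-enable it later.
const thumbnail = [{ layerId: 3, priority: 10.0 }];
const paused = setLayerActive(thumbnail, 3, false);
const resumed = setLayerActive(paused, 3, true);
// Each result would then go into the `encodings` member of the
// RTCRtpParameters given to the sender's send() method.
```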

// Must send a very fixed resolution
// Adjust the resolution using the input track.
var encodings = [{
  scale: 1.0
}];

// Screencast
var encodings = [{
  bias: "resolution"
}];


// Remote Desktop
// (High framerate, must not downscale)
var encodings = [{
  bias: "framerate",
  scale: 1.0
}];


// Baby Monitor or Security Camera
// Adjust the framerate using the input track.
var encodings = [{ssrc: 1}];

// Audio more important than video
var audioEncodings = [{
  priority: 10.0
}];
var videoEncodings = [{
  priority: 0.1
}];

// Video more important than audio
var audioEncodings = [{
  priority: 0.1
}];
var videoEncodings = [{
  priority: 10.0
}];
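
The proposal does not define how priority turns into bits, so the following is just one possible reading, offered as a sketch. It assumes that each active encoding receives a share of the available bandwidth proportional to its priority (default 1.0), that no encoding receives more than its maxBitrate, and that whatever a capped encoding cannot use is redistributed to the others. allocateBitrate is a hypothetical name.

```javascript
// Hypothetical helper (not part of the proposed API).
// Assumption: bandwidth is split in proportion to `priority`, capped per
// encoding by `maxBitrate`; leftover from capped encodings is re-split.
function allocateBitrate(encodings, availableBps) {
  const shares = encodings.map(() => 0);
  // Indices of encodings that are active and not yet at their cap.
  let open = encodings
    .map((enc, i) => i)
    .filter((i) => encodings[i].active !== false);
  let remaining = availableBps;

  while (open.length > 0 && remaining > 1e-9) {
    const totalPriority = open.reduce(
      (sum, i) => sum + (encodings[i].priority ?? 1.0), 0);
    if (totalPriority <= 0) break;

    const budget = remaining; // amount offered in this round
    const stillOpen = [];
    for (const i of open) {
      const cap = encodings[i].maxBitrate ?? Infinity;
      const offer = budget * (encodings[i].priority ?? 1.0) / totalPriority;
      const taken = Math.min(offer, cap - shares[i]);
      shares[i] += taken;
      remaining -= taken;
      if (shares[i] < cap) stillOpen.push(i);
    }
    // If no encoding hit its cap, the whole budget has been handed out.
    if (stillOpen.length === open.length) break;
    open = stillOpen;
  }
  return shares;
}
```

With the "audio more important than video" settings (priorities 10.0 and 0.1), audio would get roughly 99% of the bandwidth under this reading until it reaches its maxBitrate, after which everything else would flow to video.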

// Camera Rotation
// Since there is only control of scale, there is no issue with camera
// rotation or cropping.  Everything should work fine with no jank.
var encodings = [{ssrc: 1}];
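
To illustrate why a relative scale survives rotation: the same scale value gives a sensible output size for both landscape and portrait input, with no fixed width or height to fight against. This is only a sketch; scaledSize is a hypothetical helper, and rounding to the nearest pixel is an assumption (a real encoder may round differently).

```javascript
// Hypothetical helper (not part of the proposed API).
// Applies an encoding's `scale` to whatever size the input track delivers.
// Assumption: dimensions are rounded to the nearest whole pixel.
function scaledSize(width, height, scale) {
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale)
  };
}

// The same encoding works before and after the camera rotates.
const landscape = scaledSize(1280, 720, 0.5); // { width: 640, height: 360 }
const portrait = scaledSize(720, 1280, 0.5);  // { width: 360, height: 640 }
```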


That's it.  I apologize for the typos that I'm sure I missed in such a long
email.  I look forward to the discussion :).

Received on Friday, 14 February 2014 18:45:18 UTC