RE: Temporal and spatial scalability support on RTCRtpEncodingParameters

Currently, support for scalable video coding (SVC) is out of scope for the WebRTC 1.0 API.  Nevertheless, it should be observed
that decoding of SVC streams is supported in multiple browsers (Edge, Chrome, Firefox) without any new API surface, and
SVC encoding is supported in Edge and Chrome (and perhaps other browsers) with minimal additional API support.

Since several video codecs (VP8, VP9, AV1) require decoders to be able to decode anything that the encoder can encode,
it is not clear that additional API functionality is needed to support decoding of SVC streams.  For example, browsers that support
VP8 (such as Edge) can decode a temporally scalable VP8 stream even if they do not support temporal encoding themselves.

Therefore, the question is really about an API to control SVC encoding.  The ORTC API provides this via the addition of
RTCRtpEncodingParameters.dependencyEncodingIds.
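
To make that concrete, here is a minimal sketch of how layer dependencies are expressed with the ORTC API. The field names
(encodingId, dependencyEncodingIds) come from the ORTC RTCRtpEncodingParameters dictionary; the codec entry, payload type, and
layer naming are illustrative assumptions, and header-extension and RTCP details are omitted:

    // Three temporal layers for a single VP8 stream: T1 depends on T0,
    // and T2 depends on both T0 and T1.
    const parameters = {
      codecs: [{ name: "VP8", payloadType: 100, clockRate: 90000 }],
      encodings: [
        { encodingId: "T0" },
        { encodingId: "T1", dependencyEncodingIds: ["T0"] },
        { encodingId: "T2", dependencyEncodingIds: ["T0", "T1"] }
      ]
    };
    // sender is assumed to be an ORTC RTCRtpSender already constructed
    // with a track and an RTP transport.
    sender.send(parameters);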

While in theory this provides the application with granular control of SVC encoding, in practice implementers have observed that the
decision on how many layers to encode is intimately connected to bandwidth estimation and congestion control, so it is really best left to the browser. 

Just as it would make little sense for an application to implement its own bandwidth estimation using the Statistics API and then try to control congestion by continuously adjusting RTCRtpEncodingParameters.maxBitrate,
it makes little sense in practice for an application to continuously calculate how many layers should be encoded and then attempt to control that via RTCRtpEncodingParameters.dependencyEncodingIds.
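
For illustration only, below is a rough sketch of the sort of application-level loop that the analogy describes (and argues
against). It assumes a connected RTCPeerConnection pc and an RTCRtpSender sender; availableOutgoingBitrate comes from the
candidate-pair statistics, while the 0.9 headroom factor and one-second period are made-up values, and a real congestion
controller reacts far faster and with far more signal than this:

    // Naive app-level rate control: read the estimate, cap the encoder,
    // repeat. The browser already does this internally, and does it better.
    async function naiveRateControl(pc, sender) {
      const stats = await pc.getStats();
      let available;
      stats.forEach(report => {
        if (report.type === "candidate-pair" && report.availableOutgoingBitrate) {
          available = report.availableOutgoingBitrate;
        }
      });
      if (available) {
        const params = sender.getParameters();
        if (params.encodings.length > 0) {
          params.encodings[0].maxBitrate = Math.floor(available * 0.9);
          await sender.setParameters(params);
        }
      }
      setTimeout(() => naiveRateControl(pc, sender), 1000);
    }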

So in practice, implementations supporting SVC encoding tend to fall into two camps: 

1. APIs that say "Turn on SVC encoding, let the browser figure out how many layers to send".  For example, in Edge the selection of the H.264UC codec (a variant of H.264/SVC) represents a request for the
encoder to consider (but not necessarily to use at any given moment) temporal scalability; a sketch of this style of request follows after this list.

2. Interpretation of RTCRtpEncodingParameters.dependencyEncodingIds as a hint, meaning "Encode up to this many layers, but no more, with the browser given the flexibility to figure out how many layers can be encoded at a given instant."
I believe that ORTC Lib takes this approach. 
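
As a hedged sketch of the first camp, the request in Edge's ORTC API amounts to selecting the H.264UC codec and leaving the
layer structure entirely to the browser. The codec name below is taken from the discussion above; the payload type is a
made-up value and the capability-to-parameters conversion is simplified:

    // "Turn on SVC encoding": pick the scalable codec, declare nothing
    // about layers, and let the encoder decide what to use at any moment.
    const caps = RTCRtpSender.getCapabilities("video");
    const h264uc = caps.codecs.find(c => c.name === "H264UC");
    if (h264uc) {
      sender.send({
        codecs: [{ name: h264uc.name, payloadType: 107,
                   clockRate: h264uc.clockRate }],
        encodings: [{}]  // no dependencyEncodingIds needed
      });
    }

Under the second interpretation, the encodings structure sketched earlier is instead treated as a ceiling: the browser may
encode fewer layers than are declared at any given instant.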

________________________________________
From: Sergio Garcia Murillo [sergio.garcia.murillo@gmail.com]
Sent: Friday, September 1, 2017 1:45 AM
To: public-webrtc@w3.org
Subject: Temporal and spatial scalability support on RTCRtpEncodingParameters

Hello all,

I assume this topic has already been discussed, but I have been
digging through the mailing list and the issue tracker and found no
reference to it.

Are there any plans to support setting temporal and spatial scalability
properties on RTCRtpEncodingParameters, at least for SRST? Currently
VP8 temporal scalability is supported by Chrome (not sure about FF) when
doing simulcast, but there will be no way of controlling it from the
app side once transceivers are implemented.

Best regards

Sergio

Received on Saturday, 16 September 2017 21:54:54 UTC