"Automatic" use of scalable video coding? from Bernard Aboba on 2014-09-10 (public-ortc@w3.org from September 2014)

From: Bernard Aboba <Bernard.Aboba@microsoft.com>
Date: Wed, 10 Sep 2014 22:00:57 +0000
To: "public-ortc@w3.org" <public-ortc@w3.org>
Message-ID: <b921f058b5c24fde980ec4d7625d9cfa@SN2PR03MB031.namprd03.prod.outlook.com>

Currently within the ORTC API specification, Section 9.9 provides examples of how to set up encoding parameters to support simulcast and/or scalable video coding (SVC).

While some developers might be interested in controlling exactly how simulcast and SVC are used in their applications, other developers would probably be happier if they could leave that to the browser. Looking at the current specification, it appears to me that SVC configuration within an RTCRtpReceiver might be unnecessary for some video codecs, and in addition, it might be possible to dispense with SVC configuration within an RTCRtpSender in some cases as well.

Below is my understanding of how "automatic" use of SVC works on the RTCRtpReceiver and how it might work on an RTCRtpSender. Comments/corrections/suggestions welcome.

In situations where a compliant decoder can decode any valid encoding, it would appear to me that it is not necessary to set up the SVC configuration within RTCRtpReceiver.receive(). Essentially, the decision whether to utilize scalable video coding can be left to the sender. If the receiver can handle anything that the sender can send, there isn't even a need for negotiation, such as an exchange of capabilities.

To give a practical example, if a VP8 decoder can decode any valid VP8 encoding, including temporal scalability, it seems to me that an RTCRtpReceiver would not need to specify an SVC layering configuration within RTCRtpEncodingParameters. In the event that a layering configuration is provided (e.g. two temporal layers are expected), the RTCRtpSender should still be free to send something else (e.g. maybe only 1 temporal layer, or perhaps 3) without a resulting error. So it seems to me that for the RTCRtpReceiver, configuration of SVC layering is somewhat extraneous. Also, I'm not clear about the usefulness of having RTCRtpReceiver.getCapabilities return a value for RTCRtpCodecCapability.maxTemporalLayers. Where maxTemporalLayers is not set, the default interpretation could be "I can handle the maximum temporal layers supported by the codec."
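
To make the proposed default concrete, here is a minimal TypeScript sketch of how an application might interpret an unset maxTemporalLayers. The CodecCapability interface is a cut-down local stand-in (not the full RTCRtpCodecCapability), and effectiveMaxTemporalLayers and its codecLimit parameter are hypothetical names introduced only for illustration.

```typescript
// Cut-down local stand-in for the capability discussed above (assumption:
// only the fields needed for this sketch are modelled).
interface CodecCapability {
  name: string;
  // Unset means "whatever the codec supports" under the proposed default.
  maxTemporalLayers?: number;
}

// Hypothetical helper. codecLimit is the maximum number of temporal layers
// the codec itself allows; the caller supplies it, it is not read from any API.
function effectiveMaxTemporalLayers(cap: CodecCapability, codecLimit: number): number {
  if (cap.maxTemporalLayers === undefined) {
    return codecLimit; // no receiver-imposed limit
  }
  return Math.min(cap.maxTemporalLayers, codecLimit);
}
```

With this reading, a receiver that says nothing imposes no limit of its own, which matches the idea that the decision can be left to the sender.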

For an RTCRtpSender, it does seem useful for the developer to be able to indicate whether to use temporal scalability or not. For example, in peer-to-peer communication the overhead of SVC might not make sense, so it might be useful to be able to specify only a single layer in RTCRtpSender.send(). On the other hand, there might be situations where the developer would just as soon leave the decision to use SVC up to the browser. Rather than trying to adjust the number of temporal layers within the application, the browser could decide how many layers might make sense.
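
As a rough illustration of the explicit case, the TypeScript sketch below builds the encodings an application might hand to the sender: one layer for peer-to-peer, several temporal layers otherwise. EncodingParameters is a local stand-in with only the fields used here, temporalEncodings is a hypothetical helper, and the choice to have each enhancement layer list every lower layer in dependencyEncodingIds is an assumption rather than something the specification text above states.

```typescript
// Local stand-in carrying only the fields used in this sketch.
interface EncodingParameters {
  encodingId: string;
  dependencyEncodingIds?: string[];
  active?: boolean;
}

// Hypothetical helper: build a chain of `layers` temporal layers.
// Assumption: each enhancement layer depends on all lower layers.
function temporalEncodings(layers: number): EncodingParameters[] {
  if (Math.floor(layers) !== layers || layers < 1) {
    throw new RangeError("layers must be a positive integer");
  }
  const encodings: EncodingParameters[] = [];
  for (let i = 0; i < layers; i++) {
    const enc: EncodingParameters = { encodingId: String(i), active: true };
    if (i > 0) {
      // Collect the ids of the layers built so far.
      enc.dependencyEncodingIds = encodings.map(function (e) { return e.encodingId; });
    }
    encodings.push(enc);
  }
  return encodings;
}

const peerToPeer = temporalEncodings(1); // single layer, no SVC overhead
const conference = temporalEncodings(3); // base layer plus two enhancement layers
```

The point of the sketch is that the application has to pick the layer count itself; nothing here lets it defer that choice to the browser.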

Currently within RTCRtpEncodingParameters, it doesn't appear that a developer can indicate to the browser "Send SVC if you think it is useful". For example, within the RTCRtpEncodingParameters dictionary there is no "maxTemporalLayers" attribute. All you have is encodingId and a sequence of dependencyEncodingIds.

dictionary RTCRtpEncodingParameters {
    unsigned long       ssrc;
    payloadtype         codecPayloadType;
    RTCRtpFecParameters fec;
    RTCRtpRtxParameters rtx;
    double              priority = 1.0;
    double              maxBitrate;
    double              minQuality = 0;
    double              framerateBias = 0.5;
    double              resolutionScale;
    double              framerateScale;
    boolean             active = true;
    DOMString           encodingId;
    sequence<DOMString> dependencyEncodingIds;
};
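
For discussion purposes only, here is a TypeScript sketch of what a "send SVC if you think it is useful" hint might look like. The maxTemporalLayers member on the encoding is hypothetical (it is not in the dictionary above), and resolveTemporalLayers with its browserPreference argument is an invented stand-in for whatever decision the browser would make internally.

```typescript
// Hypothetical extension: maxTemporalLayers is NOT part of
// RTCRtpEncodingParameters as currently specified.
interface AutoSvcEncoding {
  encodingId?: string;
  // Assumed semantics: an upper bound; unset leaves the choice to the browser.
  maxTemporalLayers?: number;
}

// Invented stand-in for the browser's choice. browserPreference is the
// number of temporal layers the browser would pick if unconstrained.
function resolveTemporalLayers(enc: AutoSvcEncoding, browserPreference: number): number {
  const cap = enc.maxTemporalLayers;
  if (cap === undefined) {
    return browserPreference; // "automatic": browser decides
  }
  return Math.max(1, Math.min(cap, browserPreference));
}
```

Under these assumed semantics, { maxTemporalLayers: 1 } would express "no temporal scalability", while omitting the member would express "browser's choice".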

Received on Wednesday, 10 September 2014 22:01:27 UTC