
Re: Issue 263: Contributing source(s) missing "voice" activity flag

From: Bernard Aboba <Bernard.Aboba@microsoft.com>
Date: Sat, 26 Dec 2015 23:58:03 +0000
To: "public-ortc@w3.org" <public-ortc@w3.org>
Message-ID: <BLUPR03MB1492AACDFC576D9C575F4CDECF90@BLUPR03MB149.namprd03.prod.outlook.com>
I think there are two issues here: 

1.  How does an RtpSender turn on the "V" bit? Currently RTCRtpHeaderExtensionParameters has no support for extension mechanisms such as the "vad" extension (which can be "on" or "off" according to RFC 6464 Section 4).  In a situation where the SFU discards packets without the "V" bit set (since they don't include voice there is no need to forward them), it is necessary for the browser to be able to set the "vad" extension to "on". 

Do we need to add parameters to RTCRtpHeaderExtensionParameters to allow header extension parameters such as the "vad" parameter to be set?  For example: 

partial dictionary RTCRtpHeaderExtensionParameters {
     Dictionary parameters;
};

Similarly, do we need to add capabilities to RTCRtpHeaderExtension to indicate what header extension parameters are supported? 

partial dictionary RTCRtpHeaderExtension {
     Dictionary parameters;
};

2. Is there value in adding support for the "V" bit to the RTCRtpContributingSource dictionary?

Since the "V" bit does not exist in RFC 6465 (mixer to client extension), this question only arises for the peer-to-peer case, where the client-mixer extension (RFC 6464) is used. 

If the use case is to set a level indicator, I don't think the "V" bit is valuable - regardless of whether the bit is on/off, the browser would just use the audioLevel value to indicate the level of energy coming from that peer. 
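To illustrate the level-indicator point, here is a minimal sketch of a meter that depends only on the audio level and never consults the "V" bit. It assumes audioLevel carries the RFC 6464 value (0 to 127, in -dBov, where 0 is loudest); the function name meterFraction and the linear-in-dB mapping are illustrative choices, not part of any specification.

```typescript
// Hypothetical helper: map an RFC 6464 style audio level (0..127, in -dBov,
// where 0 is the loudest and 127 is silence) to a meter fill fraction in 0..1.
// The mapping is linear in dB, which is an assumption made for illustration.
function meterFraction(audioLevel: number): number {
  // Clamp defensively so out-of-range input cannot produce a bad fraction.
  const clamped = Math.min(127, Math.max(0, audioLevel));
  return 1 - clamped / 127;
}
```

A caller would feed it the latest level for each contributing source and redraw the indicator, regardless of whether voice activity was flagged.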

Robin Raymond filed Issue 263: 

There is audio level and csrc/ssrc but no flag indicating voice which is contained within the packet.

NOTE: This is only contained in "client to mixer" extension so maybe it should not be added but peer to peer this would be available so there might be some value in exposing this value to the programmer. Could be "unset" for "mixer to client" obtained values and "set" for "client to mixer" obtained values (i.e. when value arrives peer to peer).


From RFC 6464 Section 3:

In addition, a flag bit (labeled "V") optionally indicates whether
 the encoder believes the audio packet contains voice activity. If
 the V bit is in use, the value 1 indicates that the encoder believes
 the audio packet contains voice activity, and the value 0 indicates
 that the encoder believes it does not. (The voice activity detection
 algorithm is unspecified and left implementation-specific.) If the V
 bit is not in use, its value is unspecified and MUST be ignored by
 receivers. The use of the V bit is signaled using the extension
 attribute "vad", discussed in Section 4.
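As a rough sketch of what the text above implies for a receiver: in the one-byte form of the extension, the "V" flag is the most significant bit and the level occupies the remaining seven bits. The names AudioLevelInfo and parseAudioLevelByte below are hypothetical, and the vadInUse argument is assumed to come from whatever negotiation established the "vad" attribute.

```typescript
// Decoded view of the one-byte ssrc-audio-level payload (RFC 6464 Section 3).
interface AudioLevelInfo {
  // Audio level in -dBov: 0 is the loudest, 127 is silence.
  level: number;
  // true or false when the V bit is in use; undefined when it must be ignored.
  voiceActivity: boolean | undefined;
}

// Hypothetical helper. `vadInUse` is an assumed input reflecting the
// negotiated "vad" extension attribute ("vad=on" means true).
function parseAudioLevelByte(payload: number, vadInUse: boolean): AudioLevelInfo {
  if (!Number.isInteger(payload) || payload < 0 || payload > 0xff) {
    throw new RangeError("payload must be a single byte (0..255)");
  }
  const level = payload & 0x7f;        // low seven bits carry the level
  const vBit = (payload & 0x80) !== 0; // most significant bit is the V flag
  // When the V bit is not in use, receivers MUST ignore its value.
  return { level, voiceActivity: vadInUse ? vBit : undefined };
}
```

Returning undefined rather than false when the bit is not in use keeps "no voice detected" distinct from "no information", which is the distinction Issue 263 is concerned with.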

From RFC 6464 Section 4:

The URI for declaring this header extension in an extmap attribute is
 "urn:ietf:params:rtp-hdrext:ssrc-audio-level".

It has a single extension attribute, named "vad". It takes the form
 "vad=on" or "vad=off". If the header extension element is signaled
 with "vad=on", the V bit described in Section 3 is in use, and MUST
 be set by senders. If the header extension element is signaled with
 "vad=off", the V bit is not in use, and its value MUST be ignored by
 receivers. If the vad extension attribute is not specified, the
 default is "vad=on".

An example attribute line in the Session Description Protocol (SDP)
 for a conference might hence be:
  a=extmap:6 urn:ietf:params:rtp-hdrext:ssrc-audio-level vad=on

The vad extension attribute only controls the semantics of this
 header extension attribute, and does not make any statement about
 whether the sender is using any other voice activity detection
 features, such as discontinuous transmission, comfort noise, or
 silence suppression.
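
The "vad" rules quoted above, including the default of "vad=on", could be applied to an extmap line roughly as follows. This is only a sketch: vadInUseFromExtmap is a hypothetical helper, and it assumes a single well-formed attribute line with whitespace-separated tokens rather than a full SDP parser.

```typescript
const AUDIO_LEVEL_URI = "urn:ietf:params:rtp-hdrext:ssrc-audio-level";

// Hypothetical helper: reports whether the V bit is in use for one extmap
// line, or undefined if the line does not declare ssrc-audio-level.
function vadInUseFromExtmap(line: string): boolean | undefined {
  // Expected shape: "a=extmap:<id> <uri> [extension attributes...]"
  const [head, uri, ...attrs] = line.trim().split(/\s+/);
  if (head === undefined || !head.startsWith("a=extmap:") || uri !== AUDIO_LEVEL_URI) {
    return undefined;
  }
  for (const attr of attrs) {
    if (attr === "vad=on") return true;
    if (attr === "vad=off") return false;
  }
  // RFC 6464 Section 4: if "vad" is not specified, the default is "vad=on".
  return true;
}
```

For the example line above, vadInUseFromExtmap("a=extmap:6 urn:ietf:params:rtp-hdrext:ssrc-audio-level vad=on") yields true, so a sender would be required to set the V bit.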
Received on Saturday, 26 December 2015 23:58:36 UTC
