RE: Issue #4 (Need API to read the CSRC on received tracks) and Issue #6 (Need API to receive mixer to client audio level information for a track)

Here is an updated proposal containing clarifications for the time and RTCRtpContributingSource questions.   There is still an open question relating to the time interval. 

partial interface RTCRtpReceiver {
    sequence<RTCRtpContributingSource> getContributingSources ();
};

getContributingSources

Returns an RTCRtpContributingSource object for each unique CSRC or SSRC received by this RTCRtpReceiver since {the last time this method was called || the last 10 seconds || the last second}.
No parameters.
Return type: sequence<RTCRtpContributingSource>

dictionary RTCRtpContributingSource

The RTCRtpContributingSource object contains information about a contributing source. Each time an RTP packet is received, the RTCRtpContributingSource objects are updated. If the RTP packet contains CSRCs, then the RTCRtpContributingSource objects corresponding to those CSRCs are updated, and the level values for those CSRCs are updated based on the mixer-to-client header extension [RFC6465] if present. If the RTP packet contains no CSRCs, then the RTCRtpContributingSource object corresponding to the SSRC is updated, and the level value for the SSRC is updated based on the client-to-mixer header extension [RFC6464] if present. If the RTP packet does not contain a client-to-mixer header extension, then the browser will compute the level value as described in [RFC6464] and will provide that.
interface RTCRtpContributingSource {
    readonly    attribute DOMHighResTimeStamp timestamp;
    readonly    attribute unsigned long     source;
    readonly    attribute byte?             audioLevel;
};
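
To make the per-packet update rule above concrete, here is a rough sketch in TypeScript. It is not part of the proposal: the SourceEntry and ReceivedRtpPacket shapes, the updateSources function, and the idea that levels arrive already parsed as values from 0 to -127 are all assumptions made for illustration. The sketch also leaves the level null when no extension is present, whereas the proposal has the browser compute it.

```typescript
// Hypothetical shapes for illustration only. The attribute names
// (timestamp, source, audioLevel) follow the proposal; everything else is assumed.
interface SourceEntry {
  timestamp: number;          // reception time, ms since the UNIX epoch
  source: number;             // CSRC or SSRC value
  audioLevel: number | null;  // 0 (loudest) to -127, or null if not known
}

interface ReceivedRtpPacket {
  ssrc: number;
  csrcs: number[];
  csrcLevels?: number[];      // one per CSRC, from the RFC 6465 extension if present
  ssrcLevel?: number;         // from the RFC 6464 extension if present
}

function updateSources(
  table: Map<number, SourceEntry>,
  packet: ReceivedRtpPacket,
  receivedAtMs: number,
): void {
  if (packet.csrcs.length > 0) {
    // Packet came through a mixer: update one entry per CSRC.
    packet.csrcs.forEach((csrc, i) => {
      const levels = packet.csrcLevels;
      const level = levels !== undefined && i < levels.length ? levels[i] : null;
      table.set(csrc, { timestamp: receivedAtMs, source: csrc, audioLevel: level });
    });
  } else {
    // No CSRCs: update the entry for the SSRC itself.
    // Simplification: the proposal has the browser compute a level when the
    // extension is absent; this sketch records null instead.
    const level = packet.ssrcLevel !== undefined ? packet.ssrcLevel : null;
    table.set(packet.ssrc, { timestamp: receivedAtMs, source: packet.ssrc, audioLevel: level });
  }
}
```

Under these assumptions, getContributingSources would return the entries of such a table that fall within whichever time interval is eventually chosen.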

6.4.1 Attributes
audioLevel of type byte, readonly , nullable
The audio level contained in the last RTP packet received from this source. If the source was set from an SSRC, this will be the level value in [RFC6464]. If the source was set from a CSRC, this will be the level value in [RFC6465]. Both [RFC6464] and [RFC6465] define the level as an integral value from 0 to -127 representing the audio level in decibels relative to the loudest signal that the system could possibly encode.
source of type unsigned long, readonly 
The CSRC or SSRC value of the contributing source.
timestamp of type DOMHighResTimeStamp, readonly
Time of reception of the most recent RTP packet containing the contributing source. The time (obtained on the user agent) is relative to the UNIX epoch (Jan 1, 1970, UTC).
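
As an illustration of how an application might consume audioLevel, the sketch below converts the decibel value into a linear amplitude ratio using the usual 10^(dB/20) relation, for example to drive a level meter. The function name audioLevelToLinear is made up for this example, and it assumes the value is exposed as 0 to -127 as described above.

```typescript
// Hypothetical helper, not part of the proposal.
// Converts a level in decibels relative to the loudest encodable signal
// (0 = loudest, -127 = quietest) into a linear amplitude ratio in (0, 1].
function audioLevelToLinear(audioLevel: number | null): number | null {
  if (audioLevel === null) {
    return null; // no level information available for this source
  }
  return Math.pow(10, audioLevel / 20);
}
```

With this mapping a level of 0 gives a ratio of 1, and every 20 dB drop divides the ratio by 10.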

-----Original Message-----
From: Bernard Aboba [mailto:Bernard.Aboba@microsoft.com] 
Sent: Saturday, September 19, 2015 1:45 PM
To: Harald Alvestrand <harald@alvestrand.no>; public-webrtc@w3.org
Subject: RE: Issue #4 (Need API to read the CSRC on received tracks) and Issue #6 (Need API to receive mixer to client audio level information for a track)

Some questions: 

1. What is the purpose of the timestamp? 

If the RTCRtpContributingSource dictionary only provides information on RTP packets received in the last second, the purpose isn't clear to me. Do we really care how many ms ago the last RTP packet arrived?

The timestamp was originally put in when the RTCRtpContributingSource dictionary contained only source and timestamp attributes, and the dictionary was to provide information on sources that might have been received over a much longer interval (e.g. since the RTCRtpReceiver was created). In that formulation, the timestamp enabled the application to filter out sources that had not spoken recently (with the interval defined by the application).
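
For reference, the application-side filtering described in that earlier formulation could look roughly like the following. This is only a sketch: recentlyActive and its parameters are hypothetical names, and it assumes timestamps are in milliseconds on the same clock as the "now" value passed in.

```typescript
// Hypothetical helper, not part of the proposal.
// Keeps only the sources whose last packet arrived within windowMs of nowMs.
function recentlyActive<T extends { timestamp: number }>(
  sources: T[],
  nowMs: number,
  windowMs: number,
): T[] {
  return sources.filter((s) => nowMs - s.timestamp <= windowMs);
}
```

An application that wants "spoke in the last 2 seconds" would call recentlyActive(sources, now, 2000), where now is read from the same clock the user agent uses for the timestamp attribute.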

2. Does it really make sense to include both SSRC and CSRC RTCRtpContributingSource dictionary entries? 

In the case with a centralized mixer, the goal is to provide information on contributing sources and levels taken from the mixer-to-client (RFC 6465) extension. The overall (mixed) audio level (calculated as per RFC 6464) and the SSRC do not seem relevant, since these relate to the mixer, and not to an individual participant.

In the case of a P2P mesh, the SSRC and client-to-mixer extension (RFC 6464) provide the information that can be used in the UI. In that case there are no CSRCs or mixer-to-client header extensions, since there is no mixer.


________________________________________
From: Harald Alvestrand [harald@alvestrand.no]
Sent: Sunday, September 13, 2015 1:08 PM
To: public-webrtc@w3.org
Subject: Re: Issue #4 (Need API to read the CSRC on received tracks) and Issue #6 (Need API to receive mixer to client audio level information for a track)

On 09/11/2015 12:37 PM, Martin Thomson wrote:
> On 11 September 2015 at 12:07, Bernard Aboba 
> <Bernard.Aboba@microsoft.com> wrote:
>
<snip>
> 3.    How is the timestamp value obtained?  Is this the timestamp value from
> the RTP packet?  Or is it an NTP timestamp?
> That's a good question.  And part of why I wasn't particularly happy 
> with the timestamp.  HighResTimestamp is usually something that is 
> local to the browser, so some translation would be required if the 
> time was based on the NTP time.  That said, the NTP time from the 
> packet isn't going to relate well to the playback time, but it might 
> be the best option, since packet arrival time doesn't really match to 
> anything of use.  Do we have any way to determine what the difference 
> is between the NTP times used in RTP and the playback time?
The RTP timestamp value is a reasonable choice because it is:
- Defined for all packets
- Universally known to not have global meaning

The second alternative is the NTP translation of the RTP timestamp.
This is:
- Only defined for packets for which an RTCP SR has been received and parsed
- Related to the global "NTP time" through the sender's clock configuration - which means that it will have its obvious meaning *most* of the time (not a good thing).

The third alternative is that it's the HighResTimestamp of the time the browser thinks it received the packets. This is:
- Defined for all packets
- Related to the local system clock, which makes it consistent with most APIs the client is likely to access
- Somewhat variable offset from the time the sender thinks it sent the packet - so it's not likely to be meaningful to send it to other clients (if there was ever a reason for that).

I'm leaning towards going with the third option. It's simplest to implement.
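
To show why the second alternative is only defined once an RTCP SR has arrived, here is a rough sketch of the translation. The SenderReport shape and the rtpToSenderNtpMs function are hypothetical names for this example; it assumes the SR's NTP time has already been converted to milliseconds and that the RTP clock rate is known from the payload type.

```typescript
// Hypothetical shape: the NTP/RTP timestamp pair carried in the latest RTCP SR.
interface SenderReport {
  ntpMs: number;         // sender NTP time from the SR, already converted to ms
  rtpTimestamp: number;  // 32-bit RTP timestamp that corresponds to ntpMs
}

// Maps a packet's RTP timestamp onto the sender's NTP timeline.
// Returns null when no SR has been received yet, since no mapping exists.
function rtpToSenderNtpMs(
  rtpTimestamp: number,
  sr: SenderReport | null,
  clockRateHz: number,
): number | null {
  if (sr === null) {
    return null;
  }
  // "| 0" reduces the difference to a signed 32-bit value, which handles
  // RTP timestamp wraparound and packets slightly older than the SR.
  const deltaTicks = (rtpTimestamp - sr.rtpTimestamp) | 0;
  return sr.ntpMs + (deltaTicks / clockRateHz) * 1000;
}
```

The result is only as meaningful as the sender's clock configuration, which is the concern raised above. The third option needs none of this, since the reception time is simply read from the local clock.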



--
Surveillance is pervasive. Go Dark.

Received on Saturday, 19 September 2015 21:48:17 UTC