- From: Henrik Boström <hbos@google.com>
- Date: Tue, 5 Sep 2017 11:44:08 +0200
- To: public-webrtc@w3.org
- Message-ID: <CAEbRw2xeO2V1_D6VBjJ4Jg0jJOwmQiZAfojk-K6-DaZKANCMWA@mail.gmail.com>
Hi,

I've written a PR (#215 <https://github.com/w3c/webrtc-stats/pull/215>) for the Identifiers for WebRTC's Statistics API <https://w3c.github.io/webrtc-stats/> which has gone stale. The suggestion is to add a getStats() <https://w3c.github.io/webrtc-pc/#dom-rtcpeerconnection-getstats()> metric for audio tracks (RTCMediaStreamTrackStats).

There already exist totalSamplesReceived and concealedSamples <https://w3c.github.io/webrtc-stats/#dom-rtcmediastreamtrackstats-concealedsamples>, the latter defined as *"The total number of inbound audio samples that are concealed samples. A concealed sample is a sample that is based on data that was synthesized to conceal packet loss and does not represent incoming data."*

Concealment can occur because packets are lost while someone is speaking, or it can insert "silent" packets into the stream when packets are lost while the stream contains only silence or background noise. To differentiate between the two, the suggested new metric is concealedAudibleSamples:

*Only present for inbound audio tracks. The total number of concealed audio samples (see concealedSamples) that were played out during audible portions of the stream. Audible means that the received audio is not considered background noise or silence by the user agent. It is up to the implementation to determine what is considered background noise, but concealments of audible samples SHOULD in general have a greater impact on user experience than concealment of non-audible samples. If the voice activity flag is present in RTP packets as per [[RFC6464]] this MAY be used to indicate audibility. Audibility MAY also be based on audio levels or more sophisticated analysis of the stream.*

The problem with this metric is that there is no standard way to determine what is or is not considered background noise, so it would be implementation-specific. The risk is that different browsers implement it to mean different things.

Still, the definition gives some guidance as to what it is supposed to mean, and when analyzing call quality it would be useful to know whether concealment events occurred during "audible" or "inaudible" portions of the stream, even if this involves some guesswork on the part of the implementation.

Does anyone have an opinion about this? Feel free to comment on the PR.

Cheers,
/Henrik (henbos on github)
Received on Tuesday, 5 September 2017 09:44:37 UTC
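
To illustrate how the proposed metric might be consumed, here is a minimal TypeScript sketch that compares two snapshots of the audio-track counters and splits concealment into audible and inaudible parts. The interface and function names (AudioTrackCounters, ConcealmentBreakdown, concealmentBreakdown) are hypothetical and not part of any specification. concealedAudibleSamples is only the metric proposed in the PR, so the sketch treats it as optional, and it assumes that totalSamplesReceived counts concealed samples as well.

```typescript
// Cumulative counters as they might be read from two getStats() snapshots.
// concealedAudibleSamples is the metric proposed in the PR; it is optional
// here because an implementation may not expose it.
interface AudioTrackCounters {
  totalSamplesReceived: number;
  concealedSamples: number;
  concealedAudibleSamples?: number;
}

// Hypothetical result shape, introduced only for this illustration.
interface ConcealmentBreakdown {
  concealedRatio: number;             // concealed / total over the interval
  audibleConcealedRatio?: number;     // audible concealed / total over the interval
  inaudibleConcealedSamples?: number; // concealed minus audible concealed
}

// Computes interval deltas between two snapshots of the cumulative counters.
// Assumption: totalSamplesReceived includes concealed samples.
function concealmentBreakdown(
  prev: AudioTrackCounters,
  curr: AudioTrackCounters
): ConcealmentBreakdown {
  const total = curr.totalSamplesReceived - prev.totalSamplesReceived;
  const concealed = curr.concealedSamples - prev.concealedSamples;
  const result: ConcealmentBreakdown = {
    concealedRatio: total > 0 ? concealed / total : 0,
  };
  // Only report the audible split when both snapshots carry the proposed metric.
  if (
    prev.concealedAudibleSamples !== undefined &&
    curr.concealedAudibleSamples !== undefined
  ) {
    const audible = curr.concealedAudibleSamples - prev.concealedAudibleSamples;
    result.audibleConcealedRatio = total > 0 ? audible / total : 0;
    result.inaudibleConcealedSamples = concealed - audible;
  }
  return result;
}

// Example: two snapshots taken some time apart (made-up numbers).
const earlier: AudioTrackCounters = {
  totalSamplesReceived: 48000,
  concealedSamples: 960,
  concealedAudibleSamples: 480,
};
const later: AudioTrackCounters = {
  totalSamplesReceived: 96000,
  concealedSamples: 2880,
  concealedAudibleSamples: 960,
};
console.log(concealmentBreakdown(earlier, later));
```

Because what counts as "audible" is left to the implementation, the audible split computed this way would only be comparable between calls handled by the same user agent.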