Re: active speaker information in mixed streams from Justin Uberti on 2014-01-29 (public-orca@w3.org from January 2014)

From: Justin Uberti <juberti@google.com>
Date: Tue, 28 Jan 2014 17:51:12 -0800
To: Peter Thatcher <pthatcher@google.com>
Cc: Roman Shpount <rshpount@turbobridge.com>, Emil Ivov <emcho@jitsi.org>, "public-orca@w3.org" <public-orca@w3.org>
Message-ID: <CAOJ7v-2pdKEEzvZn62t6mmp-FSEbzhwEW2UzWySOPaPC_BVngg@mail.gmail.com>
ContributingSource is probably clearer.
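
As a rough illustration of the polling approach discussed in the quoted thread, here is a minimal TypeScript sketch. It assumes the RtpContributingSource shape proposed there (csrc plus audioLevel) and a getContributingSources() method on the receiver. The ContributingSourceReceiver interface, the pickActiveSpeaker and pollActiveSpeaker helpers, and the assumption that audioLevel follows the RFC 6465 scale (0 is loudest, 127 is silence) are all hypothetical and not part of any agreed API.

```typescript
// Shape proposed in the thread; an agreed API may differ.
interface RtpContributingSource {
  csrc: number;
  audioLevel: number; // assumed RFC 6465 scale: 0 = loudest, 127 = silence
}

// Hypothetical receiver surface, limited to the one method under discussion.
interface ContributingSourceReceiver {
  getContributingSources(): RtpContributingSource[];
}

// Returns the CSRC with the lowest (loudest) level, or null if all are silent.
function pickActiveSpeaker(
  sources: RtpContributingSource[],
  silenceLevel = 127,
): number | null {
  let best: RtpContributingSource | null = null;
  for (const s of sources) {
    if (s.audioLevel >= silenceLevel) continue; // skip silent contributors
    if (best === null || s.audioLevel < best.audioLevel) best = s;
  }
  return best === null ? null : best.csrc;
}

// Polls at the caller's chosen interval and reports only when the speaker changes.
// Returns a function that stops the polling.
function pollActiveSpeaker(
  receiver: ContributingSourceReceiver,
  intervalMs: number,
  onChange: (csrc: number | null) => void,
): () => void {
  let last: number | null = null;
  const timer = setInterval(() => {
    const current = pickActiveSpeaker(receiver.getContributingSources());
    if (current !== last) {
      last = current;
      onChange(current);
    }
  }, intervalMs);
  return () => clearInterval(timer);
}
```

The application picks the interval (for example 200 ms), so no per-packet event is needed.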


On Tue, Jan 28, 2014 at 5:32 PM, Peter Thatcher <pthatcher@google.com> wrote:

> Polling is fine with me.  What about calling it RtpContributingSource?
> Do you prefer that or MixerInfo?
>
> On Tue, Jan 28, 2014 at 5:29 PM, Justin Uberti <juberti@google.com> wrote:
> > I don't think it needs to be an event. Just poll it at the frequency you
> > care about.
> >
> >
> > On Tue, Jan 28, 2014 at 5:26 PM, Peter Thatcher <pthatcher@google.com>
> > wrote:
> >>
> >> On Tue, Jan 28, 2014 at 5:21 PM, Roman Shpount <rshpount@turbobridge.com>
> >> wrote:
> >> > Would it make more sense to generalize RtpContributingSource to define
> >> > a list of RTP header extensions and trigger an event every time the
> >> > value set changes:
> >> >
> >> > dictionary RtpHeaderExtension {
> >> >   unsigned short id;
> >> >   ArrayBuffer value;
> >> > }
> >> >
> >> > dictionary RtpContributingSource {
> >> >   unsigned int csrc;
> >> >   sequence<RtpHeaderExtension> headerExtensions;
> >> > }
> >> >
> >> > This way it is not limited to audio level only.
> >> >
> >>
> >> Like Justin said, it's getting quite low-level at that point.  It's
> >> not much different than my "give JS access to every packet" event.
> >>
> >> > This being said, the only problem I see with all of this is that there
> >> > are scenarios (like audio level) where this event will be triggered for
> >> > every packet. This will not scale for server-side applications of ORCA.
> >> >
> >>
> >> Since we only care about the latest values, can't we just throttle how
> >> often the event is fired?  Say, every 200ms?
> >>
> >> > _____________
> >> > Roman Shpount
> >> >
> >> >
> >> > On Tue, Jan 28, 2014 at 7:56 PM, Peter Thatcher <pthatcher@google.com>
> >> > wrote:
> >> >>
> >> >> Yes, it's pretty low-level.  For this particular use case, what you
> >> >> have is better, although I'm not sure I'd like calling it "MixerInfo".
> >> >> How about just calling them "contributing sources"?
> >> >>
> >> >> dictionary RtpContributingSource {
> >> >>   unsigned int csrc;
> >> >>   int audioLevel;
> >> >> }
> >> >>
> >> >> partial interface RtpReceiver {
> >> >>   sequence<RtpContributingSource> getContributingSources();
> >> >> }
> >> >>
> >> >>
> >> >> Also, is it enough to require JS to poll?  Why not have an event for
> >> >> when the values change?
> >> >>
> >> >> partial interface RtpReceiver {
> >> >>    // Gets sequence<RtpContributingSource>
> >> >>    attribute EventHandler? oncontributingsources;
> >> >> }
> >> >>
> >> >>
> >> >> Even so, would it still be worth it to have low-level header extension
> >> >> access?  It might be handy when an application wants a proprietary
> >> >> header extension sent from their "mixer".  On the other hand, one
> >> >> could probably just use the data channel, like I suggested earlier :).
> >> >>
> >> >> By the way, the ease with which you put this on the RtpReceiver does
> >> >> show what an advantage it is to have it.
> >> >>
> >> >>
> >> >> On Tue, Jan 28, 2014 at 4:21 PM, Justin Uberti <juberti@google.com>
> >> >> wrote:
> >> >> > Having to mine through the raw packets feels like a pretty low-level
> >> >> > API to me.
> >> >> >
> >> >> > I was thinking that one could interrogate the RtpReceiver object to
> >> >> > get data on the most recently seen CSRCs and their corresponding
> >> >> > energy levels. Something like
> >> >> >
> >> >> > dictionary RtpCsrcInfo {
> >> >> >   unsigned int csrc;
> >> >> >   int audioLevel;
> >> >> > }
> >> >> >
> >> >> > dictionary RtpMixerInfo {
> >> >> >   sequence<RtpCsrcInfo> csrcs;
> >> >> > }
> >> >> >
> >> >> > partial interface RtpReceiver {
> >> >> >   RtpMixerInfo getMixerInfo();
> >> >> > }
> >> >> >
> >> >> > or maybe just return a dictionary with CSRCs as keys and energy levels
> >> >> > as values.
> >> >> >
> >> >> >
> >> >> > On Tue, Jan 28, 2014 at 3:27 PM, Peter Thatcher
> >> >> > <pthatcher@google.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> I think it would be reasonable to add some access to header
> >> >> >> extensions and CSRCs in the RtpReceiver object.
> >> >> >>
> >> >> >>
> >> >> >> Would it make sense to have general access to such things by
> >> >> >> having general access to received packets?  It could be used like so:
> >> >> >>
> >> >> >> var receiver = new RtpReceiver(...);
> >> >> >> receiver.onpackets = function(packets) {
> >> >> >>   for (var i = 0; i < packets.length; i++) {
> >> >> >>     var packet = packets[i];
> >> >> >>     // Here you have access to
> >> >> >>     // packet.csrcs
> >> >> >>     // packet.headerExtensions
> >> >> >>   }
> >> >> >> }
> >> >> >>
> >> >> >> And defined like so:
> >> >> >>
> >> >> >> partial interface RtpReceiver {
> >> >> >>   // Gives a sequence of RtpPacket
> >> >> >>   // Fired in "batches" of packets.
> >> >> >>   attribute EventHandler? onpackets;
> >> >> >> }
> >> >> >>
> >> >> >> dictionary RtpPacket {
> >> >> >>   sequence<unsigned int> csrcs;
> >> >> >>   sequence<RtpHeaderExtension> headerExtensions;
> >> >> >> }
> >> >> >>
> >> >> >> dictionary RtpHeaderExtension {
> >> >> >>   unsigned short id;
> >> >> >>   ArrayBuffer value;
> >> >> >> }
> >> >> >>
> >> >> >>
> >> >> >> That might leave a bit of work for you to build on top of, but it
> >> >> >> would solve the "can I access header extension" issue once and for
> >> >> >> all.
> >> >> >>
> >> >> >> Would this meet your needs?
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> On Tue, Jan 28, 2014 at 2:51 PM, Emil Ivov <emcho@jitsi.org> wrote:
> >> >> >> > On Tue, Jan 28, 2014 at 11:41 PM, Peter Thatcher
> >> >> >> > <pthatcher@google.com>
> >> >> >> > wrote:
> >> >> >> >> I guess it could continue in both.  The ORCA CG might be quicker
> >> >> >> >> to integrate something into the API than the WebRTC WG.
> >> >> >> >>
> >> >> >> >> My question is the same: exactly what info do you want available
> >> >> >> >> in the JS?  The CSRCs?
> >> >> >> >
> >> >> >> > Same answer then: That would be CSRCs and/or audio level header
> >> >> >> > extensions as per RFC6465.
> >> >> >> >
> >> >> >> > Emil
> >> >> >> >
> >> >> >> > --
> >> >> >> > https://jitsi.org
> >> >> >> >
> >> >> >> >> On Tue, Jan 28, 2014 at 2:38 PM, Emil Ivov <emcho@jitsi.org>
> >> >> >> >> wrote:
> >> >> >> >>> I am not sure whether this discussion should only continue on
> >> >> >> >>> one of the lists, but until we figure that out I am going to
> >> >> >> >>> answer here as well.
> >> >> >> >>>
> >> >> >> >>> Sync isn't really the issue here. It's mostly about the fact
> >> >> >> >>> that the mixer is not a WebRTC entity. This means that it most
> >> >> >> >>> likely doesn't even know what SCTP is, it doesn't necessarily
> >> >> >> >>> have access to signalling and, above all, the mix is likely to
> >> >> >> >>> also contain audio from non-WebRTC endpoints. Using DataChannels
> >> >> >> >>> in such situations would likely turn out to be quite convoluted.
> >> >> >> >>>
> >> >> >> >>> Emil
> >> >> >> >>>
> >> >> >> >>> On Tue, Jan 28, 2014 at 10:18 PM, Peter Thatcher
> >> >> >> >>> <pthatcher@google.com> wrote:
> >> >> >> >>>> Over there, I suggested that you could simply send the audio
> >> >> >> >>>> levels over an unordered data channel.  If you're using one
> >> >> >> >>>> IceTransport/DtlsTransport pair for both your RTP and SCTP, it
> >> >> >> >>>> would probably stay very closely in sync.
> >> >> >> >>>>
> >> >> >> >>>> On Tue, Jan 28, 2014 at 5:44 AM, Emil Ivov <emcho@jitsi.org>
> >> >> >> >>>> wrote:
> >> >> >> >>>>> Hey all,
> >> >> >> >>>>>
> >> >> >> >>>>> I just posted this to the WebRTC list here:
> >> >> >> >>>>>
> >> >> >> >>>>> http://lists.w3.org/Archives/Public/public-webrtc/2014Jan/0256.html
> >> >> >> >>>>>
> >> >> >> >>>>> But I believe it's a question that is also very much worth
> >> >> >> >>>>> resolving for ORTC, so I am also asking it here:
> >> >> >> >>>>>
> >> >> >> >>>>> One requirement that we often bump against is the possibility
> >> >> >> >>>>> to extract active speaker information from an incoming *mixed*
> >> >> >> >>>>> audio stream. Acquiring the CSRC list from RTP would be a good
> >> >> >> >>>>> start. Audio levels as per RFC6465 would be even better.
> >> >> >> >>>>>
> >> >> >> >>>>> Thoughts?
> >> >> >> >>>>>
> >> >> >> >>>>> Emil
> >> >> >> >>>
> >> >> >> >>> --
> >> >> >> >>> https://jitsi.org
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> > --
> >> >> >> > Emil Ivov, Ph.D.                       67000 Strasbourg,
> >> >> >> > Project Lead                           France
> >> >> >> > Jitsi
> >> >> >> > emcho@jitsi.org                        PHONE: +33.1.77.62.43.30
> >> >> >> > https://jitsi.org                      FAX:   +33.1.77.62.47.31
> >> >> >>
> >> >> >
> >> >>
> >> >
> >
> >
>
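
For reference, the RFC 6465 mixer-to-client levels requested at the start of the thread could be mapped onto the proposed RtpContributingSource shape roughly as follows. This is a sketch under stated assumptions: `value` is taken to be the extension's data bytes only (without the ID/length header), carrying one byte per CSRC in CSRC-list order, with the top bit reserved and the low 7 bits holding the level in -dBov (0 is loudest, 127 is quietest). The parseMixerAudioLevels name is hypothetical.

```typescript
// Shape proposed in the thread; an agreed API may differ.
interface RtpContributingSource {
  csrc: number;
  audioLevel: number; // -dBov magnitude: 0 (loudest) to 127 (quietest)
}

// Pairs each CSRC with the level byte at the same index in the extension data.
// Extra CSRCs or extra bytes beyond the shorter of the two are ignored.
function parseMixerAudioLevels(
  csrcs: number[],
  value: ArrayBuffer,
): RtpContributingSource[] {
  const bytes = new Uint8Array(value);
  const count = Math.min(csrcs.length, bytes.length);
  const result: RtpContributingSource[] = [];
  for (let i = 0; i < count; i++) {
    // Mask off the reserved top bit; the low 7 bits carry the level.
    result.push({ csrc: csrcs[i], audioLevel: bytes[i] & 0x7f });
  }
  return result;
}
```

An application, or the browser itself, could run this once per polling interval on the latest packet rather than on every packet.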
Received on Wednesday, 29 January 2014 01:51:59 UTC