
Re: Specifying the audio buffer size

From: Charlie Kehoe <ckehoe@google.com>
Date: Mon, 04 May 2015 22:00:48 +0000
Message-ID: <CAGNr40qe3_N+afzDp746QNvWyuXni2Y6qQTeBJJneU1nqEXsqQ@mail.gmail.com>
To: Justin Uberti <juberti@google.com>, Harald Alvestrand <harald@alvestrand.no>
Cc: "public-media-capture@w3.org" <public-media-capture@w3.org>
Any additional thoughts here? The May 15th deadline is not too far away.


On Wed, Apr 22, 2015 at 4:59 PM Justin Uberti <juberti@google.com> wrote:

>
>
> On Tue, Apr 21, 2015 at 4:42 AM, Harald Alvestrand <harald@alvestrand.no>
> wrote:
>
>> On 21 April 2015 at 02:32, Charlie Kehoe wrote:
>> > Some applications involve listening to audio for a potentially extended
>> > period of time (with user consent, of course), and are not particularly
>> > latency-sensitive. An example would be the "Ok Google" hotwording
>> > available on the Chrome new tab page, or other types of continuous
>> > speech recognition. For these applications, a typical low-latency audio
>> > configuration can lead to excessive power usage. I've measured 20% CPU
>> > usage for audio capture in Chrome, for example.
>> >
>> > My proposed solution is to offer a way to change the audio buffer size.
>> > This enables a tradeoff between latency and power usage. For example, a
>> > member could be added to MediaTrackConstraintSet
>> > <http://w3c.github.io/mediacapture-main/getusermedia.html#dictionary-mediatrackconstraintset-members>:
>> >
>> > dictionary MediaTrackConstraintSet {
>> >    ...
>> >    audioBufferDurationMs of type ConstrainLong
>> > };
>> >
>> > This would be an integer number of milliseconds. Perhaps the name could
>> > mention latency instead (e.g. audioLatencyMs).
>> >
>> > How does this simple change sound?
>>
>> I'd prefer to actually look at where the thing is connected, and do the
>> configuration there.
>>
>> If it goes to a MediaStreamRecorder, that already has all the
>> information needed (chunk size).
>> If it goes to a PeerConnection, buffering may belong there, but I'm not
>> sure how to represent it, or where it makes sense
>> (.permissibleBufferDelay on an RTPSender? Perhaps..)
>>
>
> One could argue that this would also apply for things like resolution or
> sample rate, yet we allow those things to be specified as inputs to gUM.
>
> Part of the reason for this is that you have the track before you wire it
> up to something, which means you only learn about the downstream needs at
> some point in the future, so you may have to go reopen the device.
>
>
>>
>> If it's only which code path is chosen in the Google Chrome browser, I'd
>> prefer a constraint like "googLowLatencyPath = false"; this is an
>> implementation concern, not an architectural concern.
>>
>
> I don't think this is a code path question; it's a generic question of how
> often we should be reading data from the device, which applications could
> have vastly different, non-binary opinions on. An application that wants to
> do live music performance might choose to read the device every 2.5 ms, and
> send as 2.5 ms packetized Opus, whereas an application that is passively
> listening for commands might want to read the device 10x less often.
>
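
To make the trade-off above concrete, here is a minimal sketch of how often the device would be read for the two buffer sizes mentioned in the thread, and of what a constraints object might look like. It assumes the proposed member name audioBufferDurationMs, which is only a suggestion in this discussion and not part of any shipped API; the interface and helper names (ProposedAudioConstraints, readsPerSecond, buildAudioConstraints) are hypothetical and exist only for illustration.

```typescript
// Hypothetical constraint shape: `audioBufferDurationMs` is the name proposed
// in this thread; it is NOT a member of any shipped MediaTrackConstraintSet.
interface ProposedAudioConstraints {
  audioBufferDurationMs?: { ideal?: number; min?: number; max?: number };
}

// Number of device reads per second implied by a given buffer duration.
function readsPerSecond(bufferDurationMs: number): number {
  if (!(bufferDurationMs > 0)) {
    throw new RangeError("bufferDurationMs must be positive");
  }
  return 1000 / bufferDurationMs;
}

// Build the object an application might pass to getUserMedia if the proposed
// member existed (assumption; browsers do not define this member).
function buildAudioConstraints(
  bufferDurationMs: number
): { audio: ProposedAudioConstraints } {
  return { audio: { audioBufferDurationMs: { ideal: bufferDurationMs } } };
}

const liveMusicMs = 2.5;            // figure quoted in the thread
const hotwordMs = liveMusicMs * 10; // "10x less often" gives 25 ms

console.log(readsPerSecond(liveMusicMs)); // 400
console.log(readsPerSecond(hotwordMs));   // 40
console.log(JSON.stringify(buildAudioConstraints(hotwordMs)));
```

Under these assumptions the passive listener wakes up for 40 reads per second instead of 400. Note that the proposal describes an integer number of milliseconds, so a 2.5 ms buffer could not be expressed with it as written.
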
Received on Monday, 4 May 2015 22:01:17 UTC
