Re: getDisplayMedia() and system audio from Henrik Boström on 2018-09-06 (public-media-capture@w3.org from September 2018)

From: Henrik Boström <hbos@google.com>
Date: Thu, 6 Sep 2018 12:07:00 +0200
To: silviapfeiffer1@gmail.com
Cc: roman@telurix.com, Martin Thomson <martin.thomson@gmail.com>, Bernard Aboba <Bernard.Aboba@microsoft.com>, public-webrtc@w3.org, public-media-capture@w3.org
Message-ID: <CAEbRw2yqmpzsJ2BYgC=K++LLkQ80q3892dNLMVtOr1qFkkvd9A@mail.gmail.com>

WebAudio allows you to add audio tracks together, so if you have
getDisplayMedia-audio and your microphone-audio you can mix the two
together.
My reasoning is that your application should "know about" anything it is
playing out, so you can pick and choose which audio to include, and you
don't have to add the remote audio tracks to the mix. (By the same
reasoning you might be able to subtract the remote audio tracks from the
system audio and do the filtering yourself, but I don't want to go there:
it gets complicated, I wouldn't trust everything to be perfectly
aligned, and it would take a lot of unnecessary resources. Better
that getDisplayMedia supports the most common use case out of the box.)

As for getting the audio of a specific application I think this would
either be a privacy concern (we should not expose to JS what applications
are running on the system) or a complicated/confusing request for the user
to choose (pick-and-choose specific applications), not to mention the
complexity of the implementation to support this edge case. In any case,
why would you play out any audio on the speaker that you don't want to
share (other than avoiding the echo problem, which is already covered by
the proposal)?

Would the following work for you, Silvia?

let microphone = await navigator.mediaDevices.getUserMedia({audio: true});
// With the proposed excludeTabAudio constraint: includes iTunes and any other
// speaker sound, except for sounds played out by the application (e.g. remote
// participants).
let screenShare = await navigator.mediaDevices.getDisplayMedia(
    {audio: {excludeTabAudio: true}, video: true});
let mixedStream = (WebAudio code for "microphone + screenShare");
// Send mixedStream to the remote endpoint(s).
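
For illustration, the WebAudio mixing step could look roughly like the
sketch below. It assumes the standard Web Audio nodes
(createMediaStreamSource and createMediaStreamDestination); the helper
name mixAudioStreams is made up for this example and is not part of any
API or of the proposal.

```javascript
// Sketch: mix the audio of several MediaStreams into one MediaStream.
// `mixAudioStreams` is a hypothetical helper name, not a platform API.
function mixAudioStreams(audioContext, streams) {
  // Everything connected to this node is summed and exposed as a MediaStream.
  const destination = audioContext.createMediaStreamDestination();
  for (const stream of streams) {
    // createMediaStreamSource() throws if the stream has no audio track,
    // so skip streams without audio (e.g. a video-only capture).
    if (stream.getAudioTracks().length === 0) continue;
    audioContext.createMediaStreamSource(stream).connect(destination);
  }
  return destination.stream;
}

// Browser usage, with the streams obtained in the snippet above:
//   const audioContext = new AudioContext();
//   const mixedStream = mixAudioStreams(audioContext, [microphone, screenShare]);
```

Note that the mixed stream carries audio only, so the video track of
screenShare would still have to be sent alongside it.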

Alternatively you can send both microphone and screenShare to the remote
endpoint(s) and play them out at the same time, not doing any intermediate
mixing. Avoiding the WebAudio step is probably good for resources, I'm
guessing.
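
A minimal sketch of that alternative, assuming an already set up
RTCPeerConnection and the standard addTrack(track, stream) method; the
helper name addAllTracks is hypothetical and only used for illustration.

```javascript
// Sketch: send every track of several streams over one peer connection,
// without any intermediate mixing.
// `addAllTracks` is a hypothetical helper name, not a platform API.
function addAllTracks(peerConnection, streams) {
  const senders = [];
  for (const stream of streams) {
    for (const track of stream.getTracks()) {
      // Passing the stream associates the track with it, so the remote
      // side can group tracks by stream when its "track" event fires.
      senders.push(peerConnection.addTrack(track, stream));
    }
  }
  return senders;
}

// Browser usage, with the streams obtained earlier:
//   addAllTracks(peerConnection, [microphone, screenShare]);
```

The remote endpoint would then attach each received stream to its own
media element and let the browser play them out simultaneously.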

On Thu, Sep 6, 2018 at 11:50 AM Silvia Pfeiffer <silviapfeiffer1@gmail.com>
wrote:

> Nice document!
>
> I have a concrete use case that's a little different:
> * do a screenshare of an application that has its own audio
> * add to that the audio from a music player (iTunes)
> * add to that the voice of the local person
> * send that to the other side
> * make sure not to also capture the rest of the system audio,
> particularly the voice of the far end
>
> We actually have such a use case right now from a PTSD psychologist who
> needs specific audio and a specific application to play together.
>
> Cheers,
> Silvia.
>
> On Thu, Sep 6, 2018 at 7:08 PM Henrik Boström <hbos@google.com> wrote:
> >
> > I listed some use cases and came up with a proposal, please take a look:
> > getDisplayMedia() with audio: excludeTabAudio constraint proposal
> >
> > On Thu, Sep 6, 2018 at 2:54 AM Roman Shpount <roman@telurix.com> wrote:
> >>
> >> On Wed, Sep 5, 2018 at 8:22 PM, Martin Thomson <martin.thomson@gmail.com> wrote:
> >>>
> >>> On Thu, Sep 6, 2018 at 10:11 AM Bernard Aboba
> >>>
> >>> There is probably some role for consent there, but I agree that this
> >>> should be as narrow as the selection.  To vary from that would require
> >>> some exceptional consent, in an area that is already a minefield.
> >>
> >>
> >> There is also an implementation minefield, since there is no one-to-one
> >> correlation between audio sources and windows. Audio is played by the
> >> process, and its relationship to a specific window is not always obvious.
> >> Even if you limit audio capture to a specific process, it is not trivial
> >> to capture it. There are no standard APIs for this. Audio capture from a
> >> process on most operating systems requires some sort of hack, like
> >> dynamic library injection. The most reliable method is to install a
> >> special driver that creates a virtual output device which maps to a
> >> virtual input device. Then the application needs to be configured to play
> >> into the virtual output driver, and the virtual input driver can be used
> >> as a WebRTC audio source. One example of such a setup is Virtual Audio
> >> Cable (https://en.wikipedia.org/wiki/Virtual_Audio_Cable). This means
> >> that for people who need this now, there is a workaround already
> >> available. Implementing something nice with an interface that is safe for
> >> obtaining consent and for use in a browser is probably not currently
> >> possible.
> >>
> >> Regards,
> >> _____________
> >> Roman Shpount
> >>
> >>
>

Received on Thursday, 6 September 2018 10:07:35 UTC