
Re: Web Audio WG feedback LC-3023 (Re: Media Capture and Streams Last Call review)

From: Chris Wilson <cwilso@google.com>
Date: Mon, 18 May 2015 12:47:52 -0700
Message-ID: <CAJK2wqUs1iCwPWFJ5GN+_BW=FtZQdBULHx+w9WSEONv9KX+GRQ@mail.gmail.com>
To: Joe Berkovitz <joe@noteflight.com>
Cc: "Hofmann, Bill" <bill.hofmann@dolby.com>, Audio Working Group <public-audio@w3.org>
Note it's not just "as a constraint", per se - I want to make sure we can
enumerate the channel counts (etc.)

On Mon, May 18, 2015 at 12:47 PM, Chris Wilson <cwilso@google.com> wrote:

> SGTM.
>
> On Mon, May 18, 2015 at 10:52 AM, Joe Berkovitz <joe@noteflight.com>
> wrote:
>
>> Thanks Bill and Chris for the additional thoughts.  I also think we need
>> a little more clarity about how enumerating devices with enumerateDevices()
>> is supposed to work.
>>
>> Look at the fine print (step #4) here:
>>
>> http://w3c.github.io/mediacapture-main/#dom-mediadevices-enumeratedevices
>>
>> It appears that if the user doesn't grant permission to *at least one
>> device* in the result returned by enumerateDevices() (which doesn't take a
>> query constraint argument), then the returned list doesn't include any
>> names for the devices -- they get censored by a filtering step. This is
>> presumably an anti-fingerprinting measure: there needs to be some UA/user
>> interaction before a site's scripts can get access to that list of devices.
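
A minimal sketch of how a page might detect the filtered case described above. It assumes that a censored entry carries an empty `label`; `DeviceInfoLike` and `labelsAreCensored` are hypothetical names for illustration, not part of any API.

```typescript
// Hypothetical, trimmed-down shape of one enumerateDevices() entry.
interface DeviceInfoLike {
  deviceId: string;
  kind: "audioinput" | "audiooutput" | "videoinput";
  label: string; // assumed to be empty when the UA has filtered it
}

// True when no entry exposes a human-readable name, which suggests the
// user has not yet granted access to any device. An empty list also
// counts as censored, since there is nothing to show the user.
function labelsAreCensored(devices: DeviceInfoLike[]): boolean {
  return devices.every((d) => d.label === "");
}
```

An app could use a check like this to decide whether it still needs to go through a getUserMedia() prompt before offering a device picker.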
>>
>> If this behavior is taken as given -- and I think it may be hard to argue
>> otherwise -- then it appears the only workable approach is this:
>>
>> 1. Call getUserMedia() with some constraints, to try to get permission to
>> at least one default output device. (Presumably the app supplies a set of
>> constraints that were tuned to choose a reasonable default.)
>>
>> 2. If permission is granted by the user, call enumerateDevices() to
>> obtain a full, user-readable list of device names for output devices.
>>
>> 3. In whatever device-choice UI is offered by the app, display the full
>> list of device names from step #2 to the user, defaulted to the device
>> chosen by step #1.
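
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `*Like` interfaces and `listOutputChoices` are hypothetical stand-ins so the code runs outside a browser, and whether getUserMedia() can hand back an output device at all is exactly what this thread is debating.

```typescript
// Hypothetical minimal shapes standing in for the browser objects.
interface OutputDeviceLike { deviceId: string; kind: string; label: string; }
interface TrackLike { getSettings(): { deviceId?: string }; }
interface StreamLike { getAudioTracks(): TrackLike[]; }
interface MediaDevicesLike {
  getUserMedia(constraints: object): Promise<StreamLike>;
  enumerateDevices(): Promise<OutputDeviceLike[]>;
}

interface DeviceChoices {
  devices: OutputDeviceLike[];      // full list for the picker UI
  defaultId: string | undefined;    // device pre-selected by step 1
}

async function listOutputChoices(
  media: MediaDevicesLike,
  constraints: object,
): Promise<DeviceChoices> {
  // Step 1: ask for one device; this is what triggers the permission prompt.
  const stream = await media.getUserMedia(constraints);
  const defaultId = stream.getAudioTracks()[0]?.getSettings().deviceId;

  // Step 2: with permission granted, the list should now carry labels.
  const all = await media.enumerateDevices();
  const devices = all.filter((d) => d.kind === "audiooutput");

  // Step 3: hand the UI the full list plus the pre-selected default.
  return { devices, defaultId };
}
```

If the user denies permission in step 1, the promise rejects and the app never reaches the enumeration step, which matches the filtering behavior described earlier.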
>>
>> I guess the good news here is that the constraints to getUserMedia() are
>> relatively powerful. Sample rate is already a constraint attribute and
>> latency is under consideration. If MCTF will include channel count (which
>> seems pretty uncontroversial, and fingerprinting is not really a
>> consideration here), perhaps that suffices to get us off the ground.
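
For illustration, a constraint set along these lines might be built as below. `AudioWish` and `buildAudioConstraints` are hypothetical helpers; `channelCount` and `latency` are written the way this thread proposes them and should be read as assumptions rather than agreed constraint names.

```typescript
// Hypothetical description of what an audio app would like from a device.
interface AudioWish {
  channelCount: number;
  sampleRate: number;
  maxLatencySeconds?: number;
}

// Builds a getUserMedia()-style constraint object. Using `ideal` rather
// than `exact` lets the UA fall back to the closest device instead of
// rejecting the request outright.
function buildAudioConstraints(wish: AudioWish): { audio: Record<string, object> } {
  const audio: Record<string, object> = {
    channelCount: { ideal: wish.channelCount },
    sampleRate: { ideal: wish.sampleRate },
  };
  if (wish.maxLatencySeconds !== undefined) {
    audio["latency"] = { max: wish.maxLatencySeconds };
  }
  return { audio };
}
```

A 5.1 game would then pass something like `buildAudioConstraints({ channelCount: 6, sampleRate: 48000 })` as the constraints argument.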
>>
>> I don't think it's too awful for the enumerated device list to be
>> relatively unconstrained in nature. As long as the default one is a
>> reasonable choice the user can do an OK job of picking an alternative given
>> the set of valid choices.
>>
>> So my proposal is to get back to the group and suggest the inclusion of
>> channel count as a constraint, if I don't hear objections.
>>
>> Thoughts?
>>
>>
>> ...Joe
>>
>>
>>
>> On Mon, May 18, 2015 at 12:14 PM, Chris Wilson <cwilso@google.com> wrote:
>>
>>> I think number of channels and sample rate are the most critical.  Next
>>> up would be latency and "binaural delivery" - aka "headphones" - as that
>>> can indicate that HRTF, etc are appropriate (although I'd point out that
>>> attribute can change without affecting the rest of the device, so maybe
>>> it's a separate mechanism?).  I think HDMI is a red herring.
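
A small sketch of what an app could do with a "headphones" signal if one existed. The `isHeadphones` flag is hypothetical (no such attribute is defined here); `"HRTF"` and `"equalpower"` are the two panning models a Web Audio PannerNode accepts, and `PannerLike` stands in for the real node so the code runs outside a browser.

```typescript
type PanModel = "HRTF" | "equalpower";

// Stand-in for a Web Audio PannerNode; only the field used here.
interface PannerLike { panningModel: PanModel; }

// Switches every panner to binaural rendering on headphones and to plain
// equal-power panning otherwise. Because the headphone state can change
// while the device stays the same, this is written to be re-run on change.
function applyOutputKind(panners: PannerLike[], isHeadphones: boolean): void {
  const model: PanModel = isHeadphones ? "HRTF" : "equalpower";
  for (const p of panners) {
    p.panningModel = model;
  }
}
```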
>>>
>>> On Sat, May 16, 2015 at 12:07 PM, Hofmann, Bill <bill.hofmann@dolby.com>
>>> wrote:
>>>
>>>>  Joe:
>>>>
>>>>
>>>>
>>>> Thanks for your notes on this.  When I think about use cases:
>>>>
>>>>
>>>>
>>>> 1. A user wants to connect their device (e.g., a digital media
>>>> adapter) to an AV Receiver so they can play a game and take advantage of
>>>> their surround system. DMAs are starting to also be game consoles now, many
>>>> in China and most recently NVIDIA’s new device.  No reason why they
>>>> shouldn’t support HTML games, and HTML is often the UI for these devices.
>>>>
>>>> 2. A user wants to play a game with a headset – knowing that the
>>>> device is connected to a headset jack at least would allow a game to do a
>>>> headphone render.
>>>>
>>>> 3. A user wants to watch a movie, and the HTML player wants to
>>>> adapt the audio properly based on the rendering device.
>>>>
>>>>
>>>>
>>>> It’s most likely, to me at least, that the user would choose the device
>>>> to render to, **though** you’d really want the default choice to be
>>>> the “best one”.  So that does suggest that at the very least, you should be
>>>> able to:
>>>>
>>>> ·         Determine the number of outputs (if == 1, the choice is easy :-))
>>>>
>>>> ·         Identify the type of output (speaker, headphone, HDMI)
>>>>
>>>> ·         Determine the number of channels
>>>>
>>>> without permission.
>>>>
>>>>
>>>>
>>>> Then, the first time (or if the configuration changes), the user would
>>>> be asked for permission to use the output device, and potentially be given
>>>> a list of choices beforehand based on the info above, which ought to be
>>>> enough.  It’s probably fine to get the rest of the characteristics later.
>>>> I don’t recall where getUserMedia ended up with respect to permissions –
>>>> it’d be deadly to have to configure each time you turn on your DMA or
>>>> launch a different app, but that doesn’t relate to this problem.
>>>>
>>>>
>>>>
>>>> I think the constraints approach is fine, but realize that people will
>>>> use that as a way of enumerating – if you ask for stereo-capable outputs,
>>>> for instance.  I don’t think you can count on always only getting one
>>>> output.  And agree on Chris’ concern.  The way you’d probably end up having
>>>> to code this if you wanted headphones but could deal with speakers (for
>>>> instance) would end up being a set of getUserMedia calls with constraints,
>>>> and taking the first.  Unless the constraint could be an OR.  I foresee a
>>>> need for guidance about the right way to code this sort of thing.
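
The "set of getUserMedia calls with constraints, and taking the first" pattern could look something like this. It is a sketch only: `Requester` and `firstSatisfied` are hypothetical names, the request function is injected so the code runs without a browser, and the `outputType` constraint in the usage note is invented for illustration.

```typescript
// Hypothetical requester: resolves with a device handle, or rejects when
// the constraints cannot be met (as getUserMedia() does).
type Requester<T> = (constraints: object) => Promise<T>;

// Tries each constraint set in order of preference and returns the first
// one that succeeds, emulating an OR across the alternatives.
async function firstSatisfied<T>(
  request: Requester<T>,
  alternatives: object[],
): Promise<T> {
  let lastError: unknown = new Error("no alternatives given");
  for (const constraints of alternatives) {
    try {
      return await request(constraints);
    } catch (err) {
      lastError = err; // remember why this one failed, then try the next
    }
  }
  throw lastError;
}
```

An app that prefers headphones but can live with speakers would call `firstSatisfied(request, [{ outputType: "headphones" }, { outputType: "speaker" }])`. One cost of this approach is that each attempt may surface its own permission prompt, which is part of why guidance would be needed.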
>>>>
>>>>
>>>>
>>>> -Bill
>>>>
>>>>
>>>>
>>>> *From:* Joe Berkovitz [mailto:joe@noteflight.com]
>>>> *Sent:* Friday, May 15, 2015 8:25 AM
>>>> *To:* Audio Working Group
>>>> *Subject:* Re: Web Audio WG feedback LC-3023 (Re: Media Capture and
>>>> Streams Last Call review)
>>>>
>>>>
>>>>
>>>> Before responding to Harald, I'd like to solicit some discussion within
>>>> the Audio WG. I think the most important questions here are:
>>>>
>>>>
>>>>
>>>> 1. If we want to be able to find out properties of devices in an
>>>> enumerated list without requesting device access from the user, then what
>>>> is the absolute "must have" set of properties for Web Audio to include in
>>>> enumerateDevices() results? The more we ask for, the less likely we will
>>>> get them -- and some may be more likely to generate long debates than
>>>> others, like HDMI.
>>>>
>>>>
>>>>
>>>> 2. Do we need to enumerate devices, or is it OK for us to use
>>>> getUserMedia() with constraints on these properties, and then pass the
>>>> deviceID of the returned mediaStream -- obtained with
>>>> mediastream.getCapabilities() --  that matches those constraints to an
>>>> AudioContext constructor? (As opposed to using
>>>> createMediaStreamDestination(mediaStream) which would have the various
>>>> sample rate issues raised by Chris).
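
Option 2 might be sketched as below. Two assumptions to flag: in the Media Capture spec, getSettings() and getCapabilities() are defined on the track rather than on the stream, so the sketch reads the deviceId from a track; and `createContext` is a hypothetical stand-in for the AudioContext constructor variant being proposed, which does not exist as written here.

```typescript
// Stand-in for a MediaStreamTrack; only the method used here.
interface SettingsTrackLike { getSettings(): { deviceId?: string }; }

// Reads the deviceId the UA settled on and hands it to whatever creates
// the audio context for that device.
function contextForTrack<C>(
  track: SettingsTrackLike,
  createContext: (deviceId: string) => C,
): C {
  const { deviceId } = track.getSettings();
  if (deviceId === undefined) {
    throw new Error("track did not report a deviceId");
  }
  return createContext(deviceId);
}
```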
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> ---------- Forwarded message ----------
>>>> From: *Harald Alvestrand* <harald@alvestrand.no>
>>>> Date: Fri, May 15, 2015 at 7:12 AM
>>>> Subject: Web Audio WG feedback LC-3023 (Re: Media Capture and Streams
>>>> Last Call review)
>>>> To: Joe Berkovitz <joe@noteflight.com>, Stefan Håkansson LK <
>>>> stefan.lk.hakansson@ericsson.com>, public-media-capture@w3.org, Audio
>>>> Working Group <public-audio@w3.org>
>>>>
>>>>
>>>> Hello, and thanks for your input!
>>>>
>>>> I'm seriously in two minds about this - on one hand, it seems like
>>>> functionality that is well worth having.
>>>>
>>>> On the other hand, it seems like a long list of things that could be of
>>>> interest here, and I can easily envision considerable time passing while
>>>> we discuss the details of each (for instance, if we expose the fact that
>>>> an output is HDMI, we also expose the fact that it's either crypto
>>>> capable or not crypto capable....)
>>>>
>>>> I think a lot of things can be addressed within the
>>>> capabilities/constraints/settings model we've adopted for getUserMedia -
>>>> one can define new constraints that get you the selectivity you want,
>>>> one can call getCapabilities() to figure out what kind of device one
>>>> has, one can use getSettings() to figure out what the current state of
>>>> play is. If so (and if the TF keeps the "registry" approach for
>>>> constraints), solving these problems can be as easy as authoring an
>>>> add-on document called "additional audio capabilities and constraints".
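
As a rough illustration of the capabilities side of that model, here is how an app might test a getCapabilities()-style result. `AudioCapsLike`, `inRange` and `canRender` are hypothetical names, and the sketch assumes numeric capabilities are reported as min/max ranges.

```typescript
// Assumed shape: each numeric capability is a range with optional bounds.
interface RangeLike { min?: number; max?: number; }
interface AudioCapsLike { channelCount?: RangeLike; sampleRate?: RangeLike; }

function inRange(range: RangeLike | undefined, value: number): boolean {
  if (range === undefined) return false; // capability not reported at all
  const min = range.min ?? -Infinity;
  const max = range.max ?? Infinity;
  return value >= min && value <= max;
}

// True when the device reports that it can handle both the requested
// channel count and the requested sample rate.
function canRender(caps: AudioCapsLike, channels: number, rate: number): boolean {
  return inRange(caps.channelCount, channels) && inRange(caps.sampleRate, rate);
}
```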
>>>>
>>>> But I'm not sure if that will cover all your needs, or if this is the
>>>> most elegant way of doing it - certainly some will make immediate note
>>>> that the constraints mechanism isn't what they consider elegant.
>>>>
>>>> What do you see as the best way forward here - aim to address this
>>>> later, or do we have parts of this problem that we *have* to address
>>>> now?
>>>>
>>>>             Harald
>>>>
>>>>
>>>> Den 21. april 2015 20:59, skrev Joe Berkovitz:
>>>> > Hello Stefan,
>>>> >
>>>> >
>>>> > Thank you for your recent solicitation of feedback on the Media
>>>> > Capture and Streams API, which I passed to the Web Audio Working
>>>> > Group.
>>>> >
>>>> >
>>>> > The Web Audio WG so far has identified one key item that we would like
>>>> > to see addressed. The MediaDeviceInfo result from enumerateDevices()
>>>> > (http://www.w3.org/TR/2015/WD-mediacapture-streams-20150414/#idl-def-MediaDeviceInfo)
>>>> > lacks information that is typically available in the underlying OS
>>>> > implementations that we think would be very helpful for
>>>> > implementations:
>>>> >
>>>> > ·         Channel count and configuration (Mono, Stereo, 5.1, 7.1,
>>>> > etc…)
>>>> >
>>>> > ·         Physical Output (Headphone, Speaker, HDMI, …)
>>>> >
>>>> > ·         Latency (this matters a lot for gaming -- it will be very
>>>> > low for on-board hardware, perhaps quite high for wireless audio
>>>> > bridging like Apple TV)
>>>> >
>>>> > ·         Output capabilities (bitstream passthrough vs PCM –
>>>> > relevant in digital media adapter cases (Chromecast, etc))
>>>> >
>>>> >
>>>> > It is perhaps sufficient from a user interface point of view to have a
>>>> > string to display, but for a program to be able to either adapt to the
>>>> > user selection or to guide and default the user selection, the above are
>>>> > pretty important characteristics, at least in some use cases. Many if
>>>> > not most of the host OSes that user agents run on expose these sorts of
>>>> > output device characteristics.
>>>> >
>>>> >
>>>> > Aside from the difficulty with enumerating devices, there is also
>>>> > perhaps a need to make it possible for applications to query the set of
>>>> > available devices with respect to the above
>>>> > characteristics. MediaTrackConstraints and MediaTrackSettings do not
>>>> > currently include constraint attributes that map to items in the above
>>>> > list. And even if they do, arriving at a practical goodness-of-fit
>>>> > metric that can be generalized across a spectrum of audio apps may be
>>>> > difficult.
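
To make the goodness-of-fit concern concrete, here is one possible metric, sketched under loud assumptions: `DeviceTraits`, `relativeDistance`, `fitScore` and `bestFit` are hypothetical, the two traits are weighted equally for no better reason than simplicity, and nothing here is taken from a spec.

```typescript
// Hypothetical per-device characteristics an app might compare.
interface DeviceTraits { channelCount: number; latencySeconds: number; }

// Relative distance: 0 when equal, approaching 1 as the values diverge
// (for non-negative inputs).
function relativeDistance(actual: number, ideal: number): number {
  if (actual === ideal) return 0;
  return Math.abs(actual - ideal) / Math.max(Math.abs(actual), Math.abs(ideal));
}

// Lower is better. The equal weighting of the two terms is arbitrary.
function fitScore(device: DeviceTraits, ideal: DeviceTraits): number {
  return (
    relativeDistance(device.channelCount, ideal.channelCount) +
    relativeDistance(device.latencySeconds, ideal.latencySeconds)
  );
}

// Returns the device with the lowest score, or undefined for an empty list.
function bestFit<T extends DeviceTraits>(devices: T[], ideal: DeviceTraits): T | undefined {
  let best: T | undefined;
  let bestScore = Infinity;
  for (const d of devices) {
    const s = fitScore(d, ideal);
    if (s < bestScore) {
      best = d;
      bestScore = s;
    }
  }
  return best;
}
```

The weighting is where this gets hard to generalize: a game may care far more about latency than a movie player does, so a single fixed formula is unlikely to suit every audio app.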
>>>> >
>>>> >
>>>> >
>>>> > The same concerns apply to the set of input devices.
>>>> >
>>>> >
>>>> >
>>>> > Please let us know if this issue makes sense to the group and can be
>>>> > addressed within the timeframe of the coming run-up to a Last Call WD.
>>>> > We'd be happy to arrange some sort of inter-WG call to try to make
>>>> > progress on this together.
>>>> >
>>>> >
>>>> > Thank you!
>>>> >
>>>> >
>>>> > Best regards,
>>>> >
>>>> >
>>>> > Joe Berkovitz
>>>> >
>>>> > co-chair Web Audio WG
>>>> >
>>>> >
>>>> > *Noteflight LLC*
>>>> > Boston, Mass.
>>>> > phone: +1 978 314 6271
>>>> > www.noteflight.com
>>>> >
>>>> > "Your music, everywhere"
>>>> >
>>>> > On Wed, Apr 15, 2015 at 1:31 AM, Stefan Håkansson LK
>>>> > <stefan.lk.hakansson@ericsson.com
>>>> > <mailto:stefan.lk.hakansson@ericsson.com>> wrote:
>>>> >
>>>> >     The WebRTC and Device APIs Working Groups request feedback on the Last
>>>> >     Call Working Draft of Media Capture and Streams, a JavaScript API that
>>>> >     enables access to cameras and microphones from Web browsers as well as
>>>> >     control of the use of the data generated (e.g. rendering what a camera
>>>> >     captures in an HTML video element):
>>>> >     http://www.w3.org/TR/2015/WD-mediacapture-streams-20150414/
>>>> >
>>>> >     The groups have identified the following other W3C Working Groups as
>>>> >     likely sources of feedback:
>>>> >
>>>> >     - HTML Working Group, especially the HTML Media Task Force, as our API
>>>> >     extends the HTMLMediaElement interface and defines a new type of media
>>>> >     input via MediaStream
>>>> >
>>>> >     - WebApps Working Group, especially on the overall usage of Web IDL and
>>>> >     the definition of error handling
>>>> >
>>>> >     - Audio Working Group, as the Web Audio API builds upon the MediaStream
>>>> >     interface
>>>> >
>>>> >     - WAI Protocol and Formats Working Group, especially on the impact of
>>>> >     the user consent dialog and the applicability of the indicators of
>>>> >     device usage in assistive tools
>>>> >
>>>> >     - Web and TV Interest Group, as the manipulation of media input can be
>>>> >     relevant to some of their use cases (e.g. glass to glass)
>>>> >
>>>> >     - Web App Security Working Group, especially on our links between
>>>> >     secured origins and persistent permissions, and our current policy with
>>>> >     regard to handling access to this "powerful feature"
>>>> >
>>>> >     - Web Security Interest Group, especially on our security considerations
>>>> >
>>>> >     - Privacy Interest Group, as access to camera and microphone has strong
>>>> >     privacy implications
>>>> >
>>>> >     - Technical Architecture Group, for an overall review of the API,
>>>> >     especially the introduction of the concept of an IANA registry-based
>>>> >     constraints system, the use of promises, and our handling of persistent
>>>> >     permissions
>>>> >
>>>> >     We naturally also welcome feedback from any other reviewers.
>>>> >
>>>> >     The end of last call review for this specification is set to May 15
>>>> >     2015; should that deadline prove difficult to meet, please get in touch
>>>> >     so that we can determine a new deadline for your group.
>>>> >
>>>> >     As indicated in the document, comments should be sent to the
>>>> >     public-media-capture@w3.org <mailto:public-media-capture@w3.org>
>>>> >     mailing list.
>>>> >
>>>> >     Thanks,
>>>> >
>>>> >     Frederick Hirsch, Device APIs Working Group Chair,
>>>> >     Harald Alvestrand and Stefan Hakansson, WebRTC Working Group Chairs and
>>>> >     Media Capture Task Force Chairs
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > .            .       .    .  . ...Joe
>>>> >
>>>> > *Joe Berkovitz*
>>>> > President
>>>> >
>>>> > *Noteflight LLC*
>>>> > Boston, Mass.
>>>> > phone: +1 978 314 6271
>>>> > www.noteflight.com
>>>> >
>>>> > "Your music, everywhere"
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> .            .       .    .  . ...Joe
>>>>
>>>>
>>>>
>>>> *Joe Berkovitz*
>>>>
>>>> President
>>>>
>>>>
>>>>
>>>> *Noteflight LLC*
>>>>
>>>> 49R Day Street / Somerville, MA 02144 / USA
>>>>
>>>> phone: +1 978 314 6271
>>>>
>>>> www.noteflight.com
>>>>
>>>> "Your music, everywhere"
>>>>
>>>
>>>
>>
>>
>> --
>> .            .       .    .  . ...Joe
>>
>> *Joe Berkovitz*
>> President
>>
>> *Noteflight LLC*
>> 49R Day Street / Somerville, MA 02144 / USA
>> phone: +1 978 314 6271
>> www.noteflight.com
>> "Your music, everywhere"
>>
>
>
Received on Monday, 18 May 2015 19:48:21 UTC
