Re: Output device enumeration from Justin Uberti on 2013-09-05 (public-media-capture@w3.org from September 2013)

From: Justin Uberti <juberti@google.com>
Date: Thu, 5 Sep 2013 15:45:08 -0700
To: Martin Thomson <martin.thomson@gmail.com>
Cc: "public-media-capture@w3.org" <public-media-capture@w3.org>
Message-ID: <CAOJ7v-2u4UwEN8rjxiwTMeQ2TJDsysJixuu-L=6t74QLwRczow@mail.gmail.com>

On Wed, Aug 28, 2013 at 8:47 AM, Martin Thomson <martin.thomson@gmail.com> wrote:

> Justin's proposal for this is perfectly sound, but - again - it raises
> the question about how much we want to leak in terms of fingerprinting
> surface and - under the assumption that this is "as little as
> possible" - what we plan to do to bridge the gap between unusable and
> fingerprinting extravaganza.
>
> (This is probably the wrong venue for this, so I'll forward as
> appropriate; at least sharing it here gets to some of the right
> audience.)
>
> For *input* device enumeration, we reached a place where you could
> obtain a list of abstract device identifiers.  This would, to all
> intents, limit the knowledge an application could gain to the number
> of devices, unless that application had previously been given access
> to device information and the device identifiers hadn't been reset.
> Applications are given access to more complete device information when
> they are given access to a device.  So that knowledge is good until
> the user resets the device identifiers, which share fate with cookies.
>
> I am going to propose that a similar scheme be used for output device
> enumeration.  This presents a challenge though, because there isn't
> necessarily an analogous consent event to getUserMedia() for output
> devices, which would enable the lifting of the kimono.  But it is
> possible that such an event is unnecessary in this particular case.
>
> So, here's the proposal.  <audio> and <video> tags have a readable and
> writable attribute that identifies the device that the media is being
> played to.  Then there's Justin's proposed getMediaSinks() interface
> for enumerating these identifiers.  And from a standardization
> perspective, that's it.  No permissions to access the content, and no
> extra features.
>
> The real problem that this doesn't solve is the attachment of
> user-friendly labels to devices (as much as the label my operating
> system gives to my headset is friendly).  And that's probably OK.  The
> browser might provide a way for users to select where media is
> directed, probably only for audio, and probably in the same place it
> would provide feedback about the origin (not web origin) of the
> content.
>
> A site can train itself on what devices to use in various ways and
> then remember user preferences.
>

I was with you up until this point - can you explain more about what you
had in mind here, and how the browser and application would interact to
accomplish this?

>
> I'm open to suggestions on what event might be considered a
> kimono-lifting event, such that the device information could be
> augmented with labels and other such niceties.  I remain opposed to
> doing the whole "ask the user" thing all over.  We know the quality of
> the consent is poor, and it further trains users to click through
> these things.
>

For communication apps, the getUserMedia() kimono lift should be
sufficient; it would be uncommon to set audio output separately from audio
input (in fact, we should probably surface information about which output
devices are paired with input devices, so switching to a USB headset
requires only one click, not two).
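
As a rough sketch of what that pairing could look like from the
application side: assume each enumerated device carries an opaque id plus
some hint that ties an input to the output on the same physical unit. The
`groupId` field, the `DeviceEntry` type, and both helper functions below
are hypothetical names used only for illustration; nothing in this thread
defines them.

```typescript
// Hypothetical shapes: `DeviceEntry` and `groupId` are assumptions made for
// this sketch, standing in for whatever pairing hint a browser might expose.
interface DeviceEntry {
  id: string;               // opaque identifier, as in the input scheme
  kind: "input" | "output";
  groupId?: string;         // assumed: shared by devices on one physical unit
}

// Return the output device paired with the given input, if there is one.
function pairedOutput(
  input: DeviceEntry,
  devices: DeviceEntry[],
): DeviceEntry | undefined {
  if (input.groupId === undefined) return undefined;
  return devices.find(
    (d) => d.kind === "output" && d.groupId === input.groupId,
  );
}

// One selection: choosing an input also yields the output to route audio to.
function selectByInput(
  inputId: string,
  devices: DeviceEntry[],
): { input: DeviceEntry; output: DeviceEntry | undefined } | undefined {
  const input = devices.find((d) => d.kind === "input" && d.id === inputId);
  if (input === undefined) return undefined;
  return { input, output: pairedOutput(input, devices) };
}

// Made-up device list for illustration only.
const exampleDevices: DeviceEntry[] = [
  { id: "in-1", kind: "input", groupId: "usb-headset" },
  { id: "out-1", kind: "output", groupId: "usb-headset" },
  { id: "in-2", kind: "input" },
  { id: "out-2", kind: "output", groupId: "built-in" },
];

const choice = selectByInput("in-1", exampleDevices);
```

With the made-up list above, selecting the headset microphone also returns
the headset output, while an input with no pairing hint leaves the output
undefined, so the application would keep whatever output is already in use.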

For advanced digital audio apps, we don't have the same escape hatch.
However, I am inclined to punt on this particular problem for now. It might
turn out that output devices by themselves don't add a significant amount
of fingerprinting information.

Received on Thursday, 5 September 2013 22:45:55 UTC