Re: Do we need capabilities? from Neil Stratford on 2012-01-24 (public-webrtc@w3.org from January 2012)

From: Neil Stratford <nstratford@voxeo.com>
Date: Tue, 24 Jan 2012 09:18:59 +0000
To: public-webrtc@w3.org
Message-ID: <4F1E7783.5070306@voxeo.com>
On 24/01/2012 03:04, Anant Narayanan wrote:
> (Starting as a separate thread, to document objections to
> getCapabilities)

> 1. I think we can all agree that exposing capabilities without user
> consent of any form is not what we really want. If the current
> getCapabilities() is able to be invoked by any web page without any
> indication to the user, it is a massive privacy invasion. Ad services
> will then be able to add more bits of reliable information in order
> to personally identify visitors (they already know too much!).

I agree: getCapabilities() does require user approval, which could also 
be used to pre-approve access for a later getUserMedia() request.

> 2. Assuming we modify getCapabilities() to be a "trusted" call, i.e.
> requires user consent before returning, I do not think we will be
> able to satisfy application requirements. The primary reason for this
> is that the getCapabilities() call is done separately from
> getUserMedia(), and in the interim there might be changes to user
> hardware. Thus the application is not guaranteed to get what it
> wanted from the list of capabilities, anyway. This implies that UI
> affordances in applications *cannot* be made solely on the basis of a
> response from getCapabilities(), the application risks providing
> functionality that may not be present at the time the user actually
> initiates action.

Changes to hardware can equally happen after a call to getUserMedia() 
but before the session is actually established. A user could connect or 
disconnect a webcam at any point. However, I don't see this as a problem 
- especially if we can provide some kind of callback notification 
mechanism on any such changes that enables the UI to be updated. If I 
plug in a webcam after loading a web page I expect the UI to be updated 
to reflect that I can now make video calls.
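
To make the notification idea concrete, here is a rough sketch of what I have in mind. None of these names (DeviceKind, CallUiState, DeviceWatcher, uiStateFor) come from any draft; they are placeholders, and the hardware change is simulated by calling setDevices() by hand. The rule that a video call needs both a microphone and a camera is also just an assumption for illustration.

```typescript
// Hypothetical shapes for illustration only; not taken from any WebRTC draft.
type DeviceKind = "audioinput" | "videoinput";

interface CallUiState {
  canCallAudio: boolean; // show the call button at all
  canCallVideo: boolean; // show the video call option
}

// Derive what the page should offer from the devices currently attached.
// Assumption: any call needs a microphone, a video call also needs a camera.
function uiStateFor(kinds: readonly DeviceKind[]): CallUiState {
  const hasMic = kinds.some((k) => k === "audioinput");
  const hasCam = kinds.some((k) => k === "videoinput");
  return { canCallAudio: hasMic, canCallVideo: hasMic && hasCam };
}

// Minimal stand-in for a browser-side change notification mechanism.
class DeviceWatcher {
  private listeners: Array<(state: CallUiState) => void> = [];

  // Register a callback that runs whenever the device set changes.
  onChange(listener: (state: CallUiState) => void): void {
    this.listeners.push(listener);
  }

  // Simulates the user plugging in or removing hardware.
  setDevices(kinds: readonly DeviceKind[]): void {
    const state = uiStateFor(kinds);
    for (const listener of this.listeners) {
      listener(state);
    }
  }
}
```

The application would register one listener and redraw its call buttons from the state it receives, so a webcam plugged in after page load simply triggers another callback.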

> 4. The primary reason that a 'hints' based approach does not reveal
> as many bits as a capabilities based approach is that the result from
> getUserMedia() given a static set of hints is not guaranteed to be
> the same. It is temporal, and thus more unreliable than
> getCapabilities() -- which will always return the same value for a
> given hardware configuration. In a large number of cases, it is
> possible that getUserMedia will always return the same kind of
> stream, but I don't believe that justifies the need for
> getCapabilities().

If I provide a hint of "best possible", won't the resulting SDP payload 
(containing full codec list and associated parameters) leak far more 
information than a list of high-level profiles from getCapabilities()?

There are things that users expect that would not be possible without 
capabilities - for example rich presence displaying camera icons next to 
contacts who are available for video calls, or hiding the ability to 
call entirely if no microphone is available. I find it difficult to 
imagine a good UI that has no information about what may or may not be 
possible before an actual call is attempted.

My concern is primarily providing a good end user experience that 
doesn't require hacks like attempting dummy call setups to retrieve 
capability information - which is likely what developers will resort to 
if we don't provide such an API.

Neil
Received on Tuesday, 24 January 2012 09:19:54 UTC