Re: Fwd: draft proposal : active multiple devices for getUserMedia from Adam Bergkvist on 2012-04-17 (public-media-capture@w3.org from April 2012)

From: Adam Bergkvist <adam.bergkvist@ericsson.com>
Date: Tue, 17 Apr 2012 14:25:01 +0200
To: "Timothy B. Terriberry" <tterriberry@mozilla.com>
CC: "public-media-capture@w3.org" <public-media-capture@w3.org>
Message-ID: <4F8D611D.5060104@ericsson.com>

On 2012-04-17 11:11, Timothy B. Terriberry wrote:
> Adam Bergkvist wrote:
>>> And how do you see the user interaction dialog looking?
>>
>> If getUserMedia() is supposed to return several MediaStream objects then
>> the UI will be responsible for grouping the different devices into
>> MediaStreams which will add extra complexity to the UI. Therefore I'm
>
> I think the complexity comes from just making the user select multiple
> media sources of the same type at once, not the grouping or anything
> else. Consider the case of a user that wants to provide a fake camera
> (still image/video file loop) for one of the devices. What does getting
> all the cameras mean in that case? How do they know how many the web
> page would like to have vs. needs to have in order to work, etc.? Much
> less which order the tracks/streams should appear in.
>
> This gets particularly bad if the browser chrome always allows the user
> to add these extra streams (and it should, to allow users to use a
> website even if they don't have the requisite hardware), since they can
> be fed to webpages that aren't expecting it (the webpage can't tell the
> browser they aren't expecting it), and suddenly a page which feeds the
> LocalMediaStream directly to a PeerConnection is now sending multiple
> enabled video tracks, when I suspect many applications will only be
> designed to handle one.
>
> On the other hand, if we require multiple getUserMedia() calls, the page
> can simply have an "Open Front Camera" button and "Open Rear Camera"
> button, and things are relatively clear. The user is told what they're
> doing when they initiate the action, and the browser chrome UI only has
> to worry about selecting one video device, instead of allowing the user
> to add some arbitrary, unknown number of them for some arbitrary but
> unknown purposes. The JS doesn't get multiple tracks of the same type
> unless it specifically asks for and builds up the MediaStreams itself,
> and thus doesn't have to handle such complexity if it doesn't need to.
> The one wrinkle is how to handle audio, which should only be returned by
> one of those two getUserMedia() calls in the front/back case. But even
> there, I think just asking for it with the front camera is not too
> confusing or limiting.
>

What you're saying makes a lot of sense, and it keeps things simple for
the majority of cases. I would be OK with such a limitation.
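
As a rough sketch of the one-call-per-button approach quoted above, the
page could map each button to its own request and ask for audio only
together with the front camera. Everything named here (CameraButton,
SimpleConstraints, constraintsFor, openCamera, and the injected `request`
wrapper) is hypothetical and only for illustration. The { audio, video }
object is an assumed constraint shape, not a statement of what the draft
specifies, and picking the actual device is still left to the browser UI.

```typescript
// Hypothetical names throughout; this is a sketch, not the draft API.
type CameraButton = "front" | "rear";

// Assumed shape of a simple request: which kinds of media to ask for.
interface SimpleConstraints {
  audio: boolean;
  video: boolean;
}

// One request per button. Audio is requested only with the front camera,
// so the page never ends up with two audio sources.
function constraintsFor(button: CameraButton): SimpleConstraints {
  return { audio: button === "front", video: true };
}

// `request` stands in for whatever getUserMedia() wrapper the page uses.
// It is injected so the sketch does not depend on a particular browser API.
async function openCamera<S>(
  button: CameraButton,
  request: (constraints: SimpleConstraints) => Promise<S>,
): Promise<S> {
  return request(constraintsFor(button));
}
```

Each call asks for at most one video source, so the page only sees
several video tracks if it combines the results of both calls itself.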

I think the following scenario would be manageable as well (not entirely 
supported at the moment):
* The app instructs getUserMedia() that it wants one audio device and 
two video devices.
* A single LocalMediaStream with three tracks is delivered to the app.
* Before starting the conversation, the app presents a dialog with 
self-views of both videos and asks the user to correctly label the front 
and the back cameras.
* The conversation starts.
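
The labeling step in the scenario above could look roughly like the
following sketch. It assumes the app already holds the three tracks and
that the self-view dialog returns which of the two video tracks (0 or 1,
in delivery order) the user marked as the front camera. TrackLike,
LabeledTracks, and assignCameraRoles are hypothetical names, and the track
shape is reduced to the two fields used here.

```typescript
// Hypothetical minimal track shape; only `kind` and `label` are used.
interface TrackLike {
  kind: "audio" | "video";
  label: string;
}

interface LabeledTracks<T extends TrackLike> {
  audio: T;
  front: T;
  back: T;
}

// Splits a three-track stream (one audio, two video) into named roles.
// `frontIndex` is the user's answer from the self-view dialog: which of
// the two video tracks, in delivery order, shows the front camera.
function assignCameraRoles<T extends TrackLike>(
  tracks: T[],
  frontIndex: 0 | 1,
): LabeledTracks<T> {
  const audio = tracks.filter((t) => t.kind === "audio");
  const video = tracks.filter((t) => t.kind === "video");
  if (audio.length !== 1 || video.length !== 2) {
    throw new Error(
      `expected 1 audio and 2 video tracks, got ${audio.length} and ${video.length}`,
    );
  }
  return {
    audio: audio[0],
    front: video[frontIndex],
    back: video[1 - frontIndex],
  };
}
```

The function rejects any other track count, since the scenario only
covers the one-audio, two-video case.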

/Adam

Received on Tuesday, 17 April 2012 12:25:31 UTC