Re: Clarification on media capture split between WebRTC and DAP from Rob Manson on 2011-08-17 (public-webrtc@w3.org from August 2011)

From: Rob Manson <roBman@mob-labs.com>
Date: Wed, 17 Aug 2011 22:35:47 +1000
To: Harald Alvestrand <harald@alvestrand.no>, "public-webrtc@w3.org" <public-webrtc@w3.org>
Cc: Rich Tibbett <richt@opera.com>
Message-ID: <1313584547.6734.57966.camel@robslapu>

I could really see how the scope of this goal could have created
overwhelming complexity... but Rich's solution does seem to kill quite a
few birds with one stone.  So I think I'd like to flip my position back
again 8)


> > 2. Require the web app to assign and display the tainted Stream object 
> > in a <video> element in order for the user to authorize its allowed 
> > capabilities (in Step 4 below).
> Is this synonymous with "let only a Video element have the capability of 
> untainting a stream"?

That's covered in point 4 below, isn't it?


> > 3. Let the web app provide a hint to the Stream object (or <video> 
> > element) for the type of access to the camera it actually requires - a 
> > list of one or more of the following tokens: 'camera', 'camcorder', 
> > 'streaming', 'telephony'.
> >
> > 4. Let the UA present the following <video>-overlaid stream control 
> > buttons to the user depending on the access hint(s) registered above:
> >
> >   - a 'camera' button > to take a still image capture to generate an 
> > image file that the web app can register a callback to receive.
> >   - a 'camcorder' button > to take a video capture to generate a video 
> > file that the web app can register a callback to receive.
> >   - a 'streaming' button > to allow the current web page to un-taint 
> > the Stream object and allow the web app to access that streaming data 
> > (e.g. via a <canvas> element or an Audio API). Either ON or OFF 
> > (default).
> >   - a 'telephony' button > to allow the current web page to then 
> > assign the Stream object to the P2P communication API without throwing 
> > e.g. a Security Violation error. Either ON or OFF (default).
> >
> > 5. On user click of any of the stream control buttons presented, 
> > enable the inferred functionality, fire a callback to the web app and 
> > let it continue about its intended business.
> Doesn't this mean that we're back to "click a button to allow access 
> every time you call"?

Well, it allows the user to see/hear a preview before they give
approval and click start.  So it is an extra click, but it's not a
redundant one.  This could really be an improvement in the overall UX.
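
To make the flow in steps 3 to 5 concrete, here is a rough TypeScript
sketch of the state it seems to imply.  Nothing here is a real browser
API: `TaintedStream`, `AccessHint`, `userClicks` and `assertUsable` are
all hypothetical names.  It assumes a stream starts tainted and that only
a user click on a UA-drawn button changes that, and it simplifies by
treating the one-shot 'camera' and 'camcorder' buttons the same way as
the ON/OFF ones.

```typescript
// Hypothetical model of the proposal; none of these names exist in any browser.
type AccessHint = "camera" | "camcorder" | "streaming" | "telephony";

class TaintedStream {
  // Hints the web app registered (step 3).
  private readonly hints: Set<AccessHint>;
  // Capabilities the user has switched on by clicking a UA button (step 5).
  private readonly granted = new Set<AccessHint>();
  private readonly listeners: Array<(hint: AccessHint) => void> = [];

  constructor(hints: AccessHint[]) {
    this.hints = new Set(hints);
  }

  // Buttons the UA would overlay on the <video> element (step 4).
  buttons(): AccessHint[] {
    return Array.from(this.hints);
  }

  // The web app registers a callback to learn about user clicks.
  onAuthorize(cb: (hint: AccessHint) => void): void {
    this.listeners.push(cb);
  }

  // Called by the UA, never by page script, when the user clicks a button.
  userClicks(hint: AccessHint): void {
    if (!this.hints.has(hint)) {
      throw new Error(`no '${hint}' button was presented`);
    }
    this.granted.add(hint);
    this.listeners.forEach((cb) => cb(hint));
  }

  // Guard the UA would apply before canvas readback or use by the P2P API.
  assertUsable(hint: AccessHint): void {
    if (!this.granted.has(hint)) {
      throw new Error(`SecurityError: stream is still tainted for '${hint}'`);
    }
  }
}

// Usage: the app asks for streaming + telephony; the user only clicks 'streaming'.
const stream = new TaintedStream(["streaming", "telephony"]);
const seen: AccessHint[] = [];
stream.onAuthorize((hint) => seen.push(hint));
stream.userClicks("streaming");
```

In this model the page can preview the stream in the <video> element
straight away, but anything beyond that fails until the matching button
has been clicked.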


> I kind of like the idea that we use a <video> element with 
> browser-controlled controls on it for the authorization step (presumably 
> the app can hide that <video> element after having obtained the 
> authorization, if it so desires), but I'm fundamentally worried about 
> "extra click every time you call".

Really, an app having the ability to stream video without the user
seeing a preview is worse in many ways.

And of course it could work in a similar way to the current geolocation
API permissions, where the UA can remember the user's choice, so that
the click is not required EVERY time.
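
A sketch of how that "remember my choice" behaviour might look, assuming
the UA keys stored grants by origin and capability.  `GrantStore`,
`remember`, `revoke` and `needsClick` are hypothetical names for
illustration only; they are not part of the geolocation API or any other
spec.

```typescript
// Hypothetical UA-side store; not a real browser API.
type Capability = "camera" | "camcorder" | "streaming" | "telephony";

class GrantStore {
  // Keys look like "https://example.org|telephony".
  private readonly remembered = new Set<string>();

  private key(origin: string, cap: Capability): string {
    return `${origin}|${cap}`;
  }

  // Record that the user chose to have this grant remembered.
  remember(origin: string, cap: Capability): void {
    this.remembered.add(this.key(origin, cap));
  }

  // Let the user withdraw a remembered grant later.
  revoke(origin: string, cap: Capability): void {
    this.remembered.delete(this.key(origin, cap));
  }

  // True when the UA still has to show the button and wait for a click.
  needsClick(origin: string, cap: Capability): boolean {
    return !this.remembered.has(this.key(origin, cap));
  }
}

const store = new GrantStore();
store.remember("https://example.org", "telephony");
```

Grants are per origin and per capability here, so remembering
'telephony' for one site says nothing about 'streaming' on that site or
about any other site.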


> > This approach immediately lets the user see that the page is ready to 
> > do something with their camera without requiring an up-front 
> > prompt/access authorization. It lets the user get comfortable with the 
> > fact that the camera is on rather than it happening all at once. It 
> > lets them adjust their hair before they go live. It presents 
> > UA-controlled buttons within a <video> element for the user to 
> > authorize (or not) the requested, targeted usage when they are ready.
> >
> > FWIW, this discussion is entirely orthogonal to the P2P aspects 
> > ('streaming' in this email refers to streaming the video to the web 
> > page, or, untainting the provided Stream object. 'telephony' refers to 
> > the p2p communication stuff).
> >
> > I can understand the push-back but I'd be interested for webrtc to 
> > explore such an approach a little further. Accessing the camera 
> > without streaming the results to a remote server has 
> > wide applicability, e.g. AR guides, bar-code scanning web apps, 
> > camera/snapshot/fx web apps, on-demand a/v recording uploads, personal 
> > introductions, etc.
> It does, and if we can accept "click every time you <scan, photograph, 
> videotape, .....>" in all those contexts, this might be a general mechanism.

+1 to further exploration of this model.


roBman
Received on Wednesday, 17 August 2011 12:36:23 UTC