Re: PROPOSAL: Simple image capture API

Anant Narayanan wrote:
> What
> --
> We propose a simple addition to getUserMedia to support the simplest use-case of capturing an image from JavaScript, like so:
> 	
> 	navigator.getUserMedia({'picture': true}, function(Blob blob) { … });
>
> Why
> --
> There already exists a simple declarative API based on <input> tag specified by the DAP working group (at http://www.w3.org/TR/2011/WD-html-media-capture-20110414/). It works by adding markup like:
>
> 	<input type="file" accept="image/*" id="capture">
>
> which, on user action, results in the ability to either select an image from local disk or the webcam.  We have found the need to expose the same functionality, but as an API (programmatic approach vs. the declarative one that has already been specified). Rather than coming up with a new API, we propose that we extend getUserMedia to capture this use-case as well.
>
> The proposed API also fulfills use case 2.1 "check out my new hat", as specified in the scenarios document (http://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/scenarios.html).
>

We've run into the issue of having to figure out when and how to invoke 
(auto-)flash and (auto-)focus features on the webcam for the purposes 
of taking image snapshots via getUserMedia. The use case we had in mind 
is one of reading QR codes via the webcam:

http://shinydemos.com/qr-code/

Right now we aren't consistently getting a high enough resolution 
to extract the QR code from the image, and there's currently no way 
to trigger the flash in low-light conditions. These are practical 
problems that we'd like to solve in the short term.

> How
> --
> Invoking navigator.getUserMedia({'picture': true}); will *not* result in a permission prompt (unlike when either 'audio' or 'video' are set to true). Instead, browser chrome will be presented to the user that allows them to preview local devices and snap a picture (or select a local image file). User action on the preview window is implied consent.

It seems that if you could synthesize click events on the input element 
from the declarative approach presented above, then you would get this 
functionality programmatically without any additional API footprint, 
e.g. input.click().
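
A minimal sketch of that idea, assuming the browser lets script open the 
file/capture UI through a synthesized click (typically only from inside a 
user-initiated event handler). The helper name requestImage and its 
parameters are hypothetical, not part of any spec:

```javascript
// Hypothetical helper (not from any spec): creates the declarative
// <input type="file" accept="image/*"> element in script and clicks it,
// so the browser's own chrome handles selection or capture.
// `doc` is the page's document; `onImage` receives the chosen File.
function requestImage(doc, onImage) {
  var input = doc.createElement('input');
  input.type = 'file';
  input.accept = 'image/*';

  // Fires once the user has picked or captured an image in browser chrome.
  input.addEventListener('change', function () {
    if (input.files && input.files.length > 0) {
      onImage(input.files[0]);
    }
  });

  // Assumption: the browser honors this synthesized click. Many browsers
  // ignore it unless it happens during a real user gesture.
  input.click();
  return input;
}
```

In a page this might be called from a button's click handler, e.g. 
requestImage(document, function (file) { /* use the File */ });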

I was hoping for something subtly different: more along the lines of 
extracting a still image from an already authorized MediaStream object. 
By default we'd have auto-flash and auto-focus enabled when a page 
called a getSnapshot (or some such API) method, but let web developers 
override those settings at run time if they so wish.
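
To make the shape of that concrete, here is a rough sketch under stated 
assumptions. The option names autoFlash and autoFocus, and the helpers 
snapshotOptions and grabFrame, are hypothetical. grabFrame shows the 
canvas workaround for pulling a still out of a <video> that is already 
playing the stream; it cannot control flash or focus, which is the gap 
described above:

```javascript
// Hypothetical defaults for a getSnapshot-style call: both features on.
var SNAPSHOT_DEFAULTS = { autoFlash: true, autoFocus: true };

// Merge developer overrides over the defaults without mutating either.
function snapshotOptions(overrides) {
  var options = {};
  Object.keys(SNAPSHOT_DEFAULTS).forEach(function (key) {
    options[key] = SNAPSHOT_DEFAULTS[key];
  });
  Object.keys(overrides || {}).forEach(function (key) {
    options[key] = overrides[key];
  });
  return options;
}

// Workaround available without a new API: copy the current frame of a
// <video> element (already showing the authorized stream) into a canvas.
// The canvas is resized to the video's intrinsic dimensions first.
function grabFrame(video, canvas) {
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  canvas.getContext('2d').drawImage(video, 0, 0, canvas.width, canvas.height);
  return canvas;
}
```

A page that wanted to suppress the flash would then pass something like 
snapshotOptions({ autoFlash: false }) to the snapshot call, leaving 
auto-focus at its default.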

Anything that we can defer to the browser's chrome can be done through 
the declarative approach from DAP, and clicking that input button 
implies consent from the user. It is an implicit permissioning approach 
without cumbersome permission dialogs, which is an excellent design IMO.

>
> It is already possible to take a picture with getUserMedia({'video':true}…), for example: https://gist.github.com/1852070 (real working code that runs on the Opera Labs version). However, we would like to avoid a dependency on MediaStreams as well as permission prompts for this simple use case and first implementation at Mozilla. I think this is a great starting point.
>
> The key difference from the existing getUserMedia spec is that the success callback contains a Blob object if 'picture' was set to true. Blob is fully specified in http://www.w3.org/TR/FileAPI/.
>
> This current proposal uses a callback approach but can be updated as necessary when we have consensus on events (a proposal which I will send out separately).
>
> I look forward to your feedback!

Thanks for getting the ball rolling on this :)

- Rich

Received on Monday, 12 March 2012 11:15:13 UTC