Re: Camera/Capture API (was: DAP Roadmap, priorities) from Robin Berjon on 2009-11-18 (public-device-apis@w3.org from November 2009)

From: Robin Berjon <robin@robineko.com>
Date: Wed, 18 Nov 2009 15:26:39 +0100
To: Dominique Hazael-Massieux <dom@w3.org>
Cc: Ingmar.Kliche@telekom.de, public-device-apis@w3.org
Message-Id: <136676D5-8074-4D99-8C22-7EF75F6DED33@robineko.com>

On Nov 12, 2009, at 14:00 , Dominique Hazael-Massieux wrote:
> If you could do that, it would certainly help getting the ball rolling
> and would make it more likely we could adopt it as one of the “priority
> APIs”.

Going back over this thread, this doesn't seem to have been captured as an action item. Do we need to add it?

> In terms of the security around it, I think there is a pretty large
> agreement that access to the API should be granted through the user
> interaction with the device (e.g. hitting the shutter button for a
> camera, including the camera’s visor as part of the requesting page,
> etc.), so it would be nice if the draft API took that idea in
> consideration, and if this was described (probably informatively) in a
> “security considerations” section.
> 
> A few people have suggested that the API be made available through a
> refinement of <input type='file' > with a type attribute set to an
> audio, image, or video mime type or set of mime types; maybe something
> to take into consideration as well. In any case, the interactions with
> the <video>, <audio> and <img> elements will need to be considered.

I would like to explore all of these approaches:

1) <input> based

  - user clicks on <input type='file' accept='image/*'> and is offered (amongst other options) the choice to take a picture

  - 1.a) a system viewfinder is used to capture the image, which is then returned to the page just like any other file (I believe this is more or less what the iPhone does). The File API can be used to manipulate the data for thumbnails, compositing, etc.

  - 1.b) a camera input is selected from a system dialog, which fires an event on the <input> element immediately. The event exposes an opaque URL that can be set as the source of a <video> element so that users can use that as a viewfinder. It further has some metadata about the camera's abilities, and some rudimentary controls (e.g. zooming).
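
As a rough illustration of the first step of (1), here is how a user agent might decide which capture options to offer next to the usual file picker, based on the accept attribute. This is only a sketch: `captureKindsFor` and `CaptureKind` are names made up for this example, and the handling of an empty accept value (offer everything) is an assumption, not something any spec says.

```typescript
// Hypothetical helper: none of these names come from a spec or a browser.
type CaptureKind = "image" | "video" | "audio";

const ALL_KINDS: CaptureKind[] = ["image", "video", "audio"];

/**
 * Given the value of an accept attribute (e.g. "image/*, video/mp4"),
 * return the kinds of capture the file dialog could offer.
 * Assumption: an empty accept value places no restriction, so every kind is offered.
 * File-extension entries such as ".jpg" are ignored in this sketch.
 */
function captureKindsFor(accept: string): CaptureKind[] {
  const patterns = accept
    .split(",")
    .map((p) => p.trim().toLowerCase())
    .filter((p) => p.length > 0);

  if (patterns.length === 0) {
    return [...ALL_KINDS];
  }

  // A kind is offered if any MIME pattern has it as its top-level type.
  return ALL_KINDS.filter((kind) =>
    patterns.some((p) => p.includes("/") && p.split("/")[0] === kind)
  );
}
```

With accept='image/*' only the "take a picture" option would be offered; with 'application/pdf' no capture option would show up at all.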


2) application based

  - the application initiates an asynchronous request to have access to the camera. A non-intrusive, non-modal message appears (of the type that has become familiar for Geolocation, popups, local storage...). The user can choose to grant access or not in a way that doesn't force her hand.

  - we then have 2.a and 2.b modelled on the previous option, except that the result is delivered as a callback parameter instead of through an event.
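
To make (2) a little more concrete, here is a small model of the request/grant flow, in the spirit of Geolocation's success and error callbacks. Everything in it is hypothetical: `requestCamera`, `PendingCameraRequest` and `CameraHandle` are names invented for this sketch, and `grant`/`deny` stand in for what the browser's non-modal prompt would do when the user decides.

```typescript
// Hypothetical model of option (2); not an existing or proposed API.
interface CameraHandle {
  viewfinderUrl: string; // opaque URL usable as a <video> source (the 2.b case)
  maxZoom: number;       // example of camera metadata
}

type Granted = (camera: CameraHandle) => void;
type Denied = (reason: string) => void;

/** State held by the browser while the non-modal prompt is showing. */
class PendingCameraRequest {
  private settled = false;
  private onGranted: Granted;
  private onDenied: Denied;

  constructor(onGranted: Granted, onDenied: Denied) {
    this.onGranted = onGranted;
    this.onDenied = onDenied;
  }

  /** Called by the browser chrome when the user grants access. */
  grant(camera: CameraHandle): void {
    if (this.settled) return; // only the first decision counts
    this.settled = true;
    this.onGranted(camera);
  }

  /** Called by the browser chrome when the user declines. */
  deny(): void {
    if (this.settled) return;
    this.settled = true;
    this.onDenied("user denied access");
  }
}

/** Page-side entry point: returns immediately, callbacks fire once the user decides. */
function requestCamera(onGranted: Granted, onDenied: Denied): PendingCameraRequest {
  return new PendingCameraRequest(onGranted, onDenied);
}
```

The point of the sketch is that the call returns straight away and nothing happens until the user acts, so the page keeps working while the prompt is ignored.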

The reason I'm interested in looking at (2) is because I'm concerned that we may be trying to pile too much onto <input> — we could, after all, have had <input type='location'>. If the geolocation model works, we should build on it. This doesn't prevent the <input>-based access — in fact for the capture API I like it very much because it makes a lot of sense semantically here. We can do both — so long as we build on existing models I feel rather safe that we're not doing weird stuff and that we're benefitting from improvements that can be made to other parts of the stack that rely on these approaches.

In both of these cases the user needs to ALWAYS have a way to turn the camera off. Not a huge hurdle to implement or specify, but well worth indicating.
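
A minimal sketch of what that off switch could look like, assuming the page only ever sees the camera through a session object. `CameraSession`, `userTurnOff` and `onStopped` are invented names for illustration; the idea is simply that the switch lives on the browser side and the page can observe it but not veto it.

```typescript
// Hypothetical session object; names are illustrative only.
class CameraSession {
  private active = true;
  private url: string;
  private listeners: Array<() => void> = [];

  constructor(url: string) {
    this.url = url;
  }

  /** Page-side: the viewfinder URL, or null once the camera has been turned off. */
  get viewfinderUrl(): string | null {
    return this.active ? this.url : null;
  }

  /** Page-side: be told when the camera goes away, e.g. to update the UI. */
  onStopped(listener: () => void): void {
    this.listeners.push(listener);
  }

  /** User-side: wired to the browser chrome, so the page cannot block it. */
  userTurnOff(): void {
    if (!this.active) return;
    this.active = false; // access is revoked before the page is notified
    for (const listener of this.listeners) {
      listener();
    }
  }
}
```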

--
Robin Berjon
  robineko — hired gun, higher standards
  http://robineko.com/

Received on Wednesday, 18 November 2009 14:27:14 UTC