Distinguishing video and audio streams to enable fine-grained feature control

Hi,

During the last few months we've received a number of requests from 
developers to provide more granular control over camera streams provided 
via getUserMedia. Such requests have centered around providing 
auto-focus feature detection/setting/monitoring, zoom 
detection/setting/monitoring, enabling camera flash in a sensible way, 
and changing the rotation/orientation of a webcam stream. Because the 
current specification does not incorporate these features, we did some 
brainstorming and came up with the following idea.

Taking all of these use cases into account, we would like to present 
the following proposal that allows a web application to:

1. Distinguish between audio and video media streams obtained via 
getUserMedia (by introducing VideoStreamTrack and AudioStreamTrack 
interfaces that inherit from the currently specified - generic - 
MediaStreamTrack interface).

2. Apply special behavior to VideoStreamTrack objects that allows a web 
developer to feature-detect the capabilities of a web camera, set the 
desired modes for a number of features related to a given Camera 
object, trigger certain features on demand and listen for Camera feature 
start and end events where that makes sense.

Once we had the ability to distinguish video stream tracks from audio 
stream tracks (point 1 above), it became clear that there were many 
other use cases that we could then naturally fit into the same API 
architecture.

The proposal herein therefore incorporates an initial set of controls 
for developers to tune certain web camera features. We expect this will 
be of interest for a large number of other use cases, such as Camera 
capabilities detection and control in the form of a best-effort approach 
that is inclusive and 'webby' by design.

Our primary concern at this time is to get a sense of whether there's 
agreement around supplying these options to developers, and then to 
discuss whether the interface proposal provided is fit for purpose. If 
anyone has alternative proposals, we'd be interested to discuss them 
further here, of course.


*** PROPOSAL START ***

// +++ VIDEOSTREAMTRACK

interface VideoStreamTrack {

   readonly attribute boolean locked; // true if object is not a
                                      // camera object or manipulation
                                      // is not otherwise supported
                                      // false if object can be
                                      // modified

   // CAMERA SELECTION

   readonly attribute boolean cameraChangeSupported; // true if camera
                                                     // can be changed.
                                                     // Otherwise, false.

   readonly attribute unsigned short numberOfCameras;

   readonly attribute unsigned short currentCamera;

   CameraInfo     getCameraInfo(in unsigned short value);

   void           setCamera(in unsigned short value);

   // AUTO-FOCUS

   readonly attribute boolean focusSupported; // true if camera focus
                                              // can be changed.
                                              // Otherwise, false.

   readonly attribute DOMString currentFocusMode;

   void           setFocusMode(in DOMString value);

   // value argument above may be one of 'auto' (default),
   // 'continuous', 'edof' (Extended Depth of Field), 'fixed',
   // 'infinity' or 'macro' (close-up focus mode).

   void           autoFocus(in optional Object xy);
   void           cancelAutoFocus();

   // ZOOM

   readonly attribute boolean zoomSupported; // true if camera zoom can
                                             // be changed. Otherwise,
                                             // false.

   readonly attribute unsigned short currentZoom;
   readonly attribute unsigned short maxZoom;

   void           startZoom(in unsigned short value);
   void           stopZoom();

   // FLASH

   readonly attribute boolean flashSupported; // true if camera flash
                                              // mode can be changed.
                                              // Otherwise, false.

   readonly attribute DOMString currentFlashMode;

   void           setFlashMode(in DOMString value);

   // value argument above may be one of 'auto' (default), 'off', 'on',
   // 'red-eye' or 'torch'.

   // ORIENTATION

   readonly attribute boolean orientationSupported; // true if camera
                                                    // orientation can
                                                    // be changed.
                                                    // Otherwise, false.

   readonly attribute unsigned short currentDisplayOrientation;
   // 0<->360 degrees

   void           setDisplayOrientation(in unsigned short degrees);

   // GENERAL FEATURES

   void           takePicture();

   // takePicture first applies currentFocusMode and currentFlashMode
   // before taking a JPEG image snapshot from currentCamera and firing
   // a new 'picture' event, setting its data attribute as the value of
   // the JPEG image's ImageData.

   // VIEWING ANGLE

   readonly attribute float horizontalViewAngle;
   readonly attribute float verticalViewAngle;

   // EVENT LISTENERS

            attribute EventListener  onfocusstart;
            attribute EventListener  onfocusend;

            attribute EventListener  onzoomstart;
            attribute EventListener  onzoomend;

            attribute EventListener  oncamerachange;

            attribute EventListener  onpicture;

}

VideoStreamTrack implements MediaStreamTrack;

VideoStreamTrack implements EventTarget;

// VideoStreamTrack events: 'focusstart', 'focusend', 'zoomstart',
// 'zoomend', 'camerachange' and 'picture'.

// +++ CAMERAINFO

[NoInterfaceObject]
interface CameraInfo {

            attribute DOMString facing;

   // facing above will be either 'user' or 'environment'

            attribute unsigned short orientation;

   // orientation will be in the range 0 to 360 degrees

}

*** PROPOSAL END ***
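
To illustrate the best-effort approach described above, here is a minimal 
TypeScript sketch of how a page might use the proposed attributes. It is 
not part of the proposal and nothing in it is implemented by any browser. 
ProposedVideoTrack is a hypothetical subset of VideoStreamTrack, 
configureTrack and MockTrack are illustrative names only, and clamping 
the zoom value to maxZoom is an assumption that the proposal does not 
specify.

```typescript
// Hypothetical subset of the proposed VideoStreamTrack interface.
// Member names mirror the proposal; no browser implements them.
type FocusMode =
  | "auto" | "continuous" | "edof" | "fixed" | "infinity" | "macro";

interface ProposedVideoTrack {
  readonly locked: boolean;
  readonly focusSupported: boolean;
  readonly currentFocusMode: string;
  setFocusMode(value: FocusMode): void;
  readonly zoomSupported: boolean;
  readonly currentZoom: number;
  readonly maxZoom: number;
  startZoom(value: number): void;
}

// Best-effort configuration: apply a setting only when the track
// reports support for it, and return the names of what was applied.
function configureTrack(
  track: ProposedVideoTrack,
  wanted: { focus?: FocusMode; zoom?: number }
): string[] {
  const applied: string[] = [];
  if (track.locked) return applied; // not a camera, or not modifiable

  if (wanted.focus !== undefined && track.focusSupported) {
    track.setFocusMode(wanted.focus);
    applied.push("focus");
  }
  if (wanted.zoom !== undefined && track.zoomSupported) {
    // Assumption: clamp to maxZoom; the proposal does not say what an
    // out-of-range value should do.
    track.startZoom(Math.min(wanted.zoom, track.maxZoom));
    applied.push("zoom");
  }
  return applied;
}

// Stand-in track for trying the sketch outside a browser.
// It supports focus changes but not zoom.
class MockTrack implements ProposedVideoTrack {
  locked = false;
  focusSupported = true;
  zoomSupported = false;
  currentFocusMode = "auto";
  currentZoom = 0;
  maxZoom = 4;
  setFocusMode(value: FocusMode): void { this.currentFocusMode = value; }
  startZoom(value: number): void { this.currentZoom = value; }
}
```

Calling configureTrack(new MockTrack(), { focus: "macro", zoom: 2 }) 
would change the focus mode and skip the zoom request, because the mock 
reports zoomSupported as false. A track with locked set to true is left 
untouched.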


--
Rich Tibbett (richt)
CORE Platform Architect - Opera Software ASA

Received on Tuesday, 14 August 2012 11:38:37 UTC