
Re: Settings retrieval/application API Proposal (formerly: constraint modification API v3)

From: Harald Alvestrand <harald@alvestrand.no>
Date: Mon, 10 Sep 2012 16:34:25 +0200
Message-ID: <504DFA71.7000605@alvestrand.no>
To: "Mandyam, Giridhar" <mandyam@quicinc.com>
CC: "public-media-capture@w3.org" <public-media-capture@w3.org>
On 09/10/2012 04:21 PM, Mandyam, Giridhar wrote:
> Pardon me, but Qualcomm Innovation Center (QuIC) is a member of the DAP Working Group.  My understanding was that this entitles QuIC to participate in the Media Capture Task Force.  I do not understand why my contribution is being singled out as exemplary of poor process.  Why not other recent contributions that have extended getUserMedia()? Or are only certain persons allowed to contribute to the spec?

My apologies - it was a purely technical concern, and has nothing 
whatsoever to do with your person or your organization. If it looked as 
if it was, I most humbly apologize.

This proposal adds at least 47 new attributes to an 
interface that at the moment has something like two attributes 
(mandatory and optional constraints). Several of these need definitions 
that are not clear to me.

That's a lot of work.

>
> -----Original Message-----
> From: Harald Alvestrand [mailto:harald@alvestrand.no]
> Sent: Monday, September 10, 2012 2:51 AM
> To: public-media-capture@w3.org
> Subject: Re: Settings retrieval/application API Proposal (formerly: constraint modification API v3)
>
> I think this post illustrates very well why we have a direct conflict between a rich interface that allows all the things people want to do and a widely implemented interface that can be specified and agreed upon quickly.
>
> Can we find a way to separate those two goals? Perhaps in separate documents?
>
> On 09/08/2012 01:16 AM, Mandyam, Giridhar wrote:
>> Hello Travis & others in the group,
>>
>> I have tried to match my proposal to this and recommend the following additions.  Please let me know what you all think.  A couple of items to note:
>>
>> a) I'm not a WebIDL expert, so apologies in advance if I've stretched
>> the Dictionary construct beyond original intention
>> b) Added several boolean settings to the picture info dictionary.  Don't know if this is desirable for the targeted constraints, but an alternative would be to provide more settings of the type {"on","off"}.
>> c) Added a face detection feature and an associated event handler to the VideoDevice interface.
>>
>> In addition, there are a couple of items that are not currently addressed:
>>
>> 1) Should the UA always return a JPEG image when takePicture() is invoked?  Should it just be default, with the developer having the ability to specify other formats?  Should there be an image format indicator returned with the image data to the onPicture event handler?
>> 2) If returning a JPEG image is possible, should JPEG quality be an adjustable parameter available to the developer?
>>
>> -Giri Mandyam, Qualcomm Innovation Center
>>
>>
>> //+++Video device settings
>> dictionary PictureInfo : MediaTrackConstraintSet {
>>      // Resolution:
>>      unsigned long width;
>>      unsigned long minWidth;
>>      unsigned long maxWidth;
>>      unsigned long height;
>>      unsigned long minHeight;
>>      unsigned long maxHeight;
>>      // Aspect Ratio:
>>      float horizontalAspectRatio;
>>      float minHorizontalAspectRatio;
>>      float maxHorizontalAspectRatio;
>>      float verticalAspectRatio;
>>      float minVerticalAspectRatio;
>>      float maxVerticalAspectRatio;
>>      // Rotation:
>>      float rotation;
>>      float minRotation; // if not supported, then min == max == current rotation value
>>      float maxRotation;
>>      // Zoom:
>>      unsigned long zoom;  // if not supported, then min == max == current zoom value
>>      unsigned long minZoom; // e.g., lens supports 55 (mm) - 250 (mm) zoom
>>      unsigned long maxZoom;
>>      // Exposure:
>>      unsigned long exposure;
>>      unsigned long minExposure;
>>      unsigned long maxExposure;
>>      // Direction:
>>      VideoFacingEnum facing;
>>      // Focus:
>>      VideoFocusModeEnum focusMode;
>>      // Flash:
>>      VideoFlashModeEnum flashMode;
>>    
>>      // ****** ADDITIONS FOLLOW ******
>>      // Autoexposure:
>>      AutoExposureModeEnum autoExposureMode;
>>
>>      // Brightness:
>>      float brightness;
>>      float minBrightness; // if not supported, then min == max == current brightness value
>>      float maxBrightness;
>>
>>      // Contrast:
>>      float contrast;
>>      float minContrast; // if not supported, then min == max == current contrast value
>>      float maxContrast;
>>
>>      // Denoise:
>>      boolean denoise; // Default is false; true setting may be ignored if the UA doesn't support it
>>
>>      // Effects:
>>      EffectsEnum effects;
>>
>>      // Exposure Compensation:
>>      float exposureCompensation;
>>      float minExposureCompensation; // if not supported, then min == max == current exposure compensation value
>>      float maxExposureCompensation;
>>
>>      // Face Detection: enable face detection
>>      boolean faceDetection; // Default is false; true setting may be ignored if the UA doesn't support it
>>
>>      // Geotagging:
>>      boolean geotagging; // Default is false; true setting may be ignored if the UA doesn't support it.  Note that if the UA does not support JPEG then this feature is disabled.
>>
>>      // High dynamic range image capture:
>>      boolean highDynamicRange; // Default is false; true setting may be ignored if the UA doesn't support it
>>
>>      // ISO setting:
>>      ISOEnum iso; // Controls the ISO setting.  "automatic" is default setting.
>>
>>      // Red eye reduction:
>>      boolean redEyeReduction; // Default is false; true setting may be ignored if the UA doesn't support it
>>
>>      // Saturation:
>>      float saturation;
>>      float minSaturation; // if not supported, then min == max == current saturation value
>>      float maxSaturation;
>>
>>      // Scene mode:
>>      SceneModeEnum sceneMode; // Controls the scene mode setting; default is "off"
>>
>>      // Sharpness:
>>      float sharpness;
>>      float minSharpness; // if not supported, then min == max == current sharpness value
>>      float maxSharpness;
>>
>>      // Shutter sound: enable or disable shutter sound effect
>>      boolean shutterSound; // Default is false; true setting may be ignored if the UA doesn't support it
>>
>>      // Skin tone enhancement:
>>      boolean skintoneEnhancement; // Default is false; true setting may be ignored if the UA doesn't support it
>>
>>      // White balance mode:
>>      WhiteBalanceModeEnum whiteBalanceMode; // Controls the white balance mode setting; default is "auto"
>>
>>      // Zero shutter lag:
>>      boolean zeroShutterLag; // Default is false; true setting may be ignored if the UA doesn't support it
>>      // ****** END OF ADDITIONS TO PICTURE SETTINGS ******
>> };
>>
>> // ****** ADDITIONAL ENUMS ******
>> enum AutoExposureModeEnum = { "frame-average", "center-weighted", "spot-metering" };
>> enum EffectsEnum = { "none", "mono", "negative", "solarize", "posterize", "aqua", "sepia", "whiteboard", "blackboard", "emboss", "sketch", "neon" };
>> enum ISOEnum = { "automatic", "100", "200", "400", "800", "1250" };
>> enum SceneModeEnum = { "off", "auto", "AR", "action", "backlight", "barcode", "beach", "candlelight", "fireworks", "flowers", "landscape", "night", "night-portrait", "party", "portrait", "snow", "sports", "steadyphoto", "sunset", "theater" };
>> enum WhiteBalanceModeEnum = { "auto", "incandescent", "fluorescent", "warm-fluorescent", "daylight", "cloud-daylight", "twilight", "shade" };
>>
>> // Returned object for face detection
>> interface faceEvent {
>>      readonly attribute unsigned long faces; // Number of faces detected in scene
>>      face item(unsigned long index);
>> };
>> // face describes the rectangular encapsulation of a detected face when face detection is enabled.
>> interface face {
>>      readonly attribute unsigned long top;
>>      readonly attribute unsigned long bottom;
>>      readonly attribute unsigned long left;
>>      readonly attribute unsigned long right;
>> };
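
As an illustration of how a page might consume these face rectangles, here is a minimal sketch. It assumes each face is a plain object with the `top`, `bottom`, `left` and `right` members described above, given in pixel coordinates with `bottom >= top` and `right >= left`; that coordinate convention is an assumption, not something the proposal states. `faceArea` and `largestFace` are hypothetical helper names.

```javascript
// Area of one face rectangle, assuming pixel coordinates with the origin at
// the top-left corner (so bottom >= top and right >= left).
function faceArea(face) {
  return (face.right - face.left) * (face.bottom - face.top);
}

// Returns the face with the largest area, or null when none were detected.
function largestFace(faces) {
  let best = null;
  for (const face of faces) {
    if (best === null || faceArea(face) > faceArea(best)) best = face;
  }
  return best;
}

// Made-up sample data standing in for the rectangles a faceEvent would expose.
const sampleFaces = [
  { top: 10, bottom: 60, left: 20, right: 70 },    // 50 x 50
  { top: 100, bottom: 220, left: 40, right: 140 }  // 100 wide, 120 high
];
console.log(faceArea(largestFace(sampleFaces))); // 12000
```

A handler could use something like this to decide which face to focus or crop on, but the proposal itself leaves that to the application.
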
>>
>> interface VideoDevice : MediaDevice {
>>      //+++ Getting settings (possibly different from the video stream) for pictures from this device
>>      PictureInfo getPictureSettings();
>>
>>      //+++ Taking snapshots
>>      void takePicture(optional PictureInfo pictureSettings);
>>
>>      //+++ Picture results
>>      attribute EventHandler onpicture;
>>
>>      //+++ Face detection results
>>      attribute EventHandler onFaceDetect; // Handler will be passed a faceEvent object
>> };
>>
>> -----Original Message-----
>> From: Travis Leithead [mailto:travis.leithead@microsoft.com]
>> Sent: Monday, August 27, 2012 2:47 PM
>> To: public-media-capture@w3.org
>> Subject: Settings retrieval/application API Proposal (formerly:
>> constraint modification API v3)
>>
>> Based on the latest round of feedback on the prior proposal [1], I've further adjusted the constraint modification proposal. High-level changes are:
>>
>> 1. LocalMediaStream allows more direct access to device settings (now optimizing around the 1-video/1-audio track per gUM request)
>> 2. Track objects isolated from device objects for clarity and separation of APIs
>> 3. Specific settings proposed
>> 4. Usage examples provided
>>
>>
>> As mentioned in the previous proposal [1], the LocalMediaStream's changeable audio/videoTracks collections as currently derived from MediaStream make it challenging to keep track of the tracks that are supplied by a local device over time. In the prior proposal, I factored the local-device-supplying tracks into separate track lists for isolation. In this proposal, I take a slightly more aggressive approach to modifying the definition of a LocalMediaStream which further diverges it from its current definition, but which (I believe) aligns it more closely with the devices that are supplying its tracks. This approach was largely borrowed from Adam's comments [2].
>>
>> Despite these changes to the structure of LocalMediaStream, I still want it to behave semantically similar to MediaStream when used with URL.createObjectURL or when assigned to a video/audio element using a TBD property (see example #2 at the end of the proposal). To continue to have it work for this purpose, a new interface: AbstractMediaStream is introduced:
>>
>> // +++New base class--this is what createObjectURL and other
>> // APIs now accept to be inclusive of LocalMediaStreams as well
>> // as other MediaStreams
>> interface AbstractMediaStream {
>>      readonly attribute DOMString label;
>>      readonly attribute boolean ended;
>>      attribute EventHandler onended;
>> };
>>
>> // +++MediaStream now derives from the base class, adding
>> // mutable track lists and an onstarted event (since these
>> // objects can go from no tracks->one-or-more tracks)
>> [Constructor (optional (MediaStream? or MediaStreamTrackList or MediaStreamTrack[]) trackContainers)]
>> interface MediaStream : AbstractMediaStream {
>>      readonly attribute MediaStreamTrackList audioTracks;
>>      readonly attribute MediaStreamTrackList videoTracks;
>>      //+++ added for symmetry and for mutable track lists.
>>      attribute EventHandler onstarted;
>> };
>>
>> // +++Modified to include device-specific interfaces
>> interface LocalMediaStream : AbstractMediaStream {
>>      readonly attribute VideoDevice? videoDevice;
>>      readonly attribute AudioDevice? audioDevice;
>>      void stop();
>> };
>>
>> A LocalMediaStream now has an active (or null) videoDevice, audioDevice or both, depending on what was requested from getUserMedia.
>>
>> All Video/AudioDevice objects link to their associated track:
>>
>> // +++Settings for all device types
>> interface MediaDevice {
>>      // +++ the track object that this device is producing
>>      readonly attribute MediaStreamTrack track;
>> };
>>
>> And have a 'local' stop API and associated state (to stop just one of the devices):
>>
>> interface MediaDevice : EventListener {
>>      readonly attribute MediaStreamTrack track;
>>      // +++ stop only this device:
>>      void stop();
>>      // +++ get the on/off state of the device
>>      readonly attribute boolean ended;
>> };
>>
>> All devices support being able to inspect and change their settings. Application of settings is asynchronous, but inspection of settings can be synchronous (for convenience). The specific settings returned depend on whether the caller is a VideoDevice instance or an AudioDevice instance. Events related to the changing of settings are provided as well.
>>
>> interface MediaDevice : EventListener {
>>      readonly attribute MediaStreamTrack track;
>>      void stop();
>>      readonly attribute boolean ended;
>>
>>      // +++ get device settings
>>      (VideoInfo or AudioInfo) getSettings();
>>
>>      //+++ settings application
>>      void changeSettings(MediaTrackConstraints settings);
>>
>>      //+++ Async results notification from settings application
>>      attribute EventHandler onsettingschanged;
>>      attribute EventHandler onsettingserror;
>> };
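
One way to picture what a settings error might report is a plain range check over the requested values. The sketch below is an assumption about how a UA (or a cautious page) could behave, not something the proposal specifies: `findOutOfRange` is a hypothetical function, and it relies on the settings object exposing `minX`/`maxX` members next to each range setting `x`.

```javascript
// Hypothetical check that could run before (or inside) changeSettings().
// `current` is assumed to look like the object returned by getSettings();
// `requested` is the object that would be passed to changeSettings().
function findOutOfRange(current, requested) {
  const rejected = [];
  for (const name of Object.keys(requested)) {
    const suffix = name.charAt(0).toUpperCase() + name.slice(1);
    const min = current["min" + suffix];
    const max = current["max" + suffix];
    // Only range settings (those with numeric min/max) are checked here.
    if (typeof min !== "number" || typeof max !== "number") continue;
    const value = requested[name];
    if (value < min || value > max) rejected.push(name);
  }
  return rejected;
}

// Made-up values for illustration.
const currentVideoSettings = {
  width: 640, minWidth: 320, maxWidth: 1280,
  height: 480, minHeight: 240, maxHeight: 720
};
// Only "width" falls outside its range here.
console.log(findOutOfRange(currentVideoSettings, { width: 1920, height: 720 }));
```

Enum-valued settings are skipped by this sketch; checking them would need the list of values the device supports, which the proposal does not expose separately.
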
>>
>> Video devices, in particular, have the ability to [possibly] switch into "photo mode" to capture still images. The following are specific to VideoDevice objects and extend the API that Rich proposed. Since "photo mode" is often a distinct set of settings from regular video mode in a camera, there are separate settings and application of those settings just for taking pictures.
>>
>> //+++ New: audio Device interface (basically just a MediaDevice)
>> interface AudioDevice : MediaDevice { };
>>
>> //+++ New: video device
>> interface VideoDevice : MediaDevice {
>>      //+++ Getting settings (possibly different from the video stream) for pictures from this device
>>      PictureInfo getPictureSettings();
>>
>>      //+++ Taking snapshots
>>      void takePicture(optional PictureInfo pictureSettings);
>>
>>      //+++ Picture results
>>      attribute EventHandler onpicture;
>> };
>>
>> The proposed initial set of settings is below. The proposed settings are a combination of features proposed by Rich based on his customer's requests, as well as a set of functionality already supported by the Microsoft WinRT Camera API [3] (based on our own research and common cameras used in PCs).
>>
>> In the proposed view of the settings, where the setting is in a range (not an enum), there are no "isSupported" values. The expectation is that if one of these values that is exposed as a range is not supported, then the value is assigned a default value (e.g., 0), and the min and max range values are set to that same value. This is the same thing as saying that the feature is supported, but that the value cannot be changed (which is basically saying that the feature is unavailable).
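
The range convention described above can be sketched as a small check. This is only an illustration under the proposal's assumptions: `settings` stands for the plain object returned by the proposed `getSettings()`, and `capitalize` and `isAdjustable` are hypothetical helper names, not part of the proposal.

```javascript
// Hypothetical helper: turns "zoom" into "Zoom" so that the "minZoom" and
// "maxZoom" member names can be built from the setting name.
function capitalize(name) {
  return name.charAt(0).toUpperCase() + name.slice(1);
}

// Returns true when a range setting can actually be changed, i.e. when the
// reported min and max differ. Under the proposal, an unsupported setting
// reports min == max == current value.
function isAdjustable(settings, name) {
  const min = settings["min" + capitalize(name)];
  const max = settings["max" + capitalize(name)];
  if (typeof min !== "number" || typeof max !== "number") {
    return false; // range members missing: treat as not adjustable
  }
  return min < max;
}

// Made-up settings object for illustration.
const exampleSettings = {
  zoom: 55, minZoom: 55, maxZoom: 250,         // adjustable
  rotation: 0, minRotation: 0, maxRotation: 0  // unsupported: fixed at 0
};

console.log(isAdjustable(exampleSettings, "zoom"));     // true
console.log(isAdjustable(exampleSettings, "rotation")); // false
```

A page could use a check like this to hide or disable a control instead of testing a separate "isSupported" flag.
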
>>
>> //+++Video device settings
>> dictionary PictureInfo : MediaTrackConstraintSet {
>>      // Resolution:
>>      unsigned long width;
>>      unsigned long minWidth;
>>      unsigned long maxWidth;
>>      unsigned long height;
>>      unsigned long minHeight;
>>      unsigned long maxHeight;
>>      // Aspect Ratio:
>>      float horizontalAspectRatio;
>>      float minHorizontalAspectRatio;
>>      float maxHorizontalAspectRatio;
>>      float verticalAspectRatio;
>>      float minVerticalAspectRatio;
>>      float maxVerticalAspectRatio;
>>      // Rotation:
>>      float rotation;
>>      float minRotation; // if not supported, then min == max == current rotation value
>>      float maxRotation;
>>      // Zoom:
>>      unsigned long zoom;  // if not supported, then min == max == current zoom value
>>      unsigned long minZoom; // e.g., lens supports 55 (mm) - 250 (mm) zoom
>>      unsigned long maxZoom;
>>      // Exposure:
>>      unsigned long exposure;
>>      unsigned long minExposure;
>>      unsigned long maxExposure;
>>      // Direction:
>>      VideoFacingEnum facing;
>>      // Focus:
>>      VideoFocusModeEnum focusMode;
>>      // Flash:
>>      VideoFlashModeEnum flashMode;
>> };
>>
>> //+++ Additional settings for video (extends picture)
>> dictionary VideoInfo : PictureInfo {
>>      // FPS:
>>      float framesPerSecond;
>>      float minFramesPerSecond;
>>      float maxFramesPerSecond;
>> };
>>
>> //+++Audio device settings
>> dictionary AudioInfo : MediaTrackConstraintSet {
>>      // Levels
>>      unsigned long level;
>>      unsigned long minLevel;
>>      unsigned long maxLevel;
>>      // Tone (bass/treble)
>>      float bassTone;
>>      float minBassTone;
>>      float maxBassTone;
>>      float trebleTone;
>>      float minTrebleTone;
>>      float maxTrebleTone;
>> };
>>
>> The related enums are defined as:
>>
>> enum VideoFacingEnum = { "unknown", "user", "environment" };
>> enum VideoFocusModeEnum = { "nofocus", "fixed", "auto", "continuous", "edof", "infinity", "macro" };
>> enum VideoFlashModeEnum = { "noflash", "auto", "off", "on", "red-eye", "torch" };
>>
>> The new Event types that support the "settingschanged" and "settingserror" events, as well as the "picture" event are defined below:
>>
>> //+++ New event for "settingschanged/settingserror"
>> [Constructor(DOMString type, optional EventInit eventInitDict)]
>> interface MediaSettingsEvent : Event {
>>      sequence<DOMString> getRelatedSettings(); // Returns an array of setting names that apply to this event.
>> };
>>
>> //+++ New event for getting the picture results from 'takePicture' (returns raw bytes/non-encoded)
>> [Constructor(DOMString type, optional PictureEventInit eventInitDict)]
>> interface PictureEvent : Event {
>>      readonly attribute ImageData data; // See Canvas spec for definition of ImageData
>> };
>>
>> dictionary PictureEventInit : EventInit {
>>      ImageData data;
>> };
>>
>> ////////////////////
>>
>> Some examples follow that illustrate how these changes will impact coding patterns:
>>
>> 1. Getting access to a video and/or audio device (if available) -- scenario is unchanged:
>>
>> navigator.getUserMedia({audio: true, video: true}, gotMedia,
>> failedToGetMedia);
>>
>> function gotMedia(localStream) {
>> }
>>
>> 2. Previewing the local video/audio in HTML5 video tag -- scenario is unchanged:
>>
>> function gotMedia(localStream) {
>>      // objectURL technique
>>      document.querySelector("video").src = URL.createObjectURL(localStream, { autoRevoke: true });
>>      // direct-assign technique
>>      document.querySelector("video").streamSrc = localStream; // "streamSrc" is hypothetical and TBD at this time
>> }
>>
>> 3. Applying resolution constraints
>>
>> function gotMedia(localStream) {
>>      var settings = localStream.videoDevice.getSettings();
>>      // Check for 1080p+ support
>>      if ((settings.maxWidth >= 1920) && (settings.maxHeight >= 1080)) {
>>         // See if I need to change the current settings...
>>         if ((settings.width != 1920) || (settings.height != 1080)) {
>>            settings.width = 1920;
>>            settings.height = 1080;
>>            localStream.videoDevice.onsettingserror = failureToComply;
>>            localStream.videoDevice.changeSettings(settings);
>>         }
>>      }
>>      else
>>         failureToComply();
>> }
>>
>> function failureToComply(e) {
>>      if (e)
>>         console.error("Device failed to change " + e.getRelatedSettings());
>>      else
>>         console.error("Device doesn't support at least 1080p");
>> }
>>
>> 4. Changing zoom in response to user input:
>>
>> function gotMedia(localStream) {
>>      setupRange( localStream.videoDevice );
>> }
>>
>> function setupRange(videoDevice) {
>>      var cameraSettings = videoDevice.getSettings();
>>      // Set HTML5 range control to min/max values of zoom
>>      var zoomControl = document.querySelector("input[type=range]");
>>      zoomControl.min = cameraSettings.minZoom;
>>      zoomControl.max = cameraSettings.maxZoom;
>>      zoomControl.device = videoDevice; // Store the device ref for later
>>      zoomControl.onchange = applySettingChanges;
>> }
>>
>> function applySettingChanges(e) {
>>      e.target.device.changeSettings({ zoom: e.target.value });
>> }
>>
>> 5. Adding the local media tracks into a new media stream
>>
>> function gotMedia(localStream) {
>>      return new MediaStream( [ localStream.videoDevice.track, localStream.audioDevice.track ]);
>> }
>>
>> 6. Take a picture, show the picture in a canvas.
>>
>> function gotMedia(localStream) {
>>      localStream.videoDevice.onpicture = showPicture;
>>      // Turn on flash only for the snapshot...if available
>>      var picSettings = localStream.videoDevice.getPictureSettings();
>>      if (picSettings.flashMode != "noflash")
>>         localStream.videoDevice.takePicture({ flashMode: "on"});
>>      else {
>>         console.info("Flash not available");
>>         localStream.videoDevice.takePicture();
>>      }
>> }
>>
>> function showPicture(e) {
>>      var ctx = document.querySelector("canvas").getContext("2d");
>>      // e.data is the ImageData property of the PictureEvent interface.
>>      ctx.canvas.width = e.data.width;
>>      ctx.canvas.height = e.data.height;
>>      ctx.putImageData(e.data, 0, 0);
>>      // TODO: can get this picture as an encoded Blob via:
>>      // ctx.canvas.toBlob(callbackFunction, "image/jpeg");
>> }
>>
>> [1] http://lists.w3.org/Archives/Public/public-media-capture/2012Aug/0066.html
>> [2] http://lists.w3.org/Archives/Public/public-media-capture/2012Aug/0095.html
>> [3] http://msdn.microsoft.com/en-us/library/windows/apps/windows.media.devices.videodevicecontroller.aspx
>>
>>
>>
>>
>
Received on Monday, 10 September 2012 14:34:50 GMT
