- From: Mercurial notifier <cvsmail@w3.org>
- Date: Thu, 04 Sep 2014 07:52:37 +0000
- To: public-dap-commits@w3.org
changeset: 580:3fcd6e6c0a1f parent: 578:89dc4a551fce user: Dominique Hazael-Massieux <dom@w3.org> date: Thu Aug 28 14:13:32 2014 +0200 files: media-stream-capture/ImageCapture.html media-stream-capture/MediaRecorder.html media-stream-capture/scenarios.html description: drafts moved to github diff -r 89dc4a551fce -r 3fcd6e6c0a1f media-stream-capture/ImageCapture.html --- a/media-stream-capture/ImageCapture.html Wed Aug 20 11:56:18 2014 +0300 +++ b/media-stream-capture/ImageCapture.html Thu Aug 28 14:13:32 2014 +0200 @@ -2,400 +2,11 @@ <html xmlns='http://www.w3.org/1999/xhtml' lang='en'> <head> <meta charset='utf-8'/> + <meta http-equiv="refresh" content="0; url=http://w3c.github.io/mediacapture-image/"> + <link rel="canonical" href="http://w3c.github.io/mediacapture-image/"> <title>Mediastream Image Capture</title> - <script class='remove'> - var respecConfig = { - specStatus: "ED" - , shortName: "image-capture" - , editors: [ - { name: "Giridhar Mandyam" - , company: "Qualcomm Innovation Center, Inc" - , companyURL: "http://www.qualcomm.com/about/businesses/quicinc" } - ] - , edDraftURI: "https://dvcs.w3.org/hg/dap/raw-file/default/media-stream-capture/ImageCapture.html" - , copyrightStart: 2012 -// , publishDate: "2013-07-09" - , noIDLIn: true - , wg: ["Web Real-Time Communication Working Group", "Device APIs Working Group"] - , wgURI: ["http://www.w3.org/2011/04/webrtc/","http://www.w3.org/2009/dap"] - , wgPublicList: "public-media-capture" - , wgPatentURI: ["http://www.w3.org/2004/01/pp-impl/47318/status", "http://www.w3.org/2004/01/pp-impl/43696/status"] - , isRecTrack: true - , isNoTrack: false - , format: 'markdown' - }; - </script> - <script src='https://www.w3.org/Tools/respec/respec-w3c-common.js' class='remove' async></script> </head> <body> - <section id='sotd'> - Comments on this document are welcomed. 
- </section> - - <section id='abstract'> - This document specifies the <code>takePhoto()</code> and <code>getFrame()</code> methods, and corresponding camera settings for use with MediaStreams as defined in Media Capture and Streams [[!GETUSERMEDIA]]. - </section> - - Introduction - ------------ - <p>The API defined in this document takes a valid MediaStream and returns an encoded image in the form of a <code>Blob</code> (as defined in [[!FILE-API]]). The image is - provided by the capture device that provides the MediaStream. Moreover, - picture-specific settings can optionally be provided as arguments to be applied to the image being captured.</p> - - Image Capture API - -------------- - <dl title='[Constructor(VideoStreamTrack track)] interface ImageCapture : EventTarget' class='idl'> - <dt>readonly attribute PhotoSettingsOptions photoSettingsOptions</dt> - <dd>Describes current photo settings</dd> - <dt>readonly attribute VideoStreamTrack videoStreamTrack</dt> - <dd>The VideoStreamTrack passed into the constructor</dd> - <dt>attribute EventHandler onphoto</dt> - <dd>Register/unregister for photo events of type <code>BlobEvent</code>. The handler should expect to get a <code>BlobEvent</code> object as its first parameter.</dd> - <dt>attribute EventHandler onerror</dt> - <dd>Register/unregister for Image Capture error events of type <code>ImageCaptureErrorEvent</code>. The handler should expect to get an <code>ImageCaptureError</code> object as its first parameter.</dd> - <dt>attribute EventHandler onphotosettingschange</dt> - <dd>Register/unregister for photo settings change events of type <code>SettingsChangeEvent</code>.</dd> - <dt>attribute EventHandler onframegrab</dt> - <dd>Register/unregister for frame capture events of type <code>FrameGrabEvent</code>. The handler should expect to get a <code>FrameGrabEvent</code> object as its first parameter.</dd> - <dt>void setOptions(PhotoSettings? 
photoSettings)</dt> - <dd>When the <code>setOptions()</code> method of an <code>ImageCapture</code> object is invoked, - then a valid <code>PhotoSettings</code> object <em title="must" class="rfc2119">must</em> be passed to the method of the - <code>ImageCapture</code> object. If the UA can successfully apply the settings, then the UA <em title="must" class="rfc2119">must</em> fire a <code>SettingsChangeEvent</code> event at the - <code>onphotosettingschange</code> event handler (if specified). If the UA cannot successfully apply the settings, then the UA - <em title="must" class="rfc2119">must</em> fire an <code>ImageCaptureErrorEvent</code> at the <code>ImageCapture</code> object whose <code>code</code> is set to SETTINGS_ERROR. </dd> - <dt>void takePhoto()</dt> - <dd>When the <code>takePhoto()</code> method of an <code>ImageCapture</code> object is invoked, - then if the <code>readyState</code> of the <code>VideoStreamTrack</code> provided in the constructor is not "live", the UA <em title="must" class="rfc2119">must</em> fire an <code>ImageCaptureErrorEvent</code> event at the <code>ImageCapture</code> object with a - new <code>ImageCaptureError</code> object whose <code>code</code> is set to INVALID_TRACK. If the UA is unable to execute the <code>takePhoto()</code> method for any - other reason (for example, upon invocation of multiple <code>takePhoto()</code> method calls in rapid succession), then the UA <em title="must" class="rfc2119">must</em> fire an <code>ImageCaptureErrorEvent</code> event at the <code>ImageCapture</code> object with a - new <code>ImageCaptureError</code> object whose <code>code</code> is set to PHOTO_ERROR. - Otherwise it <em title="must" class="rfc2119">must</em> - queue a task, using the DOM manipulation task source, that runs the following steps: - <ol> - <li>Gather data from the <code>VideoStreamTrack</code> into a <code>Blob</code> containing a single still image. The method of doing - this will depend on the underlying device. 
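As a non-normative sketch of the <code>takePhoto()</code> control flow described above: the track object and the <code>grabStillImage()</code> hook are invented stand-ins, a real UA queues a task and fires DOM events rather than invoking callbacks directly, and INVALID_TRACK is kept as a string here because the draft's <code>ImageCaptureError</code> interface assigns it no numeric constant.

```javascript
// Hedged sketch of the takePhoto() algorithm; not part of the spec text.
var PHOTO_ERROR = 3; // numeric code from the ImageCaptureError interface

function takePhotoSketch(track, handlers) {
  // The VideoStreamTrack supplied to the constructor must be "live".
  if (track.readyState !== 'live') {
    handlers.onerror({ code: 'INVALID_TRACK' });
    return;
  }
  try {
    // Step 1: gather a single still image from the track (device-specific).
    var blob = track.grabStillImage(); // hypothetical device hook
    // Step 2: deliver the Blob via a BlobEvent-shaped object.
    handlers.onphoto({ data: blob });
  } catch (e) {
    // Any other failure maps to PHOTO_ERROR.
    handlers.onerror({ code: PHOTO_ERROR });
  }
}
```

The same shape applies to <code>getFrame()</code>, with FRAME_GRAB_ERROR in place of PHOTO_ERROR and an <code>ImageData</code> in place of the <code>Blob</code>.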
Devices - may temporarily stop streaming data, reconfigure themselves with the appropriate photo settings, take the photo, - and then resume streaming. In this case, the stopping and restarting of streaming <em title="should" class="rfc2119">should</em> - cause <code>mute</code> and <code>unmute</code> events to fire on the Track in question. </li> - <li>Raise a <code>BlobEvent</code> event containing the <code>Blob</code> to the <code>onphoto</code> event handler (if specified).</li></ol></dd> - <dt>void getFrame()</dt> - <dd>When the <code>getFrame()</code> method of an <code>ImageCapture</code> object is invoked, then if the <code>readyState</code> of the <code>VideoStreamTrack</code> provided in the constructor is not "live", the UA <em title="must" class="rfc2119">must</em> fire an <code>ImageCaptureErrorEvent</code> event at the <code>ImageCapture</code> object with a - new <code>ImageCaptureError</code> object whose <code>code</code> is set to INVALID_TRACK. If the UA is unable to execute the <code>getFrame()</code> method for any - other reason, then the UA <em title="must" class="rfc2119">must</em> fire an <code>ImageCaptureErrorEvent</code> event at the <code>ImageCapture</code> object with a - new <code>ImageCaptureError</code> object whose <code>code</code> is set to FRAME_GRAB_ERROR. Otherwise it <em title="must" class="rfc2119">must</em> - queue a task, using the DOM manipulation task source, that runs the following steps: - <ol> - <li>Gather data from the <code>VideoStreamTrack</code> into an <code>ImageData</code> object (as defined in [[!CANVAS-2D]]) containing a single still frame in RGBA format. The <code>width</code> and <code>height</code> of the - <code>ImageData</code> object are derived from the constraints of the <code>VideoStreamTrack</code>. </li> - <li>Raise a <code>FrameGrabEvent</code> event containing the <code>ImageData</code> to the <code>onframegrab</code> event handler (if specified). 
(Note: <code>getFrame()</code> returns data only once per invocation.)</li> - </ol></dd> - </dl> - - <code>FrameGrabEvent</code> - -------------- - <dl title='[Constructor(DOMString type, optional FrameGrabEventInit frameGrabInitDict)] interface FrameGrabEvent : Event' class='idl'> - <dt>readonly attribute ImageData imageData</dt> - <dd>Returns an <code>ImageData</code> object whose <code>width</code> and <code>height</code> attributes indicate the dimensions of the captured frame. </dd> - </dl> - - ##### <code>FrameGrabEventInit</code> Dictionary - - <dl title='dictionary FrameGrabEventInit : EventInit' class='idl'> - <dt>ImageData imageData</dt> - <dd>An <code>ImageData</code> object containing the data to deliver via this event.</dd> - </dl> - - <code>ImageCaptureErrorEvent</code> - -------------- - <dl title='[Constructor(DOMString type, optional ImageCaptureErrorEventInit imageCaptureErrorInitDict)] interface ImageCaptureErrorEvent : Event' class='idl'> - <dt>readonly attribute ImageCaptureError imageCaptureError</dt> - <dd>Returns an <code>ImageCaptureError</code> object whose <code>code</code> attribute indicates the type of error that occurred. </dd> - </dl> - - ##### <code>ImageCaptureErrorEventInit</code> Dictionary - - <dl title='dictionary ImageCaptureErrorEventInit : EventInit' class='idl'> - <dt>ImageCaptureError imageCaptureError</dt> - <dd>An <code>ImageCaptureError</code> object containing the data to deliver via this event.</dd> - </dl> - - <code>BlobEvent</code> - -------------- - <dl title='[Constructor(DOMString type, optional BlobEventInit blobInitDict)] interface BlobEvent : Event' class='idl'> - <dt>readonly attribute Blob data</dt> - <dd>Returns a <code>Blob</code> object whose type attribute indicates the encoding of the blob data. An implementation must return a Blob in a format that is capable of being viewed in an HTML <code>&lt;img&gt;</code> tag. 
</dd> - </dl> - - ##### <code>BlobEventInit</code> Dictionary - - <dl title='dictionary BlobEventInit : EventInit' class='idl'> - <dt>Blob data</dt> - <dd>A <code>Blob</code> object containing the data to deliver via this event.</dd> - </dl> - - <code>SettingsChangeEvent</code> - -------------- - <dl title='[Constructor(DOMString type, optional SettingsChangeEventInit photoSettingsInitDict)] interface SettingsChangeEvent : Event' class='idl'> - <dt>readonly attribute PhotoSettings photoSettings</dt> - <dd>Returns a <code>PhotoSettings</code> object reflecting the current photo settings. </dd> - </dl> - - ##### <code>SettingsChangeEventInit</code> Dictionary - - <dl title='dictionary SettingsChangeEventInit : EventInit' class='idl'> - <dt>PhotoSettings photoSettings</dt> - <dd>A <code>PhotoSettings</code> object containing the data to deliver via this event.</dd> - </dl> - - <code>ImageCaptureError</code> - ----------------- - <p>The <code>ImageCaptureError</code> object is passed to an <code>onerror</code> event handler of an - <code>ImageCapture</code> object if an error occurred when the object was created or any of its methods were invoked.</p> - <dl title='[NoInterfaceObject] interface ImageCaptureError' class='idl'> - <dt>const unsigned short FRAME_GRAB_ERROR=1</dt> - <dd>An <code>ImageCaptureError</code> object must set its <code>code</code> value to this constant if an error occurred upon invocation of the <code>getFrame()</code> method of the <code>ImageCapture</code> interface.</dd> - <dt>const unsigned short SETTINGS_ERROR=2</dt> - <dd>An <code>ImageCaptureError</code> object must set its <code>code</code> value to this constant if an error occurred upon invocation of the <code>setOptions()</code> method of the <code>ImageCapture</code> interface.</dd> - <dt>const unsigned short PHOTO_ERROR=3</dt> - <dd>An <code>ImageCaptureError</code> object must set its <code>code</code> value to this constant if an error occurred upon invocation of 
the <code>takePhoto()</code> method of the <code>ImageCapture</code> interface.</dd> - <dt>const unsigned short ERROR_UNKNOWN=4</dt> - <dd>An <code>ImageCaptureError</code> object must set its <code>code</code> value to this constant if an error occurred due to an indeterminate cause upon invocation of any method of the <code>ImageCapture</code> interface.</dd> - <dt>readonly attribute unsigned short code</dt> - <dd>The <code>code</code> attribute returns the appropriate code for the error event, derived from the constants defined in the <code>ImageCaptureError</code> interface.</dd> - <dt>readonly attribute DOMString message</dt> - <dd>The <code>message</code> attribute must return an error message describing the details of the error encountered.</dd> - </dl> - - <section> - <h2><code>MediaSettingsRange</code></h2> - <dl title='interface MediaSettingsRange' class='idl'> - <dt>readonly attribute unsigned long max</dt> - <dd>The maximum value of this setting</dd> - <dt>readonly attribute unsigned long min</dt> - <dd>The minimum value of this setting</dd> - <dt>readonly attribute unsigned long initial</dt> - <dd>The current value of this setting</dd> - </dl> - </section> - - <section> - <h2><code>MediaSettingsItem</code></h2> - <p>The <code>MediaSettingsItem</code> interface allows a single setting to be managed.</p> - <dl title='interface MediaSettingsItem' class='idl'> - <dt>readonly attribute any value</dt> - <dd>Value of current setting.</dd> - </dl> - </section> - - <section> - <h2><code>PhotoSettingsOptions</code></h2> - <p>The PhotoSettingsOptions attribute of the <code>ImageCapture</code> object provides - the photo-specific settings options and current settings values. The following definitions are assumed - for individual settings and are provided for informational purposes:</p> - <ol> - <li><i>White balance mode</i> is a setting that cameras use to adjust for different color temperatures. 
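The <code>MediaSettingsRange</code> triple (min, max, initial) defined above suggests a simple discipline for resolving a requested value; a sketch only — the helper name is invented, and clamping is one reasonable reading of how a request outside the permitted range would be handled.

```javascript
// Hypothetical helper: resolve a requested numeric setting against a
// MediaSettingsRange-like object ({min, max, initial}).
function resolveSetting(range, requested) {
  if (requested === undefined) return range.initial; // keep current value
  // Clamp the request into [min, max] — the closest permitted value.
  return Math.min(range.max, Math.max(range.min, requested));
}
```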
Color temperature is - the temperature of background light (normally measured in Kelvin). This setting can also be automatically - determined by the implementation. If 'automatic' mode is selected, then the Kelvin setting for White Balance Mode - may be overridden. Typical temperature ranges for different modes are provided below: - <table border="1"> - <tr> - <th>Mode</th> - <th>Kelvin range</th> - </tr> - <tr> - <td>incandescent</td> - <td>2500-3500</td> - </tr> - <tr> - <td>fluorescent</td> - <td>4000-5000</td> - </tr> - <tr> - <td>warm-fluorescent</td> - <td>5000-5500</td> - </tr> - <tr> - <td>daylight</td> - <td>5500-6500</td> - </tr> - <tr> - <td>cloudy-daylight</td> - <td>6500-8000</td> - </tr> - <tr> - <td>twilight</td> - <td>8000-9000</td> - </tr> - <tr> - <td>shade</td> - <td>9000-10000</td> - </tr> - </table> - </li> - <li><i>Exposure</i> is the amount of light allowed to fall on the photographic medium. Auto-exposure mode is a camera setting - where the exposure levels are automatically adjusted by the implementation based on the subject of the photo.</li> - <li><i>Exposure Compensation</i> is a numeric camera setting that adjusts the exposure level from the current value used by the implementation. This value can - be used to bias the exposure level enabled by auto-exposure.</li> - <li>The <i>ISO</i> setting of a camera describes the sensitivity of the camera to light. It is a numeric value, where the higher the value, - the greater the sensitivity. This setting in most implementations relates to shutter speed, and is sometimes known as the ASA setting.</li> - <li><i>Red Eye Reduction</i> is a feature in cameras that is designed to limit or prevent the appearance of - red pupils ("Red Eye") in photography subjects due to prolonged exposure to a camera's flash.</li> - <li><i>Brightness</i> refers to the numeric camera setting that adjusts the perceived amount of light emitted from the photographed object. 
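The Kelvin bands in the table above can be read as a lookup from color temperature to white balance mode; a sketch, where the boundary handling (low-inclusive, high-exclusive) is an assumption, since the draft leaves shared boundaries such as 5000 K ambiguous.

```javascript
// Lookup sketch for the white balance table above; the bands are the
// Kelvin ranges listed in the draft.
const WHITE_BALANCE_RANGES = [
  ['incandescent',     2500, 3500],
  ['fluorescent',      4000, 5000],
  ['warm-fluorescent', 5000, 5500],
  ['daylight',         5500, 6500],
  ['cloudy-daylight',  6500, 8000],
  ['twilight',         8000, 9000],
  ['shade',            9000, 10000],
];

function modeForKelvin(k) {
  for (const [mode, lo, hi] of WHITE_BALANCE_RANGES) {
    if (k >= lo && k < hi) return mode;
  }
  return null; // outside all listed bands (e.g. the 3500-4000 K gap)
}
```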
A higher brightness setting increases the intensity of darker areas in a scene while compressing the intensity of brighter parts of the scene.</li> - <li><i>Contrast</i> is the numeric camera setting that controls the difference in brightness between light and dark areas in a scene. A higher contrast setting reflects an expansion in the difference in brightness.</li> - <li><i>Saturation</i> is a numeric camera setting that controls the intensity of color in a scene (i.e. the amount of gray in the scene). Very low saturation levels will result in photos closer to black-and-white.</li> - <li><i>Sharpness</i> is a numeric camera setting that controls the intensity of edges in a scene. Higher sharpness settings result in higher edge intensity, while lower settings result in less contrast and blurrier edges (i.e. soft focus).</li> - </ol> - <dl title='interface PhotoSettingsOptions' class='idl'> - <dt>attribute MediaSettingsItem autoWhiteBalanceMode</dt> - <dd>This reflects whether automated White Balance Mode selection is on or off, as a boolean (true = on)</dd> - <dt>attribute MediaSettingsRange whiteBalanceMode</dt> - <dd>This reflects the current white balance mode setting. Values are of type <code>WhiteBalanceModeEnum</code>.</dd> - <dt>attribute ExposureMode autoExposureMode</dt> - <dd>This reflects the current auto exposure mode setting. Values are of type <code>ExposureMode</code>.</dd> - <dt>attribute MediaSettingsRange exposureCompensation</dt> - <dd>This reflects the current exposure compensation setting and permitted range. Values are numeric.</dd> - <dt>attribute MediaSettingsRange iso</dt> - <dd>This reflects the current camera ISO setting and permitted range. Values are numeric.</dd> - <dd>This feature reflects the current exposure level for recorded images. 
Values are numeric.</dd> - <dt>attribute MediaSettingsItem redEyeReduction</dt> - <dd>This reflects whether camera red eye reduction is on or off, as a boolean (true = on)</dd> - <dt>attribute MediaSettingsRange brightness</dt> - <dd>This reflects the current brightness setting of the camera and permitted range. Values are numeric.</dd> - <dt>attribute MediaSettingsRange contrast</dt> - <dd>This reflects the current contrast setting of the camera and permitted range. Values are numeric.</dd> - <dt>attribute MediaSettingsRange saturation</dt> - <dd>This reflects the current saturation setting of the camera and permitted range. Values are numeric.</dd> - <dt>attribute MediaSettingsRange sharpness</dt> - <dd>This reflects the current sharpness setting of the camera and permitted range. Values are numeric.</dd> - <dt>attribute MediaSettingsRange imageHeight</dt> - <dd>This reflects the image height range supported by the UA and the current height setting.</dd> - <dt>attribute MediaSettingsRange imageWidth</dt> - <dd>This reflects the image width range supported by the UA and the current width setting.</dd> - </dl> - </section> - - <section> - <h2><code>ExposureMode</code></h2> - <dl title='enum ExposureMode' class='idl'> - <dt>frame-average</dt> - <dd>Average of light information from entire scene</dd> - <dt>center-weighted</dt> - <dd>Sensitivity concentrated towards center of viewfinder</dd> - <dt>spot-metering</dt> - <dd>Spot-centered weighting</dd> - </dl> - </section> - - <section> - <h2><code>PhotoSettings</code></h2> - <p>The <code>PhotoSettings</code> object is optionally passed into the <code>ImageCapture.setOptions()</code> method - in order to modify capture device settings specific to still imagery. 
Each of the attributes in this object - is optional.</p> - <dl title='dictionary PhotoSettings' class='idl'> - <dt>attribute boolean autoWhiteBalanceMode</dt> - <dd>This reflects whether automatic White Balance Mode selection is desired.</dd> - <dt>attribute unsigned long whiteBalanceMode</dt> - <dd>This reflects the desired white balance mode setting.</dd> - <dt>attribute any autoExposureMode</dt> - <dd>This reflects the desired auto exposure mode setting. Acceptable values are of type <code>ExposureMode</code>.</dd> - <dt>attribute unsigned long exposureCompensation</dt> - <dd>This reflects the desired exposure compensation setting.</dd> - <dt>attribute unsigned long iso</dt> - <dd>This reflects the desired camera ISO setting.</dd> - <dd>This feature reflects the current exposure level for recorded images. Values are numeric.</dd> - <dt>attribute boolean redEyeReduction</dt> - <dd>This reflects whether camera red eye reduction is desired.</dd> - <dt>attribute unsigned long brightness</dt> - <dd>This reflects the desired brightness setting of the camera.</dd> - <dt>attribute unsigned long contrast</dt> - <dd>This reflects the desired contrast setting of the camera.</dd> - <dt>attribute unsigned long saturation</dt> - <dd>This reflects the desired saturation setting of the camera.</dd> - <dt>attribute unsigned long sharpness</dt> - <dd>This reflects the desired sharpness setting of the camera.</dd> - <dt>attribute unsigned long imageHeight</dt> - <dd>This reflects the desired image height. The UA <em title="must" class="rfc2119">must</em> select the closest supported height value to this setting if it supports a discrete set of height options. </dd> - <dt>attribute unsigned long imageWidth</dt> - <dd>This reflects the desired image width. 
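The "closest supported value" rule that the imageHeight and imageWidth descriptions impose on UAs with discrete dimension sets can be sketched as follows; the helper name is invented for illustration.

```javascript
// Hypothetical helper illustrating the "closest supported value" rule for
// imageHeight/imageWidth when the UA supports only discrete dimensions.
function closestSupported(supported, requested) {
  return supported.reduce(function (best, v) {
    return Math.abs(v - requested) < Math.abs(best - requested) ? v : best;
  });
}
```

On an exact tie the draft does not say which neighbor wins; this sketch keeps the earlier (smaller) entry.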
The UA <em title="must" class="rfc2119">must</em> select the closest supported width value to this setting if it supports a discrete set of width options.</dd> - </dl> - </section> - - Examples - ------- - ##### Taking a picture if Red Eye Reduction is activated - <pre class='example'> - navigator.getUserMedia({video: true}, gotMedia, failedToGetMedia); - - function gotMedia(mediastream) { - // Extract the video track. 'kind' attribute not checked because the stream was created with the video option only. - var videoDevice = mediastream.getVideoTracks()[0]; - // Check if this device supports a picture mode... - var pictureDevice = new ImageCapture(videoDevice); - if (pictureDevice) { - pictureDevice.onphoto = showPicture; - if (pictureDevice.photoSettingsOptions.redEyeReduction) { - pictureDevice.setOptions({redEyeReduction: true}); - } - else - console.log('No red eye reduction'); - pictureDevice.onphotosettingschange = function(){ - if (pictureDevice.photoSettingsOptions.redEyeReduction.value) - pictureDevice.takePhoto(); - } - } - } - - function showPicture(e) { - var img = document.querySelector("img"); - img.src = URL.createObjectURL(e.data); - } - - function failedToGetMedia() { - console.log('Stream failure'); - } - </pre> - - ##### Grabbing a Frame for Post-Processing - <pre class='example'> - navigator.getUserMedia({video: true}, gotMedia, failedToGetMedia); - - function gotMedia(mediastream) { - // Extract the video track. 'kind' attribute not checked because the stream was created with the video option only. - var videoDevice = mediastream.getVideoTracks()[0]; - // Check if this device supports a picture mode... 
- var pictureDevice = new ImageCapture(videoDevice); - if (pictureDevice) { - pictureDevice.onframegrab = processFrame; - pictureDevice.getFrame(); - } - } - - function processFrame(e) { - var imgData = e.imageData; - var width = imgData.width; - var height = imgData.height; - for (var j = 3; j < imgData.data.length; j += 4) - { - // Set all alpha values to medium opacity - imgData.data[j] = 128; - } - // Create a new ImageData object with the modified pixel values - var canvas = document.createElement('canvas'); - var ctx = canvas.getContext("2d"); - var newImg = ctx.createImageData(width, height); - for (var j = 0; j < imgData.data.length; j++) - { - newImg.data[j] = imgData.data[j]; - } - // ... and do something with the modified image ... - } - - function failedToGetMedia() { - console.log('Stream failure'); - } - </pre> + <p>This document has <a href="http://w3c.github.io/mediacapture-image/">moved to github</a>.</p> </body> </html> diff -r 89dc4a551fce -r 3fcd6e6c0a1f media-stream-capture/MediaRecorder.html --- a/media-stream-capture/MediaRecorder.html Wed Aug 20 11:56:18 2014 +0300 +++ b/media-stream-capture/MediaRecorder.html Thu Aug 28 14:13:32 2014 +0200 @@ -1,522 +1,12 @@ - <!DOCTYPE html> -<html> - - <head> +<!DOCTYPE html> +<html xmlns='http://www.w3.org/1999/xhtml' lang='en'> + <head> + <meta charset='utf-8'/> + <meta http-equiv="refresh" content="0; url=http://w3c.github.io/mediacapture-record/MediaRecorder.html"> + <link rel="canonical" href="http://w3c.github.io/mediacapture-record/MediaRecorder.html"> <title>MediaStream Recording</title> - <meta http-equiv='Content-Type' content='text/html; charset=utf-8'/> - <script type="text/javascript" src='https://www.w3.org/Tools/respec/respec-w3c-common' class='remove'></script> - <script type="text/javascript" class='remove'> - var respecConfig = { - specStatus: "ED", - editors: [{ - name: "Jim Barnett", - company: "Genesys", - companyURL: "http://www.genesyslab.com"}, - { name: "Travis Leithead", - company: "Microsoft Corp.", - url: 
"mailto:travis.leithead@microsoft.com?subject=MediaStream Capture Scenarios Feedback", - companyURL: "http://www.microsoft.com"}], - publishDate: "2013-11-08", - edDraftURI: "http://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/RecordingProposal.html", - previousPublishDate: null, - prevED: "", - noIDLIn: true, - inlineCSS: true, - noRecTrack: true, - shortName: "recording", - extraCSS: ["http://dev.w3.org/2009/dap/ReSpec.js/css/respec.css"], - wg: ["Device APIs Working Group", "Web Real-Time Communications Working Group"], - wgURI: ["http://www.w3.org/2009/dap/", "http://www.w3.org/2011/04/webrtc/"], - wgPublicList: "public-media-capture", - wgPatentURI: ["http://www.w3.org/2004/01/pp-impl/43696/status", "http://www.w3.org/2004/01/pp-impl/47318/status"] - }; - </script> - <style type="text/css"> - /* ReSpec.js CSS optimizations (Richard Tibbett) - cut-n-paste :) */ - div.example { - border-top: 1px solid #ff4500; - border-bottom: 1px solid #ff4500; - background: #fff; - padding: 1em; - font-size: 0.9em; - margin-top: 1em; - } - div.example::before { - content: "Example"; - display: block; - width: 150px; - background: #ff4500; - color: #fff; - font-family: initial; - padding: 3px; - padding-left: 5px; - font-weight: bold; - margin: -1em 0 1em -1em; - } - - /* Clean up pre.idl */ - pre.idl::before { - font-size:0.9em; - } - - /* Add better spacing to sections */ - section, .section { - margin-bottom: 2em; - } - - /* Reduce note & issue render size */ - .note, .issue { - font-size:0.8em; - } - - /* Add addition spacing to <ol> and <ul> for rule definition */ - ol.rule li, ul.rule li { - padding:0.2em; - } - </style> </head> - - <body> - <section id="abstract"> - <p>This document defines a recording API for use with MediaStreams as defined in - [[!GETUSERMEDIA]]</p></section> - - <section id="sotd"> - - - - <p> - <em>This section describes the status of this document at the time of its publication. Other - documents may supersede this document. 
A list of current <acronym title="World Wide Web Consortium">W3C</acronym> publications and the latest revision - of this technical report can be found in the <a href="http://www.w3.org/TR/"><acronym title="World Wide Web Consortium">W3C</acronym> technical reports - index</a> at http://www.w3.org/TR/.</em> - </p> - - <p>This document is not complete. It is subject to major changes and, while - early experimentation is encouraged, it is not intended for - implementation. - The Media Capture Task Force expects this specification to evolve - significantly based on:</p> - - <ul> - <li>Privacy issues that arise when capturing media.</li> - - <li>Technical discussions within the task force.</li> - - <li>Experience gained through early experimentation.</li> - - <li>Feedback received from other groups and individuals.</li> - </ul> - - <p> - This document was published by the <a href="http://www.w3.org/2011/04/webrtc/">Web Real-Time Communication Working Group</a> as an Editor's Draft. - - If you wish to make comments regarding this document, please send them to - <a href="mailto:public-media-capture@w3.org">public-media-capture@w3.org</a> - (<a href="mailto:public-media-capture-request@w3.org?subject=subscribe">subscribe</a>, - <a href="http://lists.w3.org/Archives/Public/public-media-capture/">archives</a>). - - - All feedback is welcome. - </p> - - <p> - Publication as an Editor's Draft does not imply endorsement by the <acronym title="World Wide Web Consortium">W3C</acronym> Membership. - This is a draft document and may be updated, replaced or obsoleted by other documents at - any time. It is inappropriate to cite this document as other than work in progress. - </p> - - - <p> - - This document was produced by a group operating under the - <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/">5 February 2004 <acronym title="World Wide Web Consortium">W3C</acronym> Patent Policy</a>. 
- - - - - <acronym title="World Wide Web Consortium">W3C</acronym> maintains a public list of any patent disclosures (<a href="http://www.w3.org/2004/01/pp-impl/47318/status" rel="disclosure">Web Real-Time Communication Working Group</a>, <a href="http://www.w3.org/2004/01/pp-impl/43696/status" rel="disclosure">Device APIs Working Group</a>) - - made in connection with the deliverables of the group; that page also includes instructions for - disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains - <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#def-essential">Essential Claim(s)</a> must disclose the - information in accordance with <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure">section - 6 of the <acronym title="World Wide Web Consortium">W3C</acronym> Patent Policy</a>. - - - </p> - - - - -</section> - - -<section id="overview"><h2>Overview</h2> - <p>This API attempts to make basic recording very simple, while still allowing for more complex use cases. In the simplest case, - the application instantiates the MediaRecorder object, calls start() and then calls stop() or waits for the MediaStream to be ended. The contents of the recording - will be made available in the platform's default encoding via the dataavailable event. Functions are available to query - the platform's available set of encodings, and to select the desired ones if the author wishes. The application can also choose - how much data it wants to receive at one time. By default, a Blob containing the entire recording is returned when - the recording finishes. However, the application can choose to receive smaller buffers of data at regular intervals. </p> - </section> - - - - - <section id="MediaRecorderAPI"><h2>Media Recorder API</h2> - - <div class="idl" title="MediaRecorder implements Constrainable"></div> - <p>For the definition of the Constrainable API, see [[!GETUSERMEDIA]]. 
For the - definition of MediaRecorder's constrainable properties see <a href="#properties">MediaRecorder - Constrainable Properties</a>.</p> - -<dl title='[Constructor (MediaStream stream)] interface MediaRecorder : EventTarget' class="idl"> - <dt>readonly attribute MediaStream stream</dt><dd>The MediaStream passed in to the constructor.</dd> - <dt>readonly attribute RecordingStateEnum state</dt><dd>The current state of the MediaRecorder object. </dd> - <dt>attribute EventHandler onstart</dt><dd>Called to handle the start event.</dd> - <dt>attribute EventHandler onstop</dt><dd>Called to handle the stop event.</dd> - <dt>attribute EventHandler ondataavailable</dt><dd>Called to handle the dataavailable event. Note that the Blob (see [[!FILE-API]]) of recorded data is contained in this event and can - be accessed via the 'data' attribute.</dd> - <dt>attribute EventHandler onpause</dt><dd>Called to handle the pause event. </dd> - <dt>attribute EventHandler onresume</dt><dd>Called to handle the resume event. </dd> - <dt>attribute EventHandler onerror</dt><dd>Called to handle the DOMError event. </dd> - <dt>attribute EventHandler onwarning</dt><dd>Called to handle the recordingwarning event. </dd> - <dt>void start()</dt> -<dd>When a <code>MediaRecorder</code> object’s <code>start()</code> method is invoked, - the UA <em title="must" class="rfc2119">must</em> queue a task, using the DOM manipulation task source, - that runs the following steps: - <ol> - <li>If the <code>state</code> is not "inactive", raise a DOM <code>InvalidStateError</code> and terminate these steps. Otherwise:</li> - <li>Set <code>state</code> to 'recording' and wait until media - - becomes available from <code>stream</code>.</li> - <li>Once data becomes available, raise a <code>start</code> event and start gathering the - - data into a Blob (see [[!FILE-API]]). 
</li> - <li>If the <code>timeSlice</code> argument has been provided, then once <code>timeSlice</code> - milliseconds of data have been collected, or some minimum time slice imposed - by the UA, whichever is greater, raise a <code>dataavailable</code> event containing - the Blob of collected data, and start gathering a new Blob of data. Otherwise (if <code>timeSlice</code> - has not been provided), continue gathering data into the original Blob.</li> - <li>When the <code>stream</code> is ended set <code>state</code> - to 'inactive' and stop gathering data. Callers SHOULD NOT rely on exactness of the timeSlice value, especially if the timeSlice value is small. Callers SHOULD consider timeSlice as a minimum value.</li> - <li>Then raise a <code>dataavailable</code> event containing the Blob of data.</li> - <li>Finally, raise a <code>stop</code> event.</li> - </ol> - - <p>Note that <code>stop()</code>, - <code>requestData()</code>, and <code>pause()</code> also affect the recording behavior.</p> - - - <p>The UA <em title="must" class="rfc2119">must</em> record the MediaStream - in such a way that the original Tracks can be retrieved at playback time. When multiple Blobs - are returned (because of <code>timeSlice</code> or <code>requestData</code>), the individual - Blobs need not be playable, but the combination of all the Blobs from a completed recording <em title="must" class="rfc2119">must</em> - be playable. If any Track within the - MediaStream is muted at any time (i.e., if its <code>readyState</code> is set to <code>muted</code>), the UA - <em title="must" class="rfc2119">must</em> insert black frames or silence until the Track is unmuted. If the UA is - unable to start recording or at any point is unable to continue recording, it <em title="must" class="rfc2119">must</em> raise - a <code>DOMError</code> event, followed by a <code>dataavailable</code> event containing - the Blob it has gathered, followed by the <code>stop</code> event. 
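The queued-task algorithms above amount to a small state machine. The following toy model is illustrative only: the class name `RecorderModel` and its method names are invented, it is not the normative interface, and for brevity it throws where the specification raises error events. It mirrors the state transitions, the gathering of data into the current Blob, and the dataavailable/stop ordering when the stream ends.

```javascript
// Toy model (not the browser's MediaRecorder) of the recording algorithms:
// state transitions, chunk gathering, and event ordering.
class RecorderModel {
  constructor() {
    this.state = 'inactive';  // RecordingStateEnum: inactive | recording | paused
    this.chunks = [];         // stands in for the Blob currently being gathered
    this.events = [];         // observable event log, to inspect ordering
  }
  start() {
    if (this.state !== 'inactive') throw new Error('InvalidState'); // spec raises an error event
    this.state = 'recording';
    this.events.push(['start']);
  }
  push(sample) {              // media becoming available from the stream
    if (this.state === 'recording') this.chunks.push(sample);
  }
  requestData() {
    if (this.state !== 'recording') throw new Error('InvalidState');
    this.events.push(['dataavailable', this.chunks]); // may be empty
    this.chunks = [];         // start gathering a new "Blob"
  }
  streamEnded() {             // stream ended: final dataavailable, then stop
    this.state = 'inactive';
    this.events.push(['dataavailable', this.chunks]);
    this.chunks = [];
    this.events.push(['stop']);
  }
}
```

Running `start()`, pushing samples, calling `requestData()`, and ending the stream yields the event sequence start, dataavailable (first chunk), dataavailable (remainder), stop, matching the ordering the prose requires.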
</p> - <dl class='parameters'> - <dt>optional long timeSlice</dt> - <dd> - The number of milliseconds of data to return in a single Blob. - </dd> - </dl> - </dd> - - <dt>void stop()</dt> - - <dd>When a <code>MediaRecorder</code> object’s <code>stop()</code> method is invoked, - the UA <em title="must" class="rfc2119">must</em> - queue a task, using the DOM manipulation task source, that runs the following steps: - <ol> - <li>If <code>state</code> is "inactive", raise a DOM <code>InvalidStateError</code> event and terminate these steps. - Otherwise:</li> - <li>Set <code>state</code> to 'inactive' and stop gathering data. </li> - <li>Raise a <code>dataavailable</code> event containing the Blob of data that has been gathered.</li> - <li>Raise a <code>stop</code> event.</li> - </ol> - </dd> - <dt>void pause()</dt> - <dd>When a <code>MediaRecorder</code> object’s <code>pause()</code> method is invoked, - the UA <em title="must" class="rfc2119">must</em> - queue a task, using the DOM manipulation task source, that runs the following steps: - <ol> - <li>If <code>state</code> is "inactive" - raise a DOM - <code>InvalidState</code> error and terminate these steps. Otherwise: </li> - <li>Set <code>state</code> to "paused".</li> - <li>Stop gathering data into its current Blob (but keep the Blob available so that - recording can be resumed in the future).</li> - <li>Raise a <code>pause</code> event.</li> - </ol> - </dd> - <dt>void resume()</dt> - <dd>When a <code>MediaRecorder</code> object’s <code>resume()</code> method is invoked, - the UA <em title="must" class="rfc2119">must</em> - queue a task, using the DOM manipulation task source, that runs the following steps: - <ol> - <li>If <code>state</code> is "inactive" - raise a DOM <code>InvalidState</code> - error and terminate these steps. 
Otherwise: </li> - <li>Set <code>state</code> to "recording".</li> - <li>Resume (or continue) gathering data into its current Blob.</li> - <li>Raise a <code>resume</code> event.</li> - </ol></dd> - - <dt>void requestData()</dt> - <dd> - When a <code>MediaRecorder</code> object’s <code>requestData()</code> method is invoked, - the UA <em title="must" class="rfc2119">must</em> - queue a task, using the DOM manipulation task source, that runs the following steps: - <ol> - <li>If <code>state</code> is not "recording" - raise a DOM - <code>InvalidState</code> error and terminate these steps. - Otherwise:</li> - <li>Raise a <code>dataavailable</code> event containing the current Blob of saved data. (Note that this Blob - will be empty if no data has been gathered yet.)</li> - <li>Create a new Blob and gather subsequent data into it.</li> - </ol> - </dd> - -</dl> - - - - -<p>The <code>MediaRecorder()</code> - constructor takes one argument which - <em title="must" class="rfc2119">must</em> be of type <code>MediaStream</code> (see [[!GETUSERMEDIA]]). When the constructor - is invoked, the UA <em title="must" class="rfc2119">must</em> construct a new <a class="idlType" href="#idl-def-MediaRecorder"><code>MediaRecorder</code></a> object, - set its <a href="#widl-mediastream">stream</a> attribute to be the provided MediaStream, set its <a href="#widl-state">state</a> attribute to 'inactive', - and return the object.</p> - - - - -<section id="RecordingStateEnum"> - <h3>RecordingStateEnum</h3> - <dl title="enum RecordingStateEnum" class="idl"> - <dt>inactive</dt> - <dd>Recording is not occurring. 
(Either it has not been started or it has been stopped.)</dd> - <dt>recording</dt> - <dd>Recording has been started and the UA is capturing data.</dd> - <dt>paused</dt> - <dd>Recording has been started, then paused, and not yet stopped or resumed.</dd> </dl> - - </section> -</section> - -<section id="blob-event"> - <h2>Blob Event</h2> - - <dl title='[Constructor] interface BlobEvent : Event' class='idl'> - <dt>readonly attribute Blob data</dt> - <dd> - Returns a Blob object whose type attribute indicates the encoding of the blob data. - </dd> -</dl> - - - <h3>BlobEventInit</h3> - <dl title='dictionary BlobEventInit' class='idl'> - <dt>Blob data</dt> - <dd> - A Blob object containing the data to deliver via this event. - </dd> - </dl> -</section> - -<section id="properties"> - <h3>MediaRecorder Constrainable Properties</h3> - <p>IANA is requested to register the following properties as specified - in [[!RTCWEB-CONSTRAINTS]]:</p> - <table class="simple"> - <thead> - <tr> - <th>Property Name</th> - <th>Values</th> - <th>Notes</th> - </tr> - </thead> - <tbody> - <tr id="def-property-MimeType"> - <td>MimeType</td> - <td><a><code>list of DOMString</code></a></td> - <td> The MIME types that can be selected as encodings for recording. The UA should be able to play back any of the MIME types it supports for recording. 
For example, it should be able to display a video recording in the HTML <video> element.</td> - </tr> - <tr id="def-property-imageHeight"> - <td>imageHeight</td> - <td><code>PropertyValueRange</code></td> - <td>The maximum and minimum height, in pixels, for the capture of video images</td> - </tr> - - <tr id="def-property-imageWidth"> - <td>imageWidth</td> - <td><code>PropertyValueRange</code></td> - <td>The maximum and minimum width, in pixels, for the capture of video images</td> - </tr> - - - - </tbody> - </table> - </section> - - - <section id="error-handling"> - <h2>Error Handling</h2> - - <section id="general-principles"> - <h3>General Principles</h3> - - <p>Errors are indicated in two ways: exceptions and objects passed to - error callbacks. In the former case, a <a href="http://www.w3.org/TR/2012/WD-dom-20121206/#exception-domexception">DOMException</a> is raised (see [[!DOM4]]). - An exception <em title="must" class="rfc2119">must</em> be thrown when the - error can be detected at the time that the call is made. In all other cases, a <a href="http://www.w3.org/TR/2012/WD-dom-20121206/#interface-domerror">DOMError</a> object (see [[!DOM4]]) <em title="must" class="rfc2119">must</em> be provided to the failure - callback. The error name in the object provided <em title="must" class="rfc2119">must</em> be picked from - the <code>RecordingErrorNameEnum</code> - values. If recording has been started and not yet stopped when the error occurs, then after raising the error, the UA <em title="must" class="rfc2119">must</em> - raise a dataavailable event, containing any data that it has gathered, - and then a stop event. The UA <em title="may" class="rfc2119">may</em> set platform-specific - limits, such as those for the minimum and maximum Blob size that it will support, or the number of - Tracks it will record at once. It <em title="must" class="rfc2119">must</em> signal a fatal - error if these limits are exceeded. 
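The two error paths just described (synchronous exceptions for conditions detectable at call time, and an error followed by dataavailable and stop for failures mid-recording) can be sketched as follows. The class name `RecorderErrorModel` and its methods are invented for illustration; this is a toy model of the ordering rules, not the normative interface.

```javascript
// Toy model of the error-handling ordering: a condition detectable at
// call time throws synchronously, while a fatal failure during recording
// is reported as an error, then dataavailable (with any gathered data),
// then stop. Invented names; illustrative only.
class RecorderErrorModel {
  constructor() { this.state = 'inactive'; this.gathered = []; this.events = []; }
  start() {
    if (this.state !== 'inactive') throw new Error('InvalidState'); // detectable at call time
    this.state = 'recording';
  }
  record(sample) { if (this.state === 'recording') this.gathered.push(sample); }
  fail(errorName) {            // e.g. 'OutOfMemory' while recording
    this.state = 'inactive';
    this.events.push(['error', errorName]);             // the error itself
    this.events.push(['dataavailable', this.gathered]); // data gathered so far
    this.events.push(['stop']);                         // recording has ended
  }
}
```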
If a non-fatal error occurs during recording, the UA - <em title="should" class="rfc2119">should</em> raise a recordingwarning event, with data indicating - the nature of the problem, and continue recording.</p> - </section> - - - - -<section> - <h4>RecordingErrorNameEnum</h4> - <dl title="enum RecordingErrorNameEnum" class="idl"> - <dt>InvalidState</dt> - <dd>The <code>MediaRecorder</code> is not in a state - in which the proposed operation is allowed - to be executed.</dd> - <dt>OutOfMemory</dt> - <dd>The UA has exhausted the available memory. User agents SHOULD provide as much additional information as possible in the <code>message</code> attribute.</dd> - <dt>IllegalStreamModification</dt> - <dd>A modification to the <code>stream</code> has occurred that makes it impossible to continue recording. An example would be the addition of a Track while recording is occurring. - User agents SHOULD provide as much additional information as possible in the <code>message</code> attribute.</dd> - <dt>OtherRecordingError</dt> - <dd>Used for a fatal error other than those listed above. 
User agents SHOULD provide as much additional information as possible in the <code>message</code> attribute.</dd> - </dl> - - </section> -</section> -<section id="recordingexceptionenum"> - <h2>RecordingExceptionEnum</h2> - <dl title="enum RecordingExceptionEnum" class="idl"> - - <dt>InvalidMediaStreamTrackId</dt> - <dd> The argument provided is not the ID of any <code>MediaStreamTrack</code> belonging to the <code>MediaRecorder</code>'s <code>stream</code>.</dd> - <dt>UnsupportedOption</dt> - <dd>The UA cannot provide the codec or recording option that has - been requested.</dd> - </dl> - </section> - -</section> - <section id="event-summary" class="informative"> - <h2>Event summary</h2> - - - <p>The following additional events fire on <a class="idlType" href="#idl-def-MediaRecorder"><code>MediaRecorder</code></a> - objects:</p> - - <table style="border-width:0; width:60%" border="1"> - <tbody><tr> - <th>Event name</th> - - <th>Interface</th> - - <th>Fired when...</th> - </tr> - - </tbody><tbody> - <tr> - <td><dfn id="event-mediarecorder-start"><code>start</code></dfn></td> - - <td><code>Event</code></td> - - <td>The UA has started recording data on the MediaStream.</td> - </tr> - - <tr> - <td><dfn id="event-mediarecorder-stop"><code>stop</code></dfn></td> - - <td><code>Event</code></td> - - <td>The UA has stopped recording data on the MediaStream.</td> - </tr> - - <tr> - <td><dfn id="event-mediarecorder-dataavailable"><code>dataavailable</code></dfn></td> - - <td><a href="#idl-def-BlobEvent"><code>BlobEvent</code></a></td> - <td>The UA generates this event to return data to the application. 
The 'data' attribute of this - event contains a Blob of recorded data.</td> - </tr> - - - <tr> - <td><dfn id="event-mediarecorder-pause"><code>pause</code></dfn></td> - - <td><code>Event</code></td> - - <td>The UA has paused recording data on the MediaStream.</td> - </tr> - - <tr> - <td><dfn id="event-mediarecorder-resume"><code>resume</code></dfn></td> - - <td><code>Event</code></td> - - <td>The UA has resumed recording data on the MediaStream.</td> - </tr> - - <tr> - <td><dfn id="event-mediarecorder-DOMError"><code>DOMError</code></dfn></td> - - <td><a href="http://www.w3.org/TR/2012/WD-dom-20121206/#interface-domerror"><code>DOMError</code></a></td> - - <td>A fatal error has occurred and the UA has stopped recording. More detailed error information - is available in the 'message' attribute. </td> - </tr> - - <tr> - <td><dfn id="event-mediarecorder-warning"><code>warning</code></dfn></td> - - <td><code>CustomEvent</code></td> - - <td>A problem has occurred, but the UA has not stopped recording. More detailed information - is available in the 'message' attribute. [Not clear if we need this event.]</td> - </tr> - - </tbody> - </table> - - </section> - - - <section class="appendix" id="openissues"><!--OddPage--><h2>Open Issues</h2> - <ol> - <li>Do we need an MTI - format?</li> - <li>Do we need a "setSyncPoint()" operator and a "syncpoint" signal, - so that the client can tell the recorder to insert a point at - which a recording can be broken up (typically a new I-frame)? </li> - <li>Do we need warning events? 
</li> - <li>Do we need to ask the user's permission before we record?</li> - </ol> -</section> - - - </body></html> + <p>This document has <a href="http://w3c.github.io/mediacapture-record/MediaRecorder.html">moved to github</a>.</p> + </body> +</html> diff -r 89dc4a551fce -r 3fcd6e6c0a1f media-stream-capture/scenarios.html --- a/media-stream-capture/scenarios.html Wed Aug 20 11:56:18 2014 +0300 +++ b/media-stream-capture/scenarios.html Thu Aug 28 14:13:32 2014 +0200 @@ -1,1134 +1,12 @@ <!DOCTYPE html> -<html> +<html xmlns='http://www.w3.org/1999/xhtml' lang='en'> <head> + <meta charset='utf-8'/> + <meta http-equiv="refresh" content="0; url=http://w3c.github.io/mediacapture-scenarios/scenarios.html"> + <link rel="canonical" href="http://w3c.github.io/mediacapture-scenarios/scenarios.html"> <title>MediaStream Capture Scenarios</title> - <meta http-equiv='Content-Type' content='text/html; charset=utf-8'/> - <script type="text/javascript" src='https://www.w3.org/Tools/respec/respec-w3c-common' class='remove' async></script> - <script type="text/javascript" class='remove'> - var respecConfig = { - specStatus: "ED", - editors: [{ - name: "Travis Leithead", - company: "Microsoft Corp.", - url: "mailto:travis.leithead@microsoft.com?subject=MediaStream Capture Scenarios Feedback", - companyURL: "http://www.microsoft.com"}], - publishDate: "2013-01-04", - edDraftURI: "http://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/scenarios.html", - previousPublishDate: null, - prevED: "http://dvcs.w3.org/hg/dap/raw-file/a8e255b904df/media-stream-capture/scenarios.html", - noIDLIn: true, - inlineCSS: true, - noRecTrack: true, - shortName: "capture-scenarios", - extraCSS: ["http://dev.w3.org/2009/dap/ReSpec.js/css/respec.css"], - wg: ["Device APIs Working Group", "Web Real-Time Communications Working Group"], - wgURI: ["http://www.w3.org/2009/dap/", "http://www.w3.org/2011/04/webrtc/"], - wgPublicList: "public-media-capture", - wgPatentURI: 
["http://www.w3.org/2004/01/pp-impl/43696/status", "http://www.w3.org/2004/01/pp-impl/47318/status"] - }; - </script> - <style type="text/css"> - /* ReSpec.js CSS optimizations (Richard Tibbett) - cut-n-paste :) */ - div.example { - border-top: 1px solid #ff4500; - border-bottom: 1px solid #ff4500; - background: #fff; - padding: 1em; - font-size: 0.9em; - margin-top: 1em; - } - div.example::before { - content: "Example"; - display: block; - width: 150px; - background: #ff4500; - color: #fff; - font-family: initial; - padding: 3px; - padding-left: 5px; - font-weight: bold; - margin: -1em 0 1em -1em; - } - - /* Clean up pre.idl */ - pre.idl::before { - font-size:0.9em; - } - - /* Add better spacing to sections */ - section, .section { - margin-bottom: 2em; - } - - /* Reduce note & issue render size */ - .note, .issue { - font-size:0.8em; - } - - /* Add addition spacing to <ol> and <ul> for rule definition */ - ol.rule li, ul.rule li { - padding:0.2em; - } - </style> </head> - <body> - <section id='abstract'> - <p> - This document collates the target scenarios for the Media Capture task force. Scenarios represent - the set of expected functionality that may be achieved by the use of the MediaStream Capture API. A set of - un-supported scenarios may also be documented here. - </p> - <p>This document builds on the assumption that the mechanism for obtaining fundamental access to local media - capture device(s) is <code>navigator.getUserMedia</code> (name/behavior subject to this task force), and that - the vehicle for delivery of the content from the local media capture device(s) is a <code>MediaStream</code>. - Hence the title of this note. - </p> - </section> - - <section id="sotd"> - <p> - This document is intended to represent the consensus of the media capture task force on the set of scenarios - supported by the MediaStream Capture API. It will eventually be released as a Note. 
</p> - </section> - - - <section class="informative"> - <h2>Introduction</h2> - <p> - One of the goals of the joint task force between the Device and Policy working group and the Web Real Time - Communications working groups is to bring media capture scenarios from both groups together into one unified - API that can address all relevant use cases. - </p> - <p> - The capture scenarios from WebRTC are primarily driven by real-time-communication scenarios, such as - capturing live chats, teleconferences, and other media streamed over the network from potentially - multiple sources. - </p> - <p> - The capture scenarios from DAP represent "local" capture scenarios that provide access to a user agent's - camera and other related experiences. - </p> - <p> - Both groups include overlapping chartered deliverables in this space. Namely in DAP, - <a href="http://www.w3.org/2011/07/DeviceAPICharter">the charter specifies a recommendation-track deliverable</a>: - <ul> - <li> - <dl> - <dt>Capture API</dt> - <dd>an API to manage a device's camera and microphone, e.g. to take a picture or record a sound.</dd> - </dl> - </li> - </ul> - </p> - <p> - And <a href="http://www.w3.org/2011/04/webrtc-charter.html">WebRTC's charter scope</a> describes enabling - real-time communications between web browsers that will require specific client-side technologies: - <ul> - <li>API functions to explore device capabilities, e.g. 
camera, microphone, speakers (currently in scope - for the <a href="http://www.w3.org/2009/dap/">Device APIs & Policy Working Group</a>)</li> - <li>API functions to capture media from local devices (camera and microphone) (currently in scope for the - <a href="http://www.w3.org/2009/dap/">Device APIs & Policy Working Group</a>)</li> - <li>API functions for encoding and other processing of those media streams,</li> - <li>API functions for decoding and processing (including echo cancelling, stream synchronization and a - number of other functions) of those streams at the incoming end,</li> - <li>Delivery to the user of those media streams via local screens and audio output devices (partially - covered with HTML5)</li> - </ul> - </p> - <p> - Note that the scenarios described in this document specifically exclude declarative capture scenarios, such as those where media capture can be - obtained and submitted to a server entirely without the use of script. Such scenarios generally involve the use - of a UA-specific app or mode for interacting with the capture device, altering settings and completing the - capture. Such scenarios are currently captured by the DAP working group's <a href="http://dev.w3.org/2009/dap/camera/">HTML Media Capture</a> - specification. - </p> - <p> - The scenarios contained in this document are specific to scenarios in which web applications require direct access - to the capture device, its settings, and the capture mechanism and output. Such scenarios are - crucial to building applications that can create a site-specific look-and-feel to the user's interaction with the - capture device, as well as utilize advanced functionality that may not be available in a declarative model. 
</p> - <p> - Some of the scenarios described in this document may overlap existing - <a href="http://tools.ietf.org/html/draft-ietf-rtcweb-use-cases-and-requirements-06">usage scenarios</a> - defined by the <a href="http://datatracker.ietf.org/wg/rtcweb/">IETF RTCWEB Working Group</a>. This document - is specifically focused on the capture aspects of media streams, while the linked document is geared toward - networking and peer-to-peer RTC scenarios. - </p> - </section> - <section> - <h2>Concepts and Definitions</h2> - <p> - This section describes some terminology and concepts that frame an understanding of the design considerations - that follow. It is helpful to have a common understanding of some core concepts to ensure that the prose is - interpreted uniformly. - </p> - <dl> - <dt> Media Stream and Stream</dt> - <dd>This document uses these terms interchangeably. They are intended to be - a generalization of the more specific <code>MediaStream</code> interface as currently defined in the - WebRTC spec. Generally, a stream can be understood as a tube or conduit between sources (the stream's - generators) and destinations (the sinks). Streams don't generally include any type of significant buffer; that is, - content pushed into the stream from a source does not collect into any buffer for later collection. Rather, content - is simply dropped on the floor if the stream is not connected to a sink. The content flowing through a media - stream is not in any particular underlying format. (Note that the - WebRTC specification makes a similar assertion about the content flowing - through a <code>MediaStream</code>.) - </dd> - <dt>Media Capture versus Recording</dt> - <dd>This document uses 'media capture' to refer to the process of obtaining a stream of data - from a device. As noted above, that data is not assumed to be in any particular format. 
- 'Recording', on the other hand, refers to the capture of media under application control - and in a specific, known, format. Such data can be written to a local file or sent to a - remote destination.</dd> - <dt>Shared devices, devices with manipulatable state, and virtualization</dt> - <dd> - <p>A shared device (in this document) is a media device (camera or microphone) that is usable by more than - one application at a time. When considering sharing a device (or not), an operating system must evaluate - whether applications consuming the device will have the ability to manipulate the state of the device. A shared device - with manipulatable state has the side-effect of allowing one application to make changes to a device that will then - affect other applications who are also sharing. - </p> - <p>To avoid these effects and unexpected state changes in applications, operating systems may virtualize a - device. Device virtualization (in a simplistic view) is an abstraction of the actual device, so that the abstraction - is provided to the application rather than providing the actual device. When an application manipulates the state - of the virtualized device, changes occur only in the virtualized layer, and do not affect other applications that - may be sharing the device. - </p> - <p>Audio devices are commonly virtualized. This allows many applications to share the audio device and manipulate its - state (e.g., apply different input volume levels) without affecting other applications. - </p> - <p>Video virtualization is more challenging and not as common. For example, the Microsoft Windows operating system does - not virtualize webcam devices, and thus chooses not to share the webcam between applications. As a result, in order - for an application to use the webcam either 1) another application already using the webcam must yield it up or 2) - the requesting application may be allowed to "steal" the device. 
- </p> - </dd> - </dl> - </section> - <section> - <h2>Scenarios</h2> - <p> - In this section, scenarios are presented first as a story that puts the scenario into perspective, and then - as a list of specific capture scenarios included in the story. - </p> - - <section> - <h3>"Check out this new hat!" (photo upload with audio caption)</h3> - <p> - Amy logs in to her favorite social networking page. She wants to tell her friends about a new hat she recently - bought for an upcoming school play. She clicks a "select photo" drop-down widget on the site, and chooses the - "from webcam" option. A blank video box appears on the site followed by a notice from the browser to "allow the - use of the webcam". She approves it (just like she did the last time she used her camera on this site), and immediately sees her own image as viewed by her webcam. She then hears - an audio countdown starting from "3", giving her time to adjust herself in the video frame so that her hat is - clearly visible. After the countdown reaches "0", the captured image is displayed along with some controls with - which to resize/crop the image. She crops the image so that it just showcases her hat. She then clicks a button - allowing her to record an "audio caption". A small box with an audio meter appears, immediately followed by - another prompt from her browser to "allow the use of the microphone". After approving it, she sees an indicator - showing that the microphone is listening, and then begins describing the features of her new hat. While she - speaks she sees that the microphone is picking up her voice because the audio meter is reacting to her voice. - She stops talking and after a moment the web page asks her to confirm that she's done with her caption. She - confirms that she is finished, and then clicks on "check in" which uploads her new picture and audio caption to - the social networking site's server. 
- </p> - <p>Requirements: <a href="#p1">P1</a>, <a href="#p2">P2</a>, <a href="#lm1">LM1</a>, - <a href="#lm2">LM2</a>, <a href="#lm3">LM3</a>, <a href="#LM12">LM12</a>, <a href="#mc1">MC1</a>, - <a href="#mc2">MC2</a>.</p> - - - <section> - <h4>Variations</h4> - <p>TBD</p> - </section> - </section> - - <section> - <h3>Election podcast and commentary (video capture and chat)</h3> - <p> - Every Wednesday at 6:45pm, Adam logs into his video podcast web site for his scheduled 7pm half-hour broadcast - "commentary on the US election campaign". These podcasts are available to all his subscribers the next day, but - a few of his friends tune in at 7 to listen to the podcast live. Adam selects the "prepare podcast" option, - is notified by the browser that he previously approved access to his webcam and microphone, and situates himself in front of the - webcam, using the "self-view" video window on the site. While waiting for 7pm to arrive, the video podcast site - indicates that two of his close friends are now online. He approves their request to listen live to the podcast. - Finally, at 7pm he selects "start podcast" and launches into his commentary. While capturing locally, Adam switches - between several tabs in his browser to quote from web sites representing differing political views. A half-hour later, he wraps up his - concluding remarks, and opens the discussion up for comments. One of his friends has a comment, but has - requested anonymity, since the comments on the show are also recorded. Adam enables the audio-only setting for - that friend and directs him to share his comment. In response to the first comment, another of Adam's friends - wants to respond. This friend has not requested anonymity, and so Adam enables the audio/video mode for that - friend, and hears the rebuttal. After a few back-and-forths, Adam sees that his half-hour is up, thanks his - audience, and clicks "end podcast". A few moments later the site reports that the podcast has been uploaded. 
- </p> - <p>Requirements: <a href="#p3">P3</a>, <a href="#p4">P4</a>, <a href="#lm1">LM1</a>, - <a href="#lm5">LM5</a>, - <a href="#rm2">RM2</a>, <a href="#rm3">RM3</a>, <a href="#rm4">RM4</a>, <a href="#mc3">MC3</a>, - <a href="#mc6">MC6</a>. - </p> - - <section> - <h4>Variations</h4> - <p>TBD</p> - </section> - </section> - - <section> - <h3>Find the ball assignment (media processing and recording)</h3> - <p> - Alice is finishing up a college on-line course on image processing, and for the assignment she has to write - code that finds a blue ball in each video frame and draws a box around it. She has just finished testing her - code in the browser using her webcam to provide the input and the canvas element to draw the box around each - frame of the video input. To finish the assignment, she must upload a video to the assignment page, which - requires uploads to have a specific encoding (to make it easier for the TA to review and grade all the - videos) and to be no larger than 50MB (small camera resolutions are recommended) and no longer than 30 - seconds. Alice is now ready; she enables the webcam, a video preview (to see herself and the ball with the box around it), changes the camera's resolution down to 320x200, starts a video capture along with her media processing code, and holds up the blue ball, moving it around. As she moves the ball, her code processes each video frame, drawing the box around the ball. The video preview shows the output of her code (namely herself with the box tracking the ball) so that she sees that it is working correctly. After recording the output of her processing code for 30 seconds, Alice stops the recording and immediately uploads the recorded video to the assignment upload page using her class account. - </p> - <p>Requirements: <a href="#lm1">LM1</a>, <a href="#lm6">LM6</a>, <a href="#lm7">LM7</a>, - <a href="#mc1">MC1</a>, - <a href="#mc4">MC4</a>, <a href="#mc5">MC5</a>. 
</p> - - <section> - <h4>Variations</h4> - <section> - <h5>Recording with post-processing</h5> - <p>Alice decides to run her image-tracking code as a post-processing step. She enables the webcam, a video preview (to see herself and the ball), changes the camera's resolution down to 320x200, starts a video recording, and holds up the blue ball, moving it around. As she does this, the UA records the video stream of her and the ball. After 30 seconds, she terminates the recording and saves the result to a file. She then runs her image-processing software on the saved file, producing a new file that shows the box drawn around the moving ball. She then previews the processed file to make sure it's correct, and uploads it to the assignment page using her class account. </p> - </section> - </section> - </section> - - <section> - <h3>Video diary at the Coliseum (multiple webcams and error handling)</h3> - <p> - Albert is on vacation in Italy. He has a device with a front and rear webcam, and a web application that lets - him document his trip by way of a video diary. After arriving at the Coliseum, he launches his video diary - app. There is no internet connection to his device. The app asks Albert which of his microphones and - webcams he'd like to use, and he activates both webcams (front and rear). Two video elements appear side-by-side - in the app. Albert uses his device to capture a few still shots of the Coliseum using the rear camera, then - starts recording a video, selecting the front-facing webcam to begin explaining where he is. While talking, - he selects the rear-facing webcam to capture a video of the Coliseum (without having to turn his device - around), and then switches back to the front-facing camera to continue checking in for his diary entry. - Albert has a lot to say about the Coliseum, but before finishing, his device warns him that the battery is - about to expire. 
At the same time, the device shuts down the cameras and microphones to conserve battery power. - Later, after plugging in his device at a coffee shop, Albert returns to his diary app and notes that his - recording from the Coliseum was saved. - </p> - <p>Requirements: <a href="#p2">P2</a>, <a href="#lm9">LM9</a>, - <a href="#mc1">MC1</a>, <a href="#mc6">MC6</a>, <a href="#mc7">MC8</a>.</p> - - - <section> - <h4>Variations</h4> - <section> - <h5>Recording a sports event (simultaneous capture from multiple webcams)</h5> - <p>Albert's day job is sports commentary. He works for a local television station and records the local - hockey games at various schools. Albert uses a web-based front-end on custom hardware that allows him to connect - three cameras covering various angles of the game and a microphone with which he is running the commentary. - The application records all of these cameras at once. After the game, Albert prepares the game highlights. He - likes to highlight great plays by showing them from multiple angles. The final composited video is shown on the - evening news. - </p> - <p>Requirements: <a href="#mc9">MC9</a>.</p> - - </section> - <section> - <h5>Picture-in-picture (capture a composed video)</h5> - <p>While still on his Italy vacation, Albert hears that the Pope might make a public appearance at the Vatican. Albert - arrives early to claim a spot, and starts his video diary. He activates both front and rear cameras so that he can - capture both himself and the camera's view. He then sets up the view in his video diary so that the front-facing camera - displays in a small frame contained in one corner of the larger rear-facing camera's view rectangle (picture-in-picture). - Albert excitedly describes the sense of the crowd around him while simultaneously capturing the Pope's appearance. Afterward, - Albert is happy that he didn't miss the moment by having to switch between cameras. 
- </p> - <p>Requirements: <a href="#lm9">LM9</a>, <a href="#mc10">MC10</a>.</p> - - </section> - </section> - </section> - - <section> - <h3>Conference call product debate (multiple conversations and capture review)</h3> - <p> - As part of a routine business video conference call, Amanda initiates a connection to the five other field - agents in her company via the company's video call web site. Amanda is the designated scribe and archivist; - she is responsible for keeping the meeting minutes and also saving the associated meeting video for later - archiving. As each field agent connects to the video call web site, and after granting permission, their - video feed is displayed on the site. After the five other field agents check in, Amanda calls the meeting to - order and starts the meeting recorder. The recorder captures all participants' audio, and selects a video - channel to record based on dominance of the associated video channel's audio input level. As the meeting - continues, several product prototypes are discussed. One field agent has created a draft product sketch that - he shows to the group by sending the image over his video feed. This image spurs a fast-paced debate and - Amanda misses several of the participants' discussion points in the minutes. She calls for a point of order, - and requests that the participants wait while she catches up. Amanda pauses the recording, rewinds it by - thirty seconds, and then re-plays it in order to catch the parts of the debate that she missed in the - minutes. When done, she resumes the recording and the meeting continues. Toward the end of the meeting, one - field agent leaves early and his call is terminated. 
- </p> - <p>Requirements: <a href="#p1">P1</a>, <a href="#p2">P2</a>, <a href="#p3">P3</a>, - <a href="#lm1">LM1</a>, <a href="#lm4">LM4</a>, <a href="#rm2">RM2</a>, - <a href="#rm5">RM5</a>, <a href="#rm6">RM6</a>, <a href="#mc1">MC1</a>, <a href="#mc6">MC6</a>, - <a href="#mc11">MC11</a>, <a href="#mc12">MC12</a>, <a href="#mc13">MC13</a>. - </p> - - - <section> - <h4>Variations</h4> - <section> - <h5>Showcase demo on local screen (the screen as a local media input source)</h5> - <p>During the video conference call, Amanda invites a member of the product development team to demonstrate a - new visual design editor for the prototype. The design editor is not yet finished, but has the UI elements in - place. It currently only compiles on that developer's computer, but Amanda wants the field agents' feedback - since they will ultimately be using the tool. The developer is able to select the screen as a local media - source and send that video to the group as he demonstrates the UI elements. - </p> - <p>Requirements: <a href="#lm10">LM10</a>.</p> - - </section> - </section> - </section> - - <section> - <h3>Incident on driver-download page (device fingerprinting with malicious intent)</h3> - <p> - While visiting a manufacturer's web site in order to download drivers for his new mouse, Austin unexpectedly - gets prompted by his browser to allow access to his device's webcam. Thinking that this is strange (why is - the page trying to use my webcam?), Austin denies the request. Several weeks later, Austin reads an article - in the newspaper in which the same manufacturer is being investigated by a business-sector watchdog agency - for poor business practices. Apparently this manufacturer was trying to discover how many visitors to their - site had webcams (and other devices) from a competitor. 
If that information could be discovered, then the - site would subject those users to slanderous advertising and falsified "webcam tests" that made it appear - as if their competitor's devices were broken in order to convince users to purchase their own brand of webcam. - </p> - <p>Requirements: <a href="#p5">P5</a>.</p> - - - <section> - <h4>Variations</h4> - <p>TBD</p> - </section> - </section> - </section> - - <section> - <h2>Requirements</h2> - <p>PERMISSIONS</p> - <ol> - - <li><a name="p1">The UA must </a>request the user's permission before accessing any media devices.</li> - <li><a name="p2">The UA must </a>allow the user to grant permission to specific devices. (Example: front camera is ok, back camera is not.)</li> - <li><a name="p3">The UA must </a>request the user's permission before sending or receiving a media stream to or from another user.</li> - <li><a name="p4">The UA must </a>be able to store permissions granted to certain applications. [THIS NEEDS TO BE RESTRICTED.]</li> - <li><a name="p5">The UA must </a>not make information about the user's media devices available to the Application - until the user grants permission to use them.</li> - </ol> - <p>LOCAL MEDIA</p> - <ol> - <li><a name="lm1">The UA must </a>be able to show the image stream from a camera on the screen.</li> - <li><a name="lm2">The UA must </a>be able to provide a visual display of the properties of the sound captured from a microphone (volume in this case).</li> - <li><a name="lm3">The UA must </a>be able to trigger an event based on the volume of sound over some period of -time (silence, in this case).</li> -<li><a name="lm4">The UA must </a>be able to blend remote audio streams with the local stream.</li> -<li><a name="lm5">The UA must </a>be able to continue sending and/or capturing media while the tab is in the background.</li> -<li><a name="lm6">The UA must </a>be able to extract image frames from video.</li> -<li><a name="lm7">The UA must </a>be able to insert image 
frames into a local video stream (or capture).</li> -<li><a name="lm8">The UA must </a>be able to modify stream parameters such as size and frame rate within the - limits set by the local hardware. </li> - <li><a name="lm9">The UA must </a>be able to display multiple local or remote streams simultaneously.</li> - <li><a name="lm10">The UA must </a>support the use of the local screen/display as a video source.</li> - <li><a name="lm11">The UA must </a>allow the user to pause or stop media streams via UXes (and not just the - buttons on the underlying hardware). The UX provided by the chrome must override any UX provided - by the Application. </li> - <li><a name="lm12">The UA must </a>provide a UX letting the user know when it is using one or more of - his media devices.</li> - </ol> - -<p>REMOTE MEDIA</p> -<p>These requirements for remote media are outside the scope of the Media Capture task force, but are relevant to - the related WebRTC/rtcWeb work, which is a joint effort of the W3C and the IETF. The <a href="http://datatracker.ietf.org/doc/draft-ietf-rtcweb-use-cases-and-requirements"> - scenarios - and requirements document</a> for that effort contains requirements that are similar to those given here. - They are listed in parentheses after each item.</p> -<ol> -<li><a name="rm1">The UA must </a>be able to transmit media to one or more remote sites and to receive media from them. (rtcWeb: F2, F4, F11, F12)</li> -<li><a name="rm2">The UA must </a>be able to play audio and video media received from a remote site. (rtcWeb: F4, F10)</li> -<li><a name="rm3">The UA must </a>be able to stop or pause the reception and/or transmission of any media stream independent of any other streams. (rtcWeb: A8)</li> -<li><a name="rm4">The UA must </a>be able to add new remote media connections while a recording is running. The new remote streams - may or may not be included in the recording. 
 </li> - <li><a name="rm5">The UA must </a>be able to send or receive a still image over a video stream. </li> - <li><a name="rm6">The UA must </a>provide the Application with the parameters of all streams (for example, audio level). (rtcWeb: F14, F38, A12, A14) </li> - - </ol> - - -<p>RECORDING</p> -<ol> -<li><a name="mc1">The UA must </a>be able to record local or remote audio streams, video streams or still images from a camera or microphone - and store the result as a file.</li> -<li><a name="mc2"> The UA must </a>enable the Application to trigger recording either from a button click or a timer event.</li> -<li><a name="mc3">The UA must </a>be able to send recorded media to one or more remote locations while recording is running.</li> -<li><a name="mc4">The UA must </a>enable the Application to select the recording format and resolution from those available on the local hardware.</li> -<li><a name="mc5">The UA must </a>enable the Application to set size constraints and time limits on recording. </li> -<li><a name="mc6">The UA must </a>allow the user to switch recording between one or more local and remote streams without interruption.</li> -<li><a name="mc7">The UA must </a>enable the Application to use device properties, such as battery level, to determine when to terminate media capture.</li> -<li><a name="mc8">The UA must </a>enable error recovery in the case of premature termination of recording.</li> -<li><a name="mc9">The UA must </a>support simultaneous recording from multiple devices into separate recordings.</li> -<li><a name="mc10">The UA must </a>support simultaneous recording from multiple devices into a single recording. 
 </li> -<li><a name="mc11">The UA must </a>support the dynamic addition and deletion of streams from a recording.</li> -<li><a name="mc12">The UA must </a>enable the Application to pause and resume the recording of local or remote streams.</li> -<li><a name="mc13">The UA must </a>enable the Application to rewind and replay a paused recorded stream.</li> - -</ol> - - </section> - - - - <section> - <h2>Design Considerations and Remarks</h2> - - <section> - <h3>Stream initialization</h3> - <p>A web application must be able to initiate a request for access to the user's webcam(s) and/or microphone(s). - Additionally, the web application should be able to "hint" at specific device characteristics that are desired by - the particular usage scenario of the application. User consent is required before obtaining access to the requested - stream.</p> - <p>When the media capture devices have been obtained (after user consent), they must be associated with a - <code>MediaStream</code> object, be active, and be populated with the appropriate tracks. - The active capture devices will be configured according to user preference; the - user may have an opportunity to configure the initial state of the devices, select specific devices, and/or elect - to enable/disable a subset of the requested devices at the point of consent or beyond (the user remains in control). 
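The request-with-hints flow described above can be sketched in script. Note the hedges: this uses the promise-based navigator.mediaDevices.getUserMedia that browsers later shipped (this draft predates it), stopping tracks on tab hide is just one possible policy (the draft explicitly does not recommend it for RTC scenarios), and buildConstraints is an illustrative helper, not a platform API.

```javascript
// Pure helper (illustrative, not a platform API): translate the application's
// "hints" about desired device characteristics into a constraints object.
function buildConstraints({ wantVideo = true, wantAudio = true, idealWidth } = {}) {
  const constraints = {};
  if (wantVideo) {
    constraints.video = idealWidth ? { width: { ideal: idealWidth } } : true;
  }
  if (wantAudio) constraints.audio = true;
  return constraints;
}

// Browser-only sketch: acquire the stream (user consent is prompted here),
// then release the devices when the capturing tab is hidden, detected via the
// Page Visibility API. Uses the later-standardized promise form of getUserMedia.
async function captureUntilHidden() {
  const stream = await navigator.mediaDevices.getUserMedia(
    buildConstraints({ idealWidth: 640 }));
  document.addEventListener('visibilitychange', () => {
    if (document.visibilityState === 'hidden') {
      stream.getTracks().forEach((track) => track.stop()); // user stays in control
    }
  });
  return stream;
}
```

Whether to stop on tab hide is an application decision; many real-time scenarios deliberately keep capturing in the background, as the surrounding text notes.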
- </p> - <p>It is recommended that the active <code>MediaStream</code> be associated with a browser UX in order to ensure that - the user: - <ul> - <li>is made aware that their device's webcam and/or microphone is active (for this reason many webcams include a - light or other indicator that they are active, but this is not always the case--especially with most microphones embedded in - consumer devices)</li> - <li>has a UX affordance to easily modify the capture device settings or shut off the associated capture device if necessary</li> - </ul> - Such a browser UX should be offered in a way that remains visible even when a browser's tab (performing the capture) - is sent to the background. For the purposes of many common scenarios (especially involving real-time communications), it is not - recommended that the browser automatically shut down capture devices when the capturing browser tab is sent to the background. - If such a scenario is desired by the application author, the tab switch may be detected via other browser events (e.g., the - <a href="http://www.w3.org/TR/page-visibility/">page visibility event</a>) and the <code>MediaStream</code> can be stopped via <code>stop()</code>. - </p> - <section> - <h4>Privacy</h4> - <p>Specific information about a given webcam and/or microphone must not be available until after the user has - granted consent. Otherwise "drive-by" fingerprinting of a UA's devices and characteristics can be obtained without - the user's knowledge—a privacy issue.</p> - <p>In addition, care must be taken that webcam and audio devices are not able to record and stream data without the - user's knowledge. Explicit permission should be granted for a specific activity of a limited duration. Configuration - controls should make it possible to enforce age limits on webcam use or apply other similar restrictions.</p> - </section> - - <section> - <h4>Issues</h4> - <ol> - <li>What are the privacy/fingerprinting implications of the current "error" callback? 
Is it sufficiently "scary" - to warrant a change? Consider the following: - <ul> - <li>If the user doesn’t have a webcam/mic, and the developer requests it, a UA would be expected to invoke - the error callback immediately.</li> - <li>If the user does have a webcam/mic, and the developer requests it, a UA would be expected to prompt for - access. If the user denies access, then the error callback is invoked.</li> - <li>Depending on the timing of the invocation of the error callback, scripts can still profile whether the - UA does or does not have a given device capability.</li> - </ul> - </li> - <li>In the case of a user with multiple video and/or audio capture devices, what specific permission is expected to - be granted for the "video" and "audio" options presented to <code>getUserMedia</code>? For example, does "video" - permission mean that the user grants permission to any and all video capture devices? Similarly with "audio"? Is - it a specific device only, and if so, which one? Given the privacy point above, my recommendation is that "video" - permission represents permission to all possible video capture devices present on the user's device, therefore - enabling switching scenarios (among video devices) to be possible without re-acquiring user consent. Same for - "audio" and combinations of the two. - </li> - <li>When a user has only one of two requested device capabilities (for example only "audio" but not "video", and both - "audio" and "video" are requested), should access be granted without the video or should the request fail? 
- </li> - <li>Enabling configuration of webcam controls based on age (parental controls)</li> - <li>Phishing and other attacks using webcam, audio (possible issue to note)</li> - </ol> - </section> - </section> - - <section> - <h3>Stream re-initialization</h3> - - <p>After requesting (and presumably gaining) access to media capture devices, it is entirely possible for one or more of - the requested devices to stop or fail (for example, if a video device is claimed by another application, or if the user - unplugs a capture device or physically turns it off, or if the UA shuts down the device arbitrarily to conserve battery - power). In such a scenario it should be reasonably simple for the application to be notified of the situation, and for - the application to re-request access to the stream. - </p> - <p>Additional information might also be useful either in terms of <code>MediaStream</code> state such as an error object, - or additional events like an <code>error</code> event (or both). - </p> - - <section> - <h4>Issues</h4> - <ol> - <li>How shall the stream be re-acquired efficiently? Is it merely a matter of re-requesting the entire - <code>MediaStream</code>, or can an "ended" media stream be quickly revived? Reviving a local media stream makes - more sense in the context of the stream representing a set of device states than it does when the stream - represents a network source. The WebRTC editors are considering moving the "ended" event from the - <code>MediaStream</code> to the <code>MediaStreamTrack</code> to help clarify these potential scenarios. - </li> - <li>What's the expected interaction model with regard to user-consent? For example, if the re-initialization - request is for the same device(s), will the user be prompted for consent again? Minor glitches in the stream - source connection should not revoke the user-consent. 
- </li> - <li>How can tug-of-war scenarios be avoided between two web applications both attempting to gain access to a - non-shared device at the same time? Should the API support the ability to request exclusive use of the - device? - </li> - </ol> - </section> - </section> - - <section> - <h3>Preview a stream</h3> - <p>The application should be able to connect a media stream (representing active media capture device(s)) to one or more sinks - in order to use/view the content flowing through the stream. In nearly all digital capture scenarios, "previewing" - the stream before initiating the capture is essential to the user in order to "compose" the shot (for example, - digital cameras have a preview screen before a picture or video is captured; even in non-digital photography, the - viewfinder acts as the "preview"). This is particularly important for visual media, but also for non-visual media - like audio. - </p> - <p>Note that media streams connected to a preview output sink are not in a "capturing" state as the media stream has - no default buffer (see the <a>Stream</a> definition in section 4). Content conceptually "within" the media stream - is streaming from the capture source device to the preview sink, after which point the content is dropped (not - saved). - </p> - <p>The application should be able to effect changes to the media capture device(s) settings via the media stream - and view those changes happen in the preview. - </p> - <p>Today, the <code>MediaStream</code> object can be connected to several "preview" sinks in HTML5, including the - <code>video</code> and <code>audio</code> elements. (This support should also extend to the <code>source</code> - elements of each as well.) The connection is accomplished via <code>URL.createObjectURL</code>. For RTC scenarios, - <code>MediaStream</code>s are connected to <code>PeerConnection</code> sinks. 
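A minimal preview hookup might look like the following sketch. It uses the srcObject attribute that browsers eventually standardized in place of the URL.createObjectURL(stream) mechanism this draft mentions; previewOptions and attachPreview are illustrative names, and muting audio by default is one way to sidestep the acoustic-feedback issue raised in the Issues list of this section.

```javascript
// Illustrative helper (not a draft API): compute preview defaults. Audio is
// muted unless the page explicitly opts in, avoiding the microphone-to-speaker
// feedback loop that makes audio preview problematic.
function previewOptions({ hasAudio = true, audioOptIn = false } = {}) {
  return { autoplay: true, muted: hasAudio && !audioOptIn };
}

// Browser-only sketch: attach the stream to a media element acting as a
// preview sink. Modern browsers use element.srcObject where this draft
// used URL.createObjectURL(stream).
function attachPreview(stream, videoElement, opts = {}) {
  const { autoplay, muted } = previewOptions(opts);
  videoElement.srcObject = stream;
  videoElement.muted = muted;
  if (autoplay) videoElement.play();
  return videoElement;
}
```

Because the element is only a sink, nothing here buffers or captures; the same stream can be attached to several such sinks at once, as the next paragraph recommends.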
- </p> - <p>An implementation should not limit the number or kind of sinks that a <code>MediaStream</code> is connected - to (including sinks for the purpose of previewing).</p> - <section> - <h4>Issues</h4> - <ol> - <li>Audio tag preview is somewhat problematic because of the acoustic feedback problem (interference that can - result from a loop between a microphone input that picks up the output from a nearby speaker). There are - software solutions that attempt to automatically compensate for these types of feedback problems. However, it - may not be appropriate to require all implementations to support such an acoustic feedback prevention - algorithm. Therefore, audio preview could be turned off by default and only enabled by specific opt-in. - Could implementations without acoustic feedback prevention simply not offer the opt-in? - </li> - </ol> - </section> - </section> - - <section> - <h3>Stopping local devices</h3> - <p>End-users need to feel in control of their devices. Likewise, it is expected that developers using a media stream - capture API will want to provide a mechanism for users to stop their in-use device(s) via the software (rather than - using hardware on/off buttons which may not always be available). - </p> - <p>Stopping or ending a media stream source device(s) in this context implies that the media stream source device(s) - cannot be re-started. This is a distinct scenario from simply pausing the video/audio tracks of a given media stream. - </p> - <section> - <h4>Issues</h4> - <ol> - <li>Is there a scenario where end-users will want to stop just a single device, rather than all devices participating - in the current media stream? In the WebRTC case there seems to be: for example, if the current connection cannot handle both - audio and video streams, the user might want to back down to audio, or might simply decide they - don't need video. But otherwise, e.g. 
for local use cases, mute seems more likely and less - disruptive (e.g. in terms of CPU load which might temporarily affect recorded quality of the remaining streams). - </li> - </ol> - </section> - </section> - - <section> - <h3>Pre-processing</h3> - <p>Pre-processing scenarios are a bucket of scenarios that perform processing on the "raw" or "internal" characteristics - of the media stream for the purpose of reporting information that would otherwise require processing of a known - format (i.e., at the media stream sink—like Canvas, or via capturing and post-processing), significant - computationally-expensive scripting, etc. - </p> - <p>Pre-processing scenarios will require the UAs to provide an implementation (which may be non-trivial). This is - required because the media stream's internal format should be opaque to user-code. Note, if a future - specification described an interface to allow low-level access to a media stream, such an interface would enable - user-code to implement many of the pre-processing scenarios described herein using post-processing techniques (see - next section). - </p> - <p>Pre-processing scenarios provide information that is generally desired <i>before</i> a stream need be connected to a - sink or captured. - </p> - <p>Pre-processing scenarios apply to both real-time-communication and local capture scenarios. Therefore, the - specification of various pre-processing requirements may likely fall outside the scope of this task force. However, - they are included here for scenario-completeness and to help ensure that a media capture API design takes them into - account. - </p> - <section> - <h4>Examples</h4> - <ol> - <li>Audio end-pointing. As described in <a href="http://lists.w3.org/Archives/Public/www-archive/2011Mar/att-0001/microsoft-api-draft-final.html">a - speech API proposal</a>, audio end-pointing allows for the detection of noise, speech, or silence and raises events - when these audio states change. 
End-pointing is necessary for scenarios that programmatically determine when to - start and stop capturing an audio stream for purposes of hands-free speech commands, dictation, and a variety of - other speech and accessibility-related scenarios. The proposal linked above describes these scenarios in more - detail. Audio end-pointing would be required as a pre-processing scenario because it is a prerequisite to - starting/stopping a capture of the media stream itself. - </li> - <li>Volume leveling/automatic gain control. The ability to automatically detect changes in audio loudness and adjust - the input volume such that the output volume remains constant. These scenarios are useful in a variety of - heterogeneous audio environments such as teleconferences, live broadcasting involving commercials, etc. - Configuration options for volume/gain control of a media stream source device are also useful, and are explored - later on. - </li> - <li>Video face-recognition and gesture detection. These scenarios are the visual analog to the previously described - audio end-pointing scenarios. Face-recognition is useful in a variety of contexts, from identifying faces in family - photographs to serving as part of an identity management system for system access. Likewise, gesture recognition - can act as an input mechanism for a computer. - </li> - </ol> - </section> - <section> - <h4>Issues</h4> - <ol> - <li>In general the set of audio pre-processing scenarios is much more constrained than the set of possible visual - pre-processing scenarios. Due to the large set of visual pre-processing scenarios (which could also be implemented - by scenario-specific post-processing in most cases), we may recommend that visual-related pre-processing - scenarios be excluded from the scope of our task force. 
- </li> - <li>The challenge of specifying pre-processing scenarios will be identifying what specific information should be - conveyed by the platform at a level that serves the widest variety of scenarios. For example, - audio end-pointing could be specified in high-level terms of firing events when specific words of a given language - are identified, or could be as low-level as reporting when there is silence/background noise and when there's not. - No single API design will serve every scenario; therefore this group might choose to - evaluate which scenarios (if any) are worth including in the first version of the API. - </li> - <li>Similarly to gestures, speech recognition can also be used to control the stream itself. But since both uses are about - interpreting the content to derive events, it may be that these capabilities should be addressed in some other spec. - The more generic capabilities (input level monitoring) or automatic controls based upon them (e.g. AGC) however are - useful to consider here. These might be simplified (initially) to boolean options (capture auto-start/pause and AGC). - Going beyond that, input level events (e.g. threshold passing) or some realtime-updated attribute (input signal level) - on the API would be very useful in capture scenarios. - </li> - </ol> - </section> - </section> - - <section> - <h3>Post-processing</h3> - <p>Post-processing scenarios are the group of scenarios that can be completed after either:</p> - <ol> - <li>Connecting the media stream to a sink (such as the <code>video</code> or <code>audio</code> elements)</li> - <li>Capturing the media stream to a known format (MIME type)</li> - </ol> - <p>Post-processing scenarios will continue to expand and grow as the web platform matures and gains capabilities. - The key to understanding the available post-processing scenarios is to understand the other facets of the web - platform that are available for use. 
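One concrete post-processing step, combining the canvas frame-grab and TypedArray tools from the toolbox this section goes on to describe, is computing a simple statistic over a grabbed frame's pixels. The function names here are illustrative, not from any draft API, and the Rec. 601 luma weights are a conventional choice.

```javascript
// Pure post-processing step: mean luma over RGBA pixel data, using the
// conventional Rec. 601 weights. `rgba` is any array-like of bytes laid out
// as [R, G, B, A, R, G, B, A, ...], e.g. ImageData.data (a Uint8ClampedArray).
function averageLuma(rgba) {
  let sum = 0;
  for (let i = 0; i < rgba.length; i += 4) {
    sum += 0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2];
  }
  return sum / (rgba.length / 4);
}

// Browser-only sketch: draw the current frame of a <video> sink into a
// canvas, then crack the pixels open via a TypedArray view and analyze them.
function grabFrameLuma(video, canvas) {
  const ctx = canvas.getContext('2d');
  ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
  const { data } = ctx.getImageData(0, 0, canvas.width, canvas.height);
  return averageLuma(data);
}
```

The same pattern (sink, frame grab, TypedArray inspection) underlies the "take a picture" technique referenced later in this section, and its per-frame cost illustrates the sample-loss caveat noted there.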
- </p> - <p>Note: Depending on convenience and scenario usefulness, the post-processing scenarios in the toolbox below - could be implemented as pre-processing capabilities (for example the Web Audio API). In general, this document - views pre-processing scenarios as those provided by the <code>MediaStream</code> and post-processing scenarios - as those that consume a <code>MediaStream</code>.</p> - <section> - <h4>Web platform post-processing toolbox</h4> - <p>The common post-processing capabilities for media stream scenarios are built on a relatively small set of web - platform capabilities. The capabilities described here are derived from current W3C draft specifications, many - of which have widely-deployed implementations: - </p> - <ol> - <li>HTML5 <a href="http://dev.w3.org/html5/spec/Overview.html#the-video-element"><code>video</code></a> and - <a href="http://dev.w3.org/html5/spec/Overview.html#the-audio-element"><code>audio</code></a> tags. These elements are natural - candidates for media stream output sinks. Additionally, they provide an API (see - <a href="http://dev.w3.org/html5/spec/Overview.html#htmlmediaelement">HTMLMediaElement</a>) for interacting with - the source content. Note: in some cases, these elements are not well-specified for stream-type sources—this task - force may need to drive some stream-source requirements into HTML5. - </li> - <li>HTML5 <a href="http://dev.w3.org/html5/spec/Overview.html#the-canvas-element"><code>canvas</code></a> element - and the <a href="http://dev.w3.org/html5/2dcontext/">Canvas 2D context</a>. The <code>canvas</code> element employs - a fairly extensive 2D drawing API and will soon be extended with audio capabilities as well (<b>RichT, can you - provide a link?</b>). Canvas' drawing API allows for drawing frames from a <code>video</code> element, which is - the link between the media capture sink and the effects made possible via Canvas. 
- </li> - <li><a href="http://dev.w3.org/2006/webapi/FileAPI/">File API</a> and - <a href="http://www.w3.org/TR/file-writer-api/">File API Writer</a>. The File API provides various methods for - reading and writing to binary formats. The fundamental container for these binary files is the <code>Blob</code>, - which, put simply, is a read-only structure with a MIME type and a length. The File API integrates with many other - web APIs such that the <code>Blob</code> can be used uniformly across the entire web platform. For example, - <code>XMLHttpRequest</code>, form submission in HTML, message passing between documents and web workers - (<code>postMessage</code>), and Indexed DB all support <code>Blob</code> use. - </li> - <li>Stream API. A new addition to - the WebApps WG, the <code>Stream</code> is another general-purpose binary container. The primary difference - between a <code>Stream</code> and a <code>Blob</code> is that the <code>Stream</code> is read-once and has no - length. The Stream API includes a mechanism to buffer from a <code>Stream</code> into a <code>Blob</code>, and - thus all <code>Stream</code> scenarios are a super-set of <code>Blob</code> scenarios. - </li> - <li>JavaScript <a href="http://wiki.ecmascript.org/doku.php?id=strawman:typed_arrays">TypedArrays</a>. Especially - useful for post-processing scenarios, TypedArrays allow JavaScript code to crack open a binary file - (<code>Blob</code>) and read/write its contents using the numerical data types already provided by JavaScript. - There's a good explanation and example of TypedArrays - <a href="http://blogs.msdn.com/b/ie/archive/2011/12/01/working-with-binary-data-using-typed-arrays.aspx">here</a>. - </li> - <li><a href="http://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html">Web Audio API</a>. A proposal - for processing and synthesizing audio in web applications. 
That group also publishes the <a href="http://www.w3.org/TR/audioproc/"> - Audio Processing API</a>, which contains further information. - </li> - </ol> - </section> - <p>Of course, post-processing scenarios made possible after sending a media stream or captured media stream to a - server are unlimited. - </p> - <section> - <h4>Time sensitivity and performance</h4> - <p>Some post-processing scenarios are time-sensitive—especially those scenarios that involve processing large - amounts of data while the user waits. Other post-processing scenarios are long-running and can have a performance - benefit if started before the end of the media stream segment is known. For example, a low-pass filter on a video. - </p> - <p>These scenarios generally take two approaches:</p> - <ol> - <li>Extract samples (video frames/audio clips) from a media stream sink and process each sample. Note that this - approach is vulnerable to sample loss (gaps between samples) if post-processing is too slow. - </li> - <li>Capture the media stream and extract samples from the captured native format. Note that this approach requires - significant understanding of the captured native format. - </li> - </ol> - <p> - Both approaches are valid for different types of scenarios. - </p> - <p>The first approach is the technique described in the current WebRTC specification for the "take a picture" - example. - </p> - <p>The second approach is somewhat problematic from a time-sensitivity/performance perspective given that the - captured content is only provided via a <code>Blob</code> today. A more natural fit for post-processing scenarios - that are time-or-performance sensitive is to supply a <code>Stream</code> as output from a capture. 
- Thus time-or-performance sensitive post-processing applications can immediately start processing the [unfinished] - capture, and non-sensitive applications can use the Stream API's <code>StreamReader</code> to eventually pack - the full <code>Stream</code> into a <code>Blob</code>. - </p> - </section> - <section> - <h4>Examples</h4> - <ol> - <li>Image quality manipulation. If you copy the image data to a canvas element you can then get a data URI or - blob where you can specify the desired encoding and quality, e.g. - <pre class="sh_javascript"> -canvas.toDataURL('image/jpeg', 0.6); -// or -canvas.toBlob(function(blob) {}, 'image/jpeg', 0.2);</pre> - </li> - <li>Image rotation. If you copy the image data to a canvas element and then obtain its 2D context you can then - call rotate() on that context object to rotate the displayed 'image'. You can then obtain the manipulated image - back via toDataURL or toBlob as above if you want to generate a file-like object that you can then pass around as - required. - </li> - <li>Image scaling. Thumbnails or web image formatting can be done by scaling down the captured image to a common - width/height and reducing the output quality. - </li> - <li>Speech-to-text. Post-processing on a captured audio format can be done to perform client-side speech - recognition and conversion to text. Note that speech recognition algorithms are generally run on the server for - time-sensitivity or performance reasons. - </li> - </ol> - </section> - <p>This task force should evaluate whether some extremely common post-processing scenarios should be included as - pre-processing features. - </p> - </section> - - <section> - <h3>Device Selection</h3> - <p>A particular user agent may have zero or more devices that provide the capability of audio or video capture. In - consumer scenarios, this is typically a webcam with a microphone (which may or may not be combined), and a "line-in" - and/or microphone audio jack. 
Enthusiast users (e.g., audio recording enthusiasts) may have many more available - devices. - </p> - <p>Device selection in this section is not about the selection of audio vs. video capabilities, but about selection - of multiple devices within a given "audio" or "video" category (i.e., "kind"). The terms "device" and "available - devices" used in this section refer to one or a collection of devices of a kind (e.g., that provide a common - capability, such as a set of devices that all provide "video"). - </p> - <p>Providing a mechanism for code to reliably enumerate the set of available devices enables programmatic control - over device selection. Device selection is important in a number of scenarios. For example, the user selected the - wrong camera (initially) and wants to change the media stream over to another camera. In another example, the - developer wants to select the device with the highest resolution for capture. - </p> - <p>Depending on how stream initialization is managed in the consent user experience, device selection may or may not - be a part of the UX. If not, then it becomes even more important to be able to change device selection after media - stream initialization. The requirements of the user-consent experience will likely be out of scope for this task force. - </p> - <section> - <h4>Privacy</h4> - <ol> - <li>As mentioned in the "Stream initialization" section, exposing the set of available devices before giving media stream - consent leads to privacy issues. Therefore, the device selection API should only be available after consent. - </li> - <li>Device selection should not be available for the set of devices within a given category/kind (e.g., "audio" - devices) for which user consent was not granted. - </li> - </ol> - </section> - <p>Device selection should be a mechanism for exposing device capabilities, which inform the application of which device to - select. 
In order for the user to make an informed decision about which device to select (if at all), the developer's code would - need to make some sort of comparison between devices—such a comparison should be done based on device capabilities rather - than a guess, hint, or special identifier (see related issue below). - </p> - <p>Capture capabilities are an important decision-making point for media capture scenarios. However, capture capabilities - are not directly correlated with individual devices, and as such should not be mixed with the device capabilities. For - example, the capability of capturing audio in AAC vs. MP3 is not correlated with a given audio device, and is therefore not a - decision-making factor for device selection. - </p> - <section> - <h4>Issues</h4> - <ol> - <li>The specification should provide guidance on what set of devices are to be made available—should it be the set of - potential devices, or the set of "currently available" devices? (The latter is recommended, since non-available devices can't - be utilized by the developer's code, so it makes little sense to include them.) - </li> - <li>A device selection API should expose devices by capability rather than by identity. Selecting by device identity is poor practice - because it leads to device-dependent testing code (for example, if "Name Brand Device", then…) similar to the problems that - exist today on the web as a result of user-agent detection. A better model is to enable selection based on capabilities. - Additionally, knowing the GUID or hardware name is not helpful to web developers as part of a scenario other than device - identification (perhaps for purposes of providing device-specific help/troubleshooting, for example). - </li> - <li>One strategy is to not return a set of devices, only the one that the user selected. 
Thus whether a device is "available" - (meaning known by the system, and able to be connected to at the current time) is something that could be presented through the - browser UI and include other info (e.g., a description of the device such as "front"/"back"/"internal"/"USB"/"Front Door"/...) as - known. Providing a list of cameras then requires that the app be capable of some decision-making, and thus requires more info, - which again is a privacy concern (resulting in a potential two-stage prompt: "Do you allow this app to know what cameras are - connected?" then "Do you allow this app to connect to the 'front' camera?"). - </li> - </ol> - </section> - </section> - - <section> - <h3>Change user-selected device capabilities</h3> - <p>In addition to selecting a device based on its capabilities, individual media capture devices may support multiple modes of - operation. For example, a webcam often supports a variety of resolutions which may be suitable for various scenarios (previewing - or capturing a sample whose destination is a web server over a slow network connection, capturing archival HD video for storing - locally). An audio device may have a gain control, allowing a developer to build a UI for an audio blender (varying the gain on - multiple audio source devices until the desired blend is achieved). - </p> - <p>A media capture API should support a mechanism to configure a particular device dynamically to suit the expected scenario. - Changes to the device should be reflected in the related media stream(s) themselves. - </p> - <p>If a device supports sharing (providing a virtual version of itself to an app), any changes to the device's manipulable state - should be isolated to the application requesting the change. For example, if two applications are using a device, changes to the - device's configuration in one app should not affect the other one. 
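Such a per-application configuration change might be sketched as below. This assumes the `applyConstraints`/`getSettings` API later standardized for `MediaStreamTrack` (this scenarios document predates that API), and the `setCaptureResolution` helper name is illustrative only:

```javascript
// Sketch only: reconfigure a video device dynamically, assuming the
// later-standardized MediaStreamTrack.applyConstraints/getSettings API.
// The setCaptureResolution helper name is ours, not from this document.
function setCaptureResolution(track, width, height) {
  return track
    .applyConstraints({ width: { ideal: width }, height: { ideal: height } })
    .then(function () {
      // getSettings reports what the device actually switched to, which
      // may differ from the request on hardware with fixed mode lists.
      return track.getSettings();
    });
}
```

Because the change is a request that resolves asynchronously, callers can synchronize capture start/stop with the moment the device has actually switched.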
- </p> - <p>Changes to a device capability should be made in the form of requests (async operations rather than synchronous commands). - Change requests allow a device time to make the necessary internal changes, which may take a relatively long time, without - blocking other script. Additionally, script code can be written to change device characteristics without careful error detection - (because devices without the ability to change the given characteristic would not need to throw an exception synchronously). - Finally, a request model makes sense even in RTC scenarios: one party of a teleconference may want to issue a request that - another party mute their device, for example. The device change request can be propagated over the <code>PeerConnection</code> - to the sender asynchronously. - </p> - <p>In parallel, changes to a device's configuration should provide a notification when the change is made. This allows web - developer code to monitor the status of a media stream's devices and report statistics and state information without polling the - device (especially when the monitoring code is separate from the author's device-control code). This is also essential when the - change requests are asynchronous, to allow the developer to know at which point the requested change has been made in the media - stream (in order to perform synchronization, or start/stop a capture, for example). - </p> - <section> - <h4>Issues</h4> - <ol> - <li>If changing a particular device capability cannot be virtualized, this media capture task force should consider whether that - dynamic capability should be exposed to the web platform, and if so, what the usage policy around multiple access to that - capability should be. - </li> - <li>The specifics of what happens to a capture-in-progress when device behavior is changed must be described in the spec. 
- </li> - </ol> - </section> - </section> - - <section> - <h3>Multiple active devices</h3> - <p>In some scenarios, users may want to initiate capture from multiple devices at one time in multiple media streams. For example, - in a home-security monitoring scenario, a user may want to capture 10 unique video streams representing various locations being - monitored. The user may want to collect all 10 of these videos into one capture, or capture all 10 individually (or some - combination thereof). - </p> - <p>While such scenarios are possible and should be supported (even if they are a minority of typical web scenarios), it should be - noted that many devices (especially portable devices) support media capture by way of dedicated encoder hardware, and such hardware - may only be able to handle one stream at a time. Implementations should be able to provide a failure condition when multiple video sources - are attempting to begin capture at the same time. - </p> - </section> - - <section> - <h3>Capturing a media stream</h3> - <p>In its most basic form, capturing a media stream is the process of converting the media stream into a known format during a - bracketed timeframe. - </p> - <p>Local media stream captures are common in a variety of sharing scenarios such as: - </p> - <ul> - <li>capture a video and upload to a video sharing site</li> - <li>capture a picture for my user profile picture in a given web app</li> - <li>capture audio for a translation site</li> - <li>capture a video chat/conference</li> - </ul> - <p>There are other offline scenarios that are equally compelling, such as usage in native-camera-style apps, or web-based capturing - studios (where tracks are captured and later mixed). - </p> - <p>The core functionality that supports most capture scenarios is a simple start/stop capture pair. 
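The start/stop pair could be sketched as follows, assuming the `MediaRecorder`-style API that was later standardized for this purpose. The `captureToChunks` helper name is ours, and the injectable `Recorder` constructor is illustrative only, so the helper can be exercised outside a browser:

```javascript
// Sketch: a bracketed capture of a MediaStream into encoded chunks,
// assuming a MediaRecorder-like constructor (browser default). The
// captureToChunks name is illustrative, not from this document.
function captureToChunks(stream, durationMs, Recorder) {
  Recorder = Recorder || MediaRecorder; // browser default
  return new Promise(function (resolve, reject) {
    var chunks = [];
    var recorder = new Recorder(stream);
    recorder.ondataavailable = function (e) { chunks.push(e.data); };
    recorder.onstop = function () { resolve(chunks); }; // bracket ends
    recorder.onerror = function (e) { reject(e); };
    recorder.start(); // bracket begins
    setTimeout(function () { recorder.stop(); }, durationMs);
  });
}
```

The promise settles only when the recorder has stopped, which matches the "bracketed timeframe" notion above: the capture has a definite start and a definite end.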
- </p> - <p>Ongoing captures should report progress either via the user agent or directly through an API, to enable developers to build UIs - that pass this progress notification along to users. - </p> - <p>A capture API should be designed to gracefully handle changes to the media stream, and should also report (and perhaps even - attempt to recover from) failures at the media stream source during capture. - </p> - <p>Uses of the captured information are covered in the post-processing scenarios described previously. An additional usage is the - possibility of default save locations. For example, by default a UA may store temporary captures (those captures that are - in-progress) in a temp (hidden) folder. It may be desirable to be able to specify (or hint at) an alternate default capture - location such as the user's common file location for videos or pictures. - </p> - <section> - <h4>DVR Scenarios</h4> - <p>Increasingly in the digital age, the ability to pause, rewind, and "go live" for streamed content is an expected scenario. - While this scenario applies mostly to real-time communication scenarios (and not to local capture scenarios), it is worth - mentioning for completeness. - </p> - <p>The ability to quickly "rewind" can be useful, especially in video conference scenarios, when you may want to quickly go - back and hear something you just missed. In these scenarios, you either started a capture from the beginning of the conference - and want to seek back to a specific time, or you were only streaming it (not saving it) but allowed yourself some amount - of buffer in order to review the last X minutes of video. - </p> - <p>To support these scenarios, buffers must be introduced (because the media stream is not implicitly buffered for this scenario). - In the capture scenario, as long as the UA can access previous parts of the capture (without terminating it), this scenario - is possible. 
- </p> - <p>In the streaming case, this scenario could be supported by adding a buffer directly into the media stream itself, or by capturing - the media stream as previously mentioned. Given the complexities of integrating a buffer into the <code>MediaStream</code> proposal, - using capture to accomplish this scenario is recommended. - </p> - </section> - <section> - <h4>Issues</h4> - <ol> - <li>There are few (if any) scenarios that require support for overlapping captures of a single media stream. Note that the - <code>record</code> API (as described in early WebRTC drafts) implicitly supports overlapping capture by simply calling - <code>record()</code> twice. In the case of separate media streams (see previous section) overlapping recording makes sense. In - either case, initiating multiple captures should not be so easy as to be accidental. - </li> - </ol> - </section> - </section> - - <section> - <h3>Selection of a capture method</h3> - <p>All post-processing scenarios for captured data require a known [standard] format. It is therefore crucial that the media capture - API provide a mechanism to specify the capture format. It is also important to be able to discover if a given format is supported. - </p> - <p>Most scenarios in which the captured data is sent to the server for upload also have restrictions on the type of data that the server - expects (one size doesn't fit all). - </p> - <p>It should not be possible to change capture formats on-the-fly without consequences (i.e., a stop and/or re-start or failure). It is - recommended that the mechanism for specifying a capture format not make it too easy to change the format (e.g., setting the format - as a property may not be the best design). 
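Choosing the format once, up front, might be sketched like this. The `pickCaptureFormat` name and the MIME preference list are illustrative; in a browser, the `isSupported` predicate could be a UA probe such as `HTMLMediaElement.canPlayType` or the later-standardized `MediaRecorder.isTypeSupported`:

```javascript
// Sketch: choose a capture format up front rather than mutating it
// on-the-fly. The pickCaptureFormat name and preference list are
// illustrative; isSupported would be a UA probe such as canPlayType.
function pickCaptureFormat(preferredTypes, isSupported) {
  for (var i = 0; i < preferredTypes.length; i++) {
    if (isSupported(preferredTypes[i])) {
      return preferredTypes[i]; // first acceptable format wins
    }
  }
  return null; // fail before capture starts, not mid-capture
}
```

Failing before the capture starts, rather than changing formats mid-capture, matches the stop/re-start consequence described above.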
- </p> - <section> - <h4>Format detection</h4> - <ul> - <li>If we wish to re-use existing web platform concepts for format capability detection, the HTML5 <code>HTMLMediaElement</code> - supports an API called <code>canPlayType</code> which allows developers to probe the given UA for support of specific MIME types that - can be played by <code>audio</code> and <code>video</code> elements. A capture format checker could use this same approach. - </li> - </ul> - </section> - </section> - - <section> - <h3>Programmatic activation of camera app</h3> - <p>As mentioned in the introduction, declarative use of a capture device is out-of-scope. However, there are some potentially interesting - uses of a hybrid programmatic/declarative model, where the configuration of a particular media stream is done exclusively via the user - (as provided by some UA-specific settings UX), but the fine-grained control over the stream as well as the stream capture is - handled programmatically. - </p> - <p>In particular, if the developer doesn't want to guess the user's preferred settings, or if there are specific settings that may not be - available via the media capture API standard, they could be exposed in this manner. - </p> - </section> - - <section> - <h3>Take a picture</h3> - <p>A common usage scenario of local device capture is to simply "take a picture". The hardware and optics of many camera devices often - support video in addition to photos, but can be set into a specific "camera mode" where the possible capture resolutions are - significantly larger than their maximum video resolution. - </p> - <p>The advantage to having a photo mode is being able to capture these very high-resolution images (versus the post-processing scenarios - that are possible with still-frames from a video source). - </p> - <p>Capturing a picture is strongly tied to the "video" capability because a video preview is often an important component to setting up - the scene and getting the right shot. 
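The still-frame technique referenced earlier (copying the preview frame to a canvas, then encoding with toBlob as in the examples above) can be sketched as follows. The `takePhoto` helper name is ours, and the `doc` parameter is injectable only so the sketch can be exercised outside a browser:

```javascript
// Sketch of the canvas-based "take a picture" approach: draw the current
// preview frame to a canvas and encode it as a Blob. The takePhoto name
// is illustrative; doc defaults to the browser's document.
function takePhoto(videoElement, quality, doc) {
  doc = doc || document;
  var canvas = doc.createElement('canvas');
  canvas.width = videoElement.videoWidth;   // match the preview frame size
  canvas.height = videoElement.videoHeight;
  canvas.getContext('2d').drawImage(videoElement, 0, 0);
  return new Promise(function (resolve) {
    canvas.toBlob(resolve, 'image/jpeg', quality);
  });
}
```

Note this captures at the preview's video resolution, which is exactly the limitation that motivates a dedicated high-resolution photo mode.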
- </p> - <p>Because photo capabilities are somewhat different from those of regular video capabilities, devices that support a specific "photo" - mode should likely provide their "photo" capabilities separately from their "video" capabilities. - </p> - <p>Many of the considerations that apply to video capture also apply to taking a picture. - </p> - <section> - <h4>Issues</h4> - <ol> - <li>What are the implications of a device mode switch on video captures that are in progress? Will there be a pause? Can this - problem be avoided? - </li> - <li>Should a "photo mode" be a type of user media that can be requested via <code>getUserMedia</code>? - </li> - </ol> - </section> - </section> - - <section> - <h3>Picture tracks</h3> - <p>Another common scenario for media streams is to share photos via a video stream. For example, a user may want to select a photo and - attach the photo to an active media stream in order to share that photo via the stream. In another example, the photo can be used as a - type of "video mute" where the photo can be sent in place of the active video stream when a video track is "disabled". - </p> - <section> - <h4>Issues</h4> - <ol> - <li>It may be desirable to specify a photo/static image as a track type in order to allow it to be toggled on/off with a video track. - On the other hand, the sharing scenario could be fulfilled by simply providing an API to supply a photo for the video track "mute" - option (assuming that there's not a scenario that involves creating a parallel media stream that has both the photo track and the current - live video track active at once; such a use case could be satisfied by using two media streams instead). - </li> - </ol> - </section> - </section> - - <section> - <h3>Caption Tracks</h3> - <p>The HTML5 <code>HTMLMediaElement</code> now has the ability to display captions and other "text tracks". 
While not directly applicable to - local media stream scenarios (caption support is generally done out-of-band from the original capture), it could be something worth adding in - order to integrate with HTML5 videos when the source is a PeerConnection where real-time captioning is being performed and needs to be displayed. - </p> - </section> - - </section> - - <section> - <h2>Acknowledgements</h2> - <p>Special thanks to the following who have contributed to this document: - Harald Alvestrand, - Robin Berjon, - Stefan Hakansson, - Frederick Hirsch, - Randell Jesup, - Bryan Sullivan, - Timothy B. Terriberry, - Tommy Widenflycht. - </p> - </section> + <p>This document has <a href="http://w3c.github.io/mediacapture-scenarios/scenarios.html">moved to github</a>.</p> </body> </html>
Received on Thursday, 4 September 2014 07:52:44 UTC