2009/dap/camera Overview.html,1.30,1.31

Update of /sources/public/2009/dap/camera
In directory hutz:/tmp/cvs-serv6289

Modified Files:
	Overview.html 
Log Message:
moved requirements and use cases to the end, as appendix
moved future, related, ack as appendix too
fixed links to related documents


Index: Overview.html
===================================================================
RCS file: /sources/public/2009/dap/camera/Overview.html,v
retrieving revision 1.30
retrieving revision 1.31
diff -u -d -r1.30 -r1.31
--- Overview.html	2 Dec 2009 14:30:44 -0000	1.30
+++ Overview.html	2 Dec 2009 14:35:51 -0000	1.31
@@ -28,82 +28,6 @@
     <section>
 			<h2>Introduction</h2>
 <p>The Capture API defines a high-level interface for accessing the microphone and camera of a hosting device.</p>
-<h3>Requirements</h3>
-<p>The Capture API: </p>
-<ul>
-<li>MUST enable capture of static images </li>
-<li>MUST enable capture of videos (including audio) </li>
-<li>MUST enable listing the available cameras </li>
-<li>MUST enable listing the available formats and codecs, per camera </li>
-<li>MUST enable displaying a viewfinder </li>
-<li>SHOULD enable control of the camera's capabilities (e.g. zoom, luminosity, night mode, focus mode)</li>
-<li>SHOULD enable setting of brightness, contrast, gain</li>
-<li>SHOULD enable displaying a viewfinder as part of the document (e.g. as a <code>video</code> element [[HTML5]]) </li>
-<li>MUST enable capture of audio </li>
-<li>MUST enable listing the available audio input devices </li>
-<li>MUST enable listing the available formats and codecs, per audio input device </li>
-<li>SHOULD enable setting microphone audio level</li>
-<li>MUST enable retrieval of the captured content </li>
-<li>MUST provide some metadata about the captured content (e.g. width, height, format, duration)</li>
-<li>MUST enable choosing preferred aspects of the captured content (e.g. width, height, format, frame rate)</li>
-<li>MUST support asynchronous, cancellable capture </li>
-</ul>
-<div class="issue">
-<p>If the user requests a given capture size which isn't available, do we refuse or do we fall back? If the latter (which is likely), what algorithm is used to find the fallback? It could be (given a request for 1000x50):
-</p>
-<ul>
-<li>the camera's preferred default </li><li>500x100 (closest number of pixels) </li><li>1000x700 (closest longest side) </li><li>2000x100 (closest ratio) </li></ul>
-<p></p>
-</div>
-<p class="issue">We could very easily get bogged down in specifying camera capabilities and format feature variants &#8212; how do we decide which ones are reasonably in?
-</p>
-<div class="issue">
-<p>We probably need to support more than one camera. On some of the newer devices, there is one camera pointing at the user and
-another on the other side pointing at the subject. This allows for usages such as see-what-I-see.
-</p>
-</div>
-<h3>Use cases</h3>
-<p>This section contains a set of use cases collected for the Capture API. Note that this section might be removed in future versions of the document.
-</p>
-<h4>Picture Capture</h4>
-<p>A (web-based) camera application that allows the user to capture images with and without preview mode using the device camera capability.
-It also allows the user to capture multiple images in burst mode.
-</p>
-<h4>Panorama Image Capture</h4>
-<p>A (web-based) camera application that allows the user to capture panorama images with and without preview mode using the device camera capability.
-When the user selects panorama mode, the viewfinder displays an indication that it is ready to take the 1st image of 3.
-The user points the device starting from the left and presses the Take button. The device takes the image, indicated
-by an alert, then goes back to take mode for the next image in the sequence.
-The viewfinder displays a 1/8 overlay of the previous image on the left side so the user can line up the next image.
-After taking all 3 images that make up the panorama picture, the device displays the picture on the screen for a
-second before going back to the viewfinder mode.
-</p>
-<h4>Video chat</h4>
-<p>The use case is to be able to write a web app that implements a voice/video chat client. This could be part of an instant messaging client, or might be a standalone videophone or 'telephone'. Another example might be an online 'chat with customer service'
- link on the web site from which the web app was downloaded, allowing the customer to do this directly.
-</p>
-<h5>Discussion</h5>
-<p>Video output can be handled with the &lt;video&gt; tag. However, video input is not so easy, as there is no obvious way to pass captured video in real time to the server. You might think that you could use the preview URL proposed in one API, but there
- is no obvious way to pass the data coming out of this URL down (for example) a websocket.
-</p>
-<p>The approach of rendering the preview into a &lt;canvas&gt; and then scraping the canvas, re-encoding the data and transmitting it seems too ugly (and too inefficient) to be useful. The rendering approach also doesn't work for the associated audio stream. Worse,
- the preview data stream might not include the audio anyway. </p>
-<p>An ideal approach would be to define a websocket-like interface onto the camera/microphone (it might even be as simple as defining a method to get a websocket URL for the camera/microphone). Another alternative (which would cause more upheaval) would be
- to add the websocket read/write interface onto XMLHttpRequest and then have the camera expose an HTTP URL for the full audio/video data stream.
-</p>
-<h4>Web cam</h4>
-<p>A (web-based) surveillance application that would allow the user to survey their property remotely. The camera would allow for
-some type of control such as moving the camera left, right, up and down. Another usage would allow the surveillance web application
-to monitor for movement and trigger a notification, such as an email or alert, to the user.</p>
-<h4>Voice search</h4>
-<p>A (web-based) search application might let the user speak the search query into the device, e.g. while holding a push-to-talk button or when triggered by a proximity sensor (use case for sensor API). The user's utterance has to be recorded (captured) and
- may be sent over the network to a network-based speech recognizer. </p>
-<h5>Discussion</h5>
-<p>To avoid latency while sending the captured voice sample to the network-based speech recognizer, the voice should be recorded in a compressed format. The API should allow selecting a compression format.
-</p>
-<h4>Voice memo</h4>
-<p>A (web-based) voice recorder application which allows the user to record a memo for later playback.</p>
-</section>
 <section id="examples">
 <h2>Usage Examples</h2>
 <p>The following code extracts illustrate how to work with a camera service in the hosting device:
@@ -153,6 +77,7 @@
     alert(summary);</pre>
 </div>
 </section>
+</section>
 <section id="security">
 <h2>Security and Privacy Considerations</h2>
 The API defined in this specification launches the capture application which allows the user to take pictures, record voice or record video and provides a handle to the content. This information can potentially compromise user privacy and a conforming implementation of this specification
@@ -266,14 +191,92 @@
 </dt><dd>Cancel/clear the pending asynchronous operation. </dd></dl>
 </section>
 </section>
+<section class='appendix'>
+<h3>Requirements</h3>
+<p>The Capture API: </p>
+<ul>
+<li>MUST enable capture of static images </li>
+<li>MUST enable capture of videos (including audio) </li>
+<li>MUST enable listing the available cameras </li>
+<li>MUST enable listing the available formats and codecs, per camera </li>
+<li>MUST enable displaying a viewfinder </li>
+<li>SHOULD enable control of the camera's capabilities (e.g. zoom, luminosity, night mode, focus mode)</li>
+<li>SHOULD enable setting of brightness, contrast, gain</li>
+<li>SHOULD enable displaying a viewfinder as part of the document (e.g. as a <code>video</code> element [[HTML5]]) </li>
+<li>MUST enable capture of audio </li>
+<li>MUST enable listing the available audio input devices </li>
+<li>MUST enable listing the available formats and codecs, per audio input device </li>
+<li>SHOULD enable setting microphone audio level</li>
+<li>MUST enable retrieval of the captured content </li>
+<li>MUST provide some metadata about the captured content (e.g. width, height, format, duration)</li>
+<li>MUST enable choosing preferred aspects of the captured content (e.g. width, height, format, frame rate)</li>
+<li>MUST support asynchronous, cancellable capture </li>
+</ul>
+<div class="issue">
+<p>If the user requests a given capture size which isn't available, do we refuse or do we fall back? If the latter (which is likely), what algorithm is used to find the fallback? It could be (given a request for 1000x50):
+</p>
+<ul>
+<li>the camera's preferred default </li><li>500x100 (closest number of pixels) </li><li>1000x700 (closest longest side) </li><li>2000x100 (closest ratio) </li></ul>
+<p></p>
+</div>
+<p class="issue">We could very easily get bogged down in specifying camera capabilities and format feature variants &#8212; how do we decide which ones are reasonably in?
+</p>
+<div class="issue">
+<p>We probably need to support more than one camera. On some of the newer devices, there is one camera pointing at the user and
+another on the other side pointing at the subject. This allows for usages such as see-what-I-see.
+</p>
+</div>
+</section>
+<section>
+<h3>Use cases</h3>
+<p>This section contains a set of use cases collected for the Capture API. Note that this section might be removed in future versions of the document.
+</p>
+<h4>Picture Capture</h4>
+<p>A (web-based) camera application that allows the user to capture images with and without preview mode using the device camera capability.
+It also allows the user to capture multiple images in burst mode.
+</p>
+<h4>Panorama Image Capture</h4>
+<p>A (web-based) camera application that allows the user to capture panorama images with and without preview mode using the device camera capability.
+When the user selects panorama mode, the viewfinder displays an indication that it is ready to take the 1st image of 3.
+The user points the device starting from the left and presses the Take button. The device takes the image, indicated
+by an alert, then goes back to take mode for the next image in the sequence.
+The viewfinder displays a 1/8 overlay of the previous image on the left side so the user can line up the next image.
+After taking all 3 images that make up the panorama picture, the device displays the picture on the screen for a
+second before going back to the viewfinder mode.
+</p>
+<h4>Video chat</h4>
+<p>The use case is to be able to write a web app that implements a voice/video chat client. This could be part of an instant messaging client, or might be a standalone videophone or 'telephone'. Another example might be an online 'chat with customer service'
+ link on the web site from which the web app was downloaded, allowing the customer to do this directly.
+</p>
+<h5>Discussion</h5>
+<p>Video output can be handled with the &lt;video&gt; tag. However, video input is not so easy, as there is no obvious way to pass captured video in real time to the server. You might think that you could use the preview URL proposed in one API, but there
+ is no obvious way to pass the data coming out of this URL down (for example) a websocket.
+</p>
+<p>The approach of rendering the preview into a &lt;canvas&gt; and then scraping the canvas, re-encoding the data and transmitting it seems too ugly (and too inefficient) to be useful. The rendering approach also doesn't work for the associated audio stream. Worse,
+ the preview data stream might not include the audio anyway. </p>
+<p>An ideal approach would be to define a websocket-like interface onto the camera/microphone (it might even be as simple as defining a method to get a websocket URL for the camera/microphone). Another alternative (which would cause more upheaval) would be
+ to add the websocket read/write interface onto XMLHttpRequest and then have the camera expose an HTTP URL for the full audio/video data stream.
+</p>
+<h4>Web cam</h4>
+<p>A (web-based) surveillance application that would allow the user to survey their property remotely. The camera would allow for
+some type of control such as moving the camera left, right, up and down. Another usage would allow the surveillance web application
+to monitor for movement and trigger a notification, such as an email or alert, to the user.</p>
+<h4>Voice search</h4>
+<p>A (web-based) search application might let the user speak the search query into the device, e.g. while holding a push-to-talk button or when triggered by a proximity sensor (use case for sensor API). The user's utterance has to be recorded (captured) and
+ may be sent over the network to a network-based speech recognizer. </p>
+<h5>Discussion</h5>
+<p>To avoid latency while sending the captured voice sample to the network-based speech recognizer, the voice should be recorded in a compressed format. The API should allow selecting a compression format.
+</p>
+<h4>Voice memo</h4>
+<p>A (web-based) voice recorder application which allows the user to record a memo for later playback.</p>
+</section>
 <section id="related">
 <h2>Related documents</h2>
-<p>This section contains a list of related information for editorial purposes. Note that this section will be removed in later versions of the document.
-</p>
+<p>The API described in this document took inspiration from the following documents:</p>
 <ul>
-<li><a href="redir.aspx?C=a80e3c073aeb4bacbe5718db26f3c9bd&amp;URL=http%3a%2f%2fcode.google.com%2fp%2fgears%2fwiki%2fCameraAPI" target="_blank">Google Camara API</a>
-</li><li><a href="redir.aspx?C=a80e3c073aeb4bacbe5718db26f3c9bd&amp;URL=http%3a%2f%2fbondi.omtp.org%2f1.1%2fCR%2fapis%2fcamera.html" target="_blank">BONDI 1.1 camera API</a>
-</li><li><a href="redir.aspx?C=a80e3c073aeb4bacbe5718db26f3c9bd&amp;URL=http%3a%2f%2flists.w3.org%2fArchives%2fPublic%2fpublic-device-apis%2f2009Apr%2fatt-0001%2fcamera.html" target="_blank">Nokia Camara API</a>
+<li><a href="http://code.google.com/p/gears/wiki/CameraAPI">Google Camera API</a>
+</li><li><a href="http://bondi.omtp.org/1.0/apis/camera.html">BONDI 1.1 camera API</a>
+</li><li><a href="http://lists.w3.org/Archives/Public/public-device-apis/2009Apr/att-0001/camera.html">Nokia Camera API</a>
 </li></ul>
 </section>
 <section id="future">
@@ -283,7 +286,7 @@
 <ul>
 <li>... </li></ul>
 </section>
-<section id="ack">
+<section  id="ack">
 <h2>Acknowledgements</h2>
 <p>Many thanks to Google, Nokia, and OMTP BONDI who provided the initial input into this specification.
 </p>

Received on Wednesday, 2 December 2009 14:35:55 UTC