
Focal length/fov capabilities and general camera intrinsics

From: Rob Manson <roBman@buildAR.com>
Date: Fri, 20 Jun 2014 09:31:36 +1000
Message-ID: <53A372D8.8060003@buildAR.com>
To: "public-media-capture@w3.org" <public-media-capture@w3.org>
CC: "Kostiainen, Anssi" <anssi.kostiainen@intel.com>, "Hu, Ningxin" <ningxin.hu@intel.com>

As part of defining the Media Capture Depth Stream Extension we have 
proposed that depth streams can be accessed in one of two ways:
1. As a stand-alone stream of depth images.
2. As a combined stream of RGB and D images.

The combined stream in 2 is commonly used by applications to, for 
example, build a colour-based texture map at the same time as a 3D 
scene reconstruction[1][2]. These use cases are covered by at least 
UC2 and UC5 in our initial extension proposal[3].

However, in order to provide this combined RGB and D stream, it is 
important that the two streams are calibrated so the data from each 
can be aligned. Without this calibration it is not possible to 
mathematically reconstruct or combine the relevant data models.

In order to calibrate these two streams it is important to have access 
to the camera intrinsics for both the depth sensor and the RGB camera. 
For those not familiar with the pinhole camera model, here is some 
useful background information:
- a very brief introduction[4]
- a software-focused introduction[5]
- a more mathematically focused introduction[6]
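To make the pinhole camera model concrete, here is a minimal sketch of how an intrinsic matrix K projects a 3D camera-space point onto the 2D image plane. The focal length and principal point values are illustrative assumptions, not taken from any real device:

```python
import numpy as np

# Assumed example intrinsics (not from a real device):
fx, fy = 525.0, 525.0   # focal lengths, in pixels
cx, cy = 320.0, 240.0   # principal point, roughly the image centre

# The intrinsic matrix of the pinhole camera model
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

def project(point_3d):
    """Project a 3D point (camera coordinates) to pixel coordinates."""
    p = K @ point_3d
    return p[:2] / p[2]  # perspective divide by depth

u, v = project(np.array([0.1, 0.05, 1.0]))
```

A point on the optical axis, e.g. (0, 0, 1), lands exactly on the principal point (cx, cy), which is a quick sanity check for any set of reported intrinsics.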

Once we have a description of the dimensions of the image plane (e.g. 
image width and height) and the focal length, we can calculate the 
field of view (or vice versa) and map points from the 2D image plane 
into the 3D scene. This is useful not only for the Media Capture Depth 
Stream Extension but also for Augmented Reality image stream 
processing in general.
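As a sketch of that calculation (the image width and focal length below are assumed example values), the horizontal field of view follows from the focal length and image width, the focal length can be recovered from the field of view (the vice-versa direction), and a 2D depth pixel can be mapped back into the 3D scene:

```python
import math

# Assumed example values: a 640 px wide image plane, 525 px focal length
width_px, focal_px = 640.0, 525.0

# Field of view from focal length and image width
h_fov = 2.0 * math.atan(width_px / (2.0 * focal_px))

# Focal length back from field of view (the "vice versa" direction)
focal_back = width_px / (2.0 * math.tan(h_fov / 2.0))

def deproject(u, v, depth, fx, fy, cx, cy):
    """Map a 2D pixel plus its depth value back into 3D camera space."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)
```

This back-projection is exactly the step that a combined RGB and D stream needs per pixel, which is why the intrinsics have to be exposed alongside width, height and frame rate.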

At the moment the Media Capture and Streams API provides access to the 
capabilities of sources including framerate, width and height.

In order to deliver an accurately calibrated RGB and D stream we need 
to extend these capabilities to include Focal Length and/or Horizontal 
and Vertical Field of View.

If we look at existing models such as the Android Camera.Parameters 
API, we can see that these values are already defined and commonly 
used:
- Focal Length[7]
- Horizontal Field of View (View Angle)[8]
- Vertical Field of View (View Angle)[9]

By making these extra capabilities available to developers we will open 
up a whole new range of stream post-processing use cases and 
applications. Without them it is not possible to create a calibrated 
RGB and D stream, and much of the utility of the Depth Stream 
Extension will be blocked. I hope I have also provided some good 
evidence that this information is already commonly available in 
programmatic form (at least on the Android platform).

We look forward to hearing your thoughts and feedback.


[1] http://reconstructme.net/qa_faqs/how-do-i-create-a-3d-selfie/
[2] http://skanect.occipital.com/sample-models/
[3] http://www.w3.org/wiki/Media_Capture_Depth_Stream_Extension
[4] http://en.wikipedia.org/wiki/Camera_resectioning#Intrinsic_parameters
[6] http://www.ics.uci.edu/~majumder/VC/classes/cameracalib.pdf
Received on Thursday, 19 June 2014 23:30:54 UTC
