Focal length/FOV capabilities and general camera intrinsics

Hi,

as part of defining the Media Capture Depth Stream Extension we have 
proposed that the depth streams can be accessed in one of two ways.
1. As a standalone stream of depth images.
2. As a combination of RGB and D images.

The combined stream in option 2 is commonly used by applications to, 
for example, build a colour-based texture map at the same time as a 3D 
scene reconstruction[1][2]. These use cases are covered by at least 
UC2 and UC5 in our initial extension proposal[3].

However, in order to provide this combined RGB and D stream, the two 
streams must be calibrated so the data from each can be aligned. 
Without that calibration it is not technically possible to 
mathematically reconstruct or combine the relevant data models.

In order to calibrate these two streams it is important to have access 
to the camera intrinsics for both the depth sensor and the RGB camera 
(a brief sketch of these parameters follows the links below). For 
those not familiar with the pinhole camera model, here is some useful 
background information:
- a very brief introduction[4]
- a software-focused introduction[5]
- a more mathematically focused introduction[6]
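
As a rough sketch of what those intrinsics look like in code (the 
structure and field names below are illustrative assumptions, not 
spec text):

  // A minimal sketch of pinhole camera intrinsics. The field names
  // (fx, fy, cx, cy) are illustrative, not from any spec.
  interface CameraIntrinsics {
    fx: number; // focal length along x, in pixels
    fy: number; // focal length along y, in pixels
    cx: number; // principal point x, in pixels
    cy: number; // principal point y, in pixels
  }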

Once we have the dimensions of the image plane (i.e. image width and 
height) and the focal length, we can calculate the field of view (or 
vice versa) and map points from the 2D image plane into the 3D scene, 
as sketched below. This is useful not only for the Media Capture 
Depth Stream Extension but also for Augmented Reality image stream 
processing in general.
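
For example, assuming the CameraIntrinsics sketch above, the 
conversions and the 2D-to-3D mapping look roughly like this (a 
non-normative sketch; the horizontal case pairs the image width with 
fx, the vertical case the image height with fy):

  // Field of view (radians) from a focal length in pixels, and back.
  function focalToFov(focalPx: number, imageSizePx: number): number {
    return 2 * Math.atan(imageSizePx / (2 * focalPx));
  }

  function fovToFocal(fovRad: number, imageSizePx: number): number {
    return imageSizePx / (2 * Math.tan(fovRad / 2));
  }

  // Back-project pixel (u, v) with a known depth into a 3D point in
  // the camera's coordinate frame (same distance units in and out).
  function backProject(u: number, v: number, depth: number,
                       k: CameraIntrinsics) {
    return {
      x: ((u - k.cx) * depth) / k.fx,
      y: ((v - k.cy) * depth) / k.fy,
      z: depth,
    };
  }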

At the moment the Media Capture and Streams API provides access to 
source capabilities including frame rate, width and height.

In order to deliver an accurately calibrated RGB and D stream, we need 
to extend these capabilities to include the focal length and/or the 
horizontal and vertical fields of view.
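
To make that concrete, here is one hypothetical shape such 
capabilities could take when read via getCapabilities(). None of the 
intrinsics-related property names below exist today; they are 
assumptions for illustration only.

  // Sketch: read hypothetical intrinsics-related capabilities from a
  // video track. The extra properties are invented for illustration.
  async function readIntrinsicsCapabilities() {
    const stream =
      await navigator.mediaDevices.getUserMedia({ video: true });
    const [track] = stream.getVideoTracks();
    const caps = track.getCapabilities() as MediaTrackCapabilities & {
      focalLength?: { min: number; max: number };           // pixels
      horizontalFieldOfView?: { min: number; max: number }; // degrees
      verticalFieldOfView?: { min: number; max: number };   // degrees
    };
    return caps;
  }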

If we look at existing models like the Android Camera.Parameters API, 
we can see that these values are already defined and commonly in use 
(a sketch after this list shows how the reported view angles map to 
pixel-space focal lengths).
- Focal Length[7]
- Horizontal Field of View (View Angle)[8]
- Vertical Field of View (View Angle)[9]
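
As a sketch of that mapping: given a view angle (e.g. from 
getHorizontalViewAngle()) and the corresponding image dimension in 
pixels, the focal length in pixels falls out of the same pinhole 
relation used above.

  // Derive a focal length in pixels from a view angle in degrees and
  // the matching image dimension in pixels. Assumes the reported view
  // angle corresponds to the captured image's full width or height.
  function focalPxFromViewAngle(viewAngleDeg: number,
                                imageSizePx: number): number {
    const fovRad = (viewAngleDeg * Math.PI) / 180;
    return imageSizePx / (2 * Math.tan(fovRad / 2));
  }

For example, a 60 degree horizontal view angle at 640 px width gives a 
focal length of roughly 554 px.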

By making these extra capabilities available to developers we will 
open up a whole new range of stream post-processing use cases and 
applications. Without them it is not possible to create a calibrated 
RGB and D stream, and much of the utility of the Depth Stream 
Extension will be blocked. I hope I have also provided some good 
evidence that this information is already commonly available in 
programmatic form (at least on the Android platform).

We look forward to hearing your thoughts and feedback.

roBman

[1] http://reconstructme.net/qa_faqs/how-do-i-create-a-3d-selfie/
[2] http://skanect.occipital.com/sample-models/
[3] http://www.w3.org/wiki/Media_Capture_Depth_Stream_Extension
[4] http://en.wikipedia.org/wiki/Camera_resectioning#Intrinsic_parameters
[5] http://docs.opencv.org/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html
[6] http://www.ics.uci.edu/~majumder/VC/classes/cameracalib.pdf
[7] http://developer.android.com/reference/android/hardware/Camera.Parameters.html#getFocalLength%28%29
[8] http://developer.android.com/reference/android/hardware/Camera.Parameters.html#getHorizontalViewAngle%28%29
[9] http://developer.android.com/reference/android/hardware/Camera.Parameters.html#getVerticalViewAngle%28%29
