W3C home > Mailing lists > Public > public-media-capture@w3.org > July 2014

[depth] depth value encoding proposal

From: Hu, Ningxin <ningxin.hu@intel.com>
Date: Fri, 4 Jul 2014 00:16:06 +0000
To: "public-media-capture@w3.org" <public-media-capture@w3.org>
CC: "Kostiainen, Anssi" <anssi.kostiainen@intel.com>, Rob Manson <roBman@buildAR.com>
Message-ID: <1298B14A1D0704468AE73FC92557A16225689C07@SHSMSX104.ccr.corp.intel.com>
Hi,

As working on Media Capture Depth Stream Extension [1], we need to specify the depth value representation of Depth Stream (tracked as issue11 [2]). The precision and range of depth value are important to 3D object scan use case (UC2 in corresponding wiki [3]). To achieve good 3D object scan quality, for example itSeez3D demo [4], it requires at least 16-bits depth representation with 1 millimeter precision.

Native depth camera SDKs usually provides 16bit integer (short) with millimeter unit depth value, such as MS Kinect SDK and Intel Perceptual Computing SDK etc.,
On web platform, Depth Stream should also expose the 16-bits depth value via Canvas. However, the problem is today's either Canvas or WebGL doesn't support 16-bits image format.

One possible solution is to tone-map 16-bits to 8-bits. Chrome's implementation for Tango's depth camera employs this method [5]. it throws lower 4 bits and right shift upper 8 bits. If the raw depth unit is millimeter, it can represent 0-4096mm range and 16mm precision depth value. Tone-mapped 8-bits depth encoding is simple, e.g. the WebGL shader decoder sample is very straightforward [6]. However, our prototype shows it generates 'bricky' point cloud as [7].

Alternatively, another solution is to encode 16-bits into 3x8-bits components (RGB). This paper [8] elaborates this encoding scheme in very details. It can represent 16-bits depth value (0-65535mm range and 1mm precision ) as well as work with existing Canvas/WebGL image format. It also has good quality via video codec which is important for WebRTC usage. Although this encoding is more complex (sample WebGL shader decoder at [9]), we didn't find performance issues when prototyping. 16-bits depth value generates "smooth" point cloud, just as prototype shows [10]

So we propose to use 16-bits to 3x8-bits encoding in Media Capture Depth Stream Extension spec before web platform supports 16-bits image format widely.

Do you have any concerns?

Thanks,
-ningxin

[1] https://github.com/w3c/mediacapture-depth
[2] https://github.com/w3c/mediacapture-depth/issues/11
[3] https://www.w3.org/wiki/Media_Capture_Depth_Stream_Extension
[4] https://sketchfab.com/models/f75614f2297e4ed8bda99e6c24e0d469
[5] https://src.chromium.org/viewvc/chrome?revision=260798&view=revision
[6] https://github.com/huningxin/depthcam_three.js/blob/gh-pages/index.html#L81
[7] https://cloud.githubusercontent.com/assets/1005673/3476322/5eeb18da-02fd-11e4-9a87-16093c76e032.png
[8] http://web4.cs.ucl.ac.uk/staff/j.kautz/publications/depth-streaming.pdf
[9] https://github.com/huningxin/depthcam_three.js/blob/16bits-depth/index.html#L79
[10] https://cloud.githubusercontent.com/assets/1005673/3476336/a288731c-02fd-11e4-847a-f8b95a468eff.png
Received on Friday, 4 July 2014 00:16:43 UTC

This archive was generated by hypermail 2.4.0 : Friday, 17 January 2020 16:26:28 UTC