RE: [depth] depth value encoding proposal

Hi Benjamin,

On native platforms, a depth image is usually represented in a Gray16 or Z16 format, which is still treated as a kind of image. As for the metadata, such as depth units, FOV, and focal length, Rob proposed in another thread that they be exposed as capabilities [1].
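
As a rough sketch of what the capability approach might look like from script, assuming it were adopted (the depthUnit, horizontalFov, and focalLength names below are invented for illustration, and the promise-based getUserMedia form is used for brevity):

// Hypothetical sketch only: assumes depth metadata is surfaced through the
// existing MediaStreamTrack capabilities mechanism. The "depthUnit",
// "horizontalFov", and "focalLength" names are invented for illustration.
const stream = await navigator.mediaDevices.getUserMedia({ video: true });
const [track] = stream.getVideoTracks();
const caps: any = track.getCapabilities();

console.log(caps.depthUnit);     // e.g. "millimeter" (hypothetical)
console.log(caps.horizontalFov); // e.g. 58.4 degrees (hypothetical)
console.log(caps.focalLength);   // e.g. 1.88 mm (hypothetical)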
What do you think?

Thanks,
-ningxin

[1] http://lists.w3.org/Archives/Public/public-media-capture/2014Jun/0152.html


From: Benjamin Schwartz [mailto:bemasc@google.com]
Sent: Tuesday, August 19, 2014 1:18 PM
To: Harald Alvestrand
Cc: public-media-capture@w3.org
Subject: Re: [depth] depth value encoding proposal

I misunderstood some of the preceding discussion, not realizing that all mention of depth-to-color mappings had already been removed.  That's good.  However, the proposed solution (ImageData with special types) has several remaining problems, as noted in the draft's "Issue 1".

I think depth and color are sufficiently different that it might be better to create a new DepthData type than to reuse the ImageData type.  A DepthData object could also indicate whether the units are known (e.g. millimeters) or unknown, and specify the physical field of view, along with other valuable metadata.
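
To make that concrete, here is an illustrative sketch of the shape such a DepthData interface could take; every name in it is an assumption for discussion, not proposed spec text:

// Illustrative sketch only; all names are assumptions, not proposed spec text.
type DepthUnit = "millimeter" | "unknown";

interface DepthData {
  readonly width: number;   // in pixels
  readonly height: number;  // in pixels
  // Raw depth samples, row-major; Uint16Array for Gray16/Z16 sources,
  // Float32Array if a float path is taken as discussed below.
  readonly data: Uint16Array | Float32Array;
  readonly unit: DepthUnit; // known physical unit, or "unknown"
  readonly horizontalFovDegrees?: number; // physical field of view, if known
  readonly verticalFovDegrees?: number;
}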

On Tue, Aug 19, 2014 at 3:31 PM, Harald Alvestrand <harald@alvestrand.no> wrote:
On 08/19/2014 06:01 PM, Benjamin Schwartz wrote:
On Tue, Aug 19, 2014 at 10:58 AM, Kostiainen, Anssi <anssi.kostiainen@intel.com> wrote:
Hi Benjamin,

Thanks for your comments, and sorry for the late reply due to the vacation period.

I noticed the following comments (1) and (2) that you've made, and would like to check their status and ask for your help in filling in any gaps:

(1) "I do not think this kind of arbitrary mapping is appropriate in a W3C standard.  We should arrange for a natural representation instead, with modifications to Canvas and WebGL if necessary.” [1]

To make sure I understood this correctly:

Firstly, you're proposing we patch CanvasRenderingContext2D and make ImageData.data of type ArrayBufferView instead of Uint8ClampedArray, to allow the Uint16Array type, correct? Or do you have better suggestions?

I would suggest going all the way to Float32 as well.
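
If ImageData.data were widened to ArrayBufferView as described above, consuming code might look roughly like the following; note that getImageData returning anything other than a Uint8ClampedArray is hypothetical:

// Hypothetical sketch: assumes ImageData.data has been widened from
// Uint8ClampedArray to ArrayBufferView, so a depth canvas can return
// Uint16Array (Gray16/Z16) or Float32Array samples.
declare const depthCanvas: HTMLCanvasElement;

const ctx = depthCanvas.getContext("2d")!;
const image = ctx.getImageData(0, 0, depthCanvas.width, depthCanvas.height);
const samples = image.data as unknown; // today this is always Uint8ClampedArray

if (samples instanceof Uint16Array) {
  // 16-bit integer depth, e.g. raw Z16 values.
  console.log("first depth sample:", samples[0]);
} else if (samples instanceof Float32Array) {
  // Float depth, per the suggestion above.
  console.log("first depth sample:", samples[0]);
}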

Secondly, should we extend the WebGLRenderingContext along the lines of the LUMINANCE16 extension proposal [2] -- or would you prefer to use the depth component of a GL texture, as you previously suggested? Any known issues? I'd like to hear your latest thoughts on this and, if possible, a concrete proposal for how you'd prefer this to be spec'd so that it's practical for developers and logical for implementers.

I'd strongly prefer to use a depth component (and also have an option to use 32-bit float).  This would raise the GLES version requirement for WebGL, but I think this is not an unreasonable requirement for a feature that also requires a depth camera!
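
A sketch of the upload path in WebGL 2 (GLES 3) terms, since core WebGL 1 does not accept DEPTH_COMPONENT texture uploads; depthPixels stands in for one 16-bit depth frame, and a Float32 pipeline would use DEPTH_COMPONENT32F with FLOAT-typed data instead:

// Sketch only: assumes a WebGL 2 context, i.e. the raised version
// requirement mentioned above. "depthPixels" is one 16-bit depth frame.
declare const canvas: HTMLCanvasElement;
declare const depthPixels: Uint16Array; // width * height samples

const gl = canvas.getContext("webgl2") as WebGL2RenderingContext;
const tex = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, tex);

// 16-bit depth; a Float32 pipeline would instead pass DEPTH_COMPONENT32F
// as the internal format with FLOAT data.
gl.texImage2D(
  gl.TEXTURE_2D, 0, gl.DEPTH_COMPONENT16,
  canvas.width, canvas.height, 0,
  gl.DEPTH_COMPONENT, gl.UNSIGNED_SHORT, depthPixels
);

// Depth textures should be sampled with NEAREST filtering.
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);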

(2) "I don't think getDepthTracks() should return color-video.  If you want to prototype using depth-to-color mappings, the logical way is to treat the depth channel as a distinct color video camera, accessible by the usual device selection mechanisms.” [3]

This is what the spec currently says re getDepthTracks():

[[

The getDepthTracks() method, when invoked, must return a sequence of MediaStreamTrack objects representing the depth tracks in this stream.

The getDepthTracks() method must return a sequence that represents a snapshot of all the MediaStreamTrack objects in this stream's track set whose kind is equal to "depth". The conversion from the track set to the sequence is user agent defined and the order does not have to be stable between calls.

]]

Do you have a concrete proposal for how you'd tighten the prose to clear up the confusion?

I would not include |getDepthTracks| until we have a depth datatype.  Instead, I would add each depth camera as an input device of kind: 'video' in the list returned by enumerateDevices(), with its label containing a human-readable indication that its color video data is computed from depth via a mapping defined by the user agent.
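
A sketch of what that interim arrangement could look like to script, using enumerateDevices() with the "videoinput" device kind; the "depth" label convention is invented for illustration, and nothing would specify it:

// Sketch of the interim suggestion above: a depth camera appears as an
// ordinary video input, distinguishable only by its human-readable label.
const devices = await navigator.mediaDevices.enumerateDevices();
const depthLikeCameras = devices.filter(
  (d) => d.kind === "videoinput" && /depth/i.test(d.label)
);

for (const cam of depthLikeCameras) {
  // e.g. "Depth camera (color video mapped from depth by the user agent)"
  console.log(cam.deviceId, cam.label);
}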

But once this spec is implemented, we have a depth datatype... or did I misunderstand the proposal?

I'm afraid that if we spec out a path where depth is treated as a weird kind of video, we'll never escape from it, so I'd be happy to see a fully-worked proposal that includes both the depth datatype and the depth track definition.

Received on Wednesday, 20 August 2014 18:11:35 UTC