Re: Media Capture Depth Stream Extension - call for review from Kostiainen, Anssi on 2014-03-13 (public-media-capture@w3.org from March 2014)

From: Kostiainen, Anssi <anssi.kostiainen@intel.com>
Date: Thu, 13 Mar 2014 16:40:25 +0000
To: "Cullen Jennings (fluffy)" <fluffy@cisco.com>, "Hu, Ningxin" <ningxin.hu@intel.com>, Rob Manson <roBman@mob-labs.com>
CC: "public-media-capture@w3.org" <public-media-capture@w3.org>
Message-ID: <2AE7AF88-0028-4FE5-8FAC-8BCA138CF056@intel.com>

Hi Cullen, Ningxin, Rob, All,

On 13 Mar 2014, at 16:58, Cullen Jennings (fluffy) <fluffy@cisco.com> wrote:

> I like the direction you are going with this … few small comments

Cullen - thanks for your comments.

> I think the term "range" would be better than "depth" as that is what is used in lots of other contexts.

We’re certainly open to suggestions on how to name things. We should choose whichever name is most natural to web developers.

> For cameras that do this based on triangulation (either laser or stereo), you end up with really bad fidelity in some cases when you use integers because the values tend to be linearly related to the inverse of distance. At the browser API level, I think this would work best if the values were floating point numbers, preferably representing range in meters. If integers are used, it gets complicated how to map them, and the mapping depends on the actual distance. So I strongly support the proposal in the issues section to use FLOAT in meters.

I can see how getting the actual distance would be helpful.
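
To make the fidelity concern concrete, here is a minimal sketch assuming an idealized pinhole stereo model in which range equals focal length times baseline divided by disparity. The constants and function names are hypothetical and chosen only for illustration; they do not describe any particular camera or proposed API.

```typescript
// Hypothetical camera parameters, for illustration only.
const FOCAL_LENGTH_PX = 600; // focal length expressed in pixels
const BASELINE_M = 0.075;    // distance between the two sensors, in meters

/** Range in meters for an integer disparity, under the assumed pinhole model. */
function disparityToMeters(disparity: number): number {
  if (disparity <= 0) return Number.POSITIVE_INFINITY; // no match: infinitely far
  return (FOCAL_LENGTH_PX * BASELINE_M) / disparity;
}

/** Gap in meters between the ranges of two adjacent integer disparities. */
function rangeStepMeters(disparity: number): number {
  return disparityToMeters(disparity) - disparityToMeters(disparity + 1);
}
```

With these assumed numbers, adjacent integer values are roughly 5.5 mm apart at a range of 0.5 m (disparity 90) but 0.5 m apart at a range of 5 m (disparity 9), so the integer scale is far from uniform in distance.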

Ningxin - perhaps you can clarify whether there are known issues in getting the actual distance from depth cameras on the market today, or any other related issues the group should be aware of?

Re floats, for interacting with ImageData and CanvasRenderingContext2D, the most straightforward way would perhaps be to use one of the ArrayBufferView types:

  https://www.khronos.org/registry/typedarray/specs/latest/#7

Cullen - I guess you’d prefer Float64Array that corresponds to unrestricted double in WebIDL types?
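
As a rough sketch of what a Float64Array-backed range image could look like, assuming a row-major layout with one value in meters per pixel: the RangeFrame interface and both helper functions below are hypothetical names for illustration and are not taken from the draft.

```typescript
/** Hypothetical container for one range frame; not part of any spec. */
interface RangeFrame {
  width: number;
  height: number;
  data: Float64Array; // one range value in meters per pixel, row-major
}

/** Allocates a frame with every pixel initialized to 0 meters. */
function createRangeFrame(width: number, height: number): RangeFrame {
  return { width, height, data: new Float64Array(width * height) };
}

/** Reads the range in meters at pixel (x, y). */
function rangeAt(frame: RangeFrame, x: number, y: number): number {
  if (x < 0 || y < 0 || x >= frame.width || y >= frame.height) {
    throw new RangeError(`pixel (${x}, ${y}) is outside the frame`);
  }
  return frame.data[y * frame.width + x];
}
```

Because the element type is an unrestricted double, a pixel with no valid measurement could in principle carry NaN or Infinity rather than a reserved integer value. A Float32Array would halve the memory per frame at reduced precision.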

How to wire this into the WebGLRenderingContext is work in progress. There are basically two alternatives as noted in the last bullet of the Issues section:

  https://www.w3.org/wiki/Media_Capture_Depth_Stream_Extension#Issues

> Imagine a stereo camera that gives me a left and right camera plus a range image. So a call to getUserMedia might return two video tracks plus a range track. I think we need to say something about whether the range image is registered to the left or right camera. Probably just saying that the first range track is registered to whatever the first video track is would do. Am I making any sense here or do I need to explain this better?

I think I get what you mean. This requirement indeed needs to be addressed, if not in v1, then in a later iteration of the spec.

One general solution that scales to any number of streams of any type would probably be (in the abstract) something like what is being proposed as MediaDeviceInfo.groupId. Alternatively, we could provide a method that takes a MediaStream as an argument and returns the other MediaStreams associated with it, if any.
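
To sketch the groupId idea in the abstract: if every source exposed by one physical device shared a group identifier, an application could collect the associated sources with a simple grouping step. The DeviceLike shape below only mirrors the deviceId, groupId and kind fields of MediaDeviceInfo, and the function name is hypothetical; how a range source would actually be exposed is an open question.

```typescript
/** Minimal stand-in for the MediaDeviceInfo fields used here. */
interface DeviceLike {
  deviceId: string;
  groupId: string;
  kind: string;
}

/** Groups devices so that sources sharing a groupId end up together. */
function groupByGroupId(devices: readonly DeviceLike[]): Map<string, DeviceLike[]> {
  const groups = new Map<string, DeviceLike[]>();
  for (const device of devices) {
    const members = groups.get(device.groupId) ?? [];
    members.push(device);
    groups.set(device.groupId, members);
  }
  return groups;
}
```

Under this assumption, the left video, right video and range sources of one stereo camera would land in the same group, while an unrelated webcam would land in another.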

Thanks,

-Anssi


Received on Thursday, 13 March 2014 16:42:37 UTC