- From: Blair MacIntyre <bmacintyre@mozilla.com>
- Date: Tue, 21 Aug 2018 13:44:52 -0700
- To: public-immersive-web@w3.org, Leonard Daly <web3d@realism.com>
- Message-ID: <CADvj9uwUkzJfbKtW6nZS=jhxvKfmYt_hFsgr37zi9oBN6oVVZA@mail.gmail.com>
Right now, WebGL supports the ability to block readback; for example, you cannot read the pixels out of textures created from certain kinds of images. So there is no reason that some set of actions could not cause WebGL readback to fail. BUT, using such an approach to access camera data is a horrible idea: reading back the frame buffer seems like it would stall the GPU, for example, and the camera image would be at the resolution of the screen, not of the original camera stream. And so on. In my opinion, this capability (reading back camera data from a texture or frame buffer) should be explicitly disallowed and called out as a bug.

There were multiple discussions over the past year about whether WebXR should expose an API to retrieve camera data. I even went to the extreme of implementing one in the WebXR Viewer. At the time, there was significant pushback from folks in the group saying that if we want to give access to camera data, we should enhance existing APIs (like gUM) to provide the data we want. In my blog post about these computer vision experiments, I included a number of pointers to proposed extensions to gUM, some of which have been tested and implemented (by folks at Intel, for example).

Regarding your list, I would reorganize it, based on discussions I've had with security folks here:

1. Straight to display (not available to JavaScript)
2. Mesh and other platform-inferred data available to JavaScript
3. Camera + platform-inferred data available to JavaScript

I swap the last two because once we give JavaScript the camera video, especially when the camera data is coupled with accurate pose estimates, it's "trivial" to do a full reconstruction of the space, along with super-resolution textures from overlapping video frames. Giving video + WebXR-scale tracking is truly "giving away the farm", and when you think about the kinds of shady things unsavory characters will do on the web, it's actually somewhat terrifying.

So, I would actually expect that a successful WebXR AR ecosystem would have most apps limiting themselves to 1 and 2 above. Very few apps (mostly from very trusted sources) would reasonably expect users to consent to full camera access.

On August 21, 2018 at 3:06:09 PM, Leonard Daly (web3d@realism.com) wrote:

> On the call today there was a discussion regarding the means for handling camera data and making it available to a developer of browser code. I presume that for an AR (browser) application the camera (preferably environment-facing) is on and composited into the display along with elements provided by the web page.
>
> The discussion centered on the levels of permission that might be defined. The list included:
>
> 1. Straight to display (not available to JavaScript)
> 2. Available to JavaScript
> 3. Camera + mesh available to JavaScript
>
> At some point in the device's GPU the video stream must be composited with the computer-rendered elements. From what I have read (not done) of WebGL, it should be possible to read the composited pixels from the GPU's frame buffer and return them to JavaScript. There may be a restriction built into WebGL that prevents reading a frame buffer that was not created by that JavaScript context. If this is true, then the read operation would return the generated elements + transparent pixels, and I don't think there is a problem.
>
> --
> Leonard Daly
> 3D Systems Architect & Cloud Consultant
> President, Daly Realism - Creating the Future

--
Blair MacIntyre
Principal Research Scientist
bmacintyre@mozilla.com
https://pronoun.is/he/him
https://blairmacintyre.me
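[A rough sketch of the readback path discussed above, assuming a WebGL canvas the page itself created and renders into; the variable names are placeholders, and this is illustrative, not a recommended technique:]

    // Synchronously pulling composited pixels out of a WebGL drawing buffer,
    // as described in the quoted message. Assumes `canvas` is the page's own
    // WebGL canvas (a placeholder for this sketch).
    const canvas = document.querySelector('canvas');
    const gl = canvas.getContext('webgl');

    const w = gl.drawingBufferWidth;
    const h = gl.drawingBufferHeight;
    const pixels = new Uint8Array(w * h * 4);

    // readPixels forces the GPU to finish pending work before copying, which
    // is the stall concern mentioned above; the result is also only screen
    // resolution, not the native camera resolution.
    gl.readPixels(0, 0, w, h, gl.RGBA, gl.UNSIGNED_BYTE, pixels);

    // WebGL already blocks readback in some cases: uploading a cross-origin
    // image without CORS approval throws a SecurityError, keeping the drawing
    // buffer origin-clean. A similar restriction could, in principle, cover
    // buffers containing camera data.
    // const img = new Image();
    // img.src = 'https://other-origin.example/photo.jpg';  // hypothetical URL
    // img.onload = () => gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA,
    //                                  gl.UNSIGNED_BYTE, img);  // throws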
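[For reference, the existing gUM path mentioned above, which already gates camera access behind a user permission prompt; the constraint values are only illustrative:]

    // Camera frames arrive as a MediaStream behind a permission prompt,
    // independent of WebXR; proposed extensions to gUM would build on this.
    async function startCamera() {
      const stream = await navigator.mediaDevices.getUserMedia({
        video: { facingMode: 'environment' }  // rear-facing camera, as in AR
      });
      const video = document.createElement('video');
      video.srcObject = stream;
      await video.play();
      return video;  // could then be uploaded as a WebGL texture each frame
    }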
Received on Tuesday, 21 August 2018 20:45:16 UTC