Re: [w3ctag/design-reviews] WebXR Device API (#403)

Thank you for your feedback! I'll answer what I can below, with some tasks broken out into separate issues/PRs as indicated.

Focusing on the **Explainer/API** questions first, since those can generally be answered more concisely:

> * Please add a table of contents!

Thank you for demonstrating an effective way to do this in [your explainer PR](https://github.com/immersive-web/webxr/pull/818). If we don't merge that PR directly we'll be sure to add a TOC ourselves soon.

> * The explainer lists immersive video as a use case. Why would we not design an extension to `<video>` for this case?

We would very much like to see immersive playback in the `<video>` tag in the near future, but feel that implementing WebXR is an appropriate first step to getting there, in the spirit of [the extensible web manifesto](https://extensiblewebmanifesto.org/). Specifically, immersive `<video>` support can effectively be polyfilled with WebXR, while the reverse is not true. And, of course, a more general API like WebXR can also support many other non-video use cases, which has already proven to be valuable.

Additionally, there is not yet consensus on the video/audio formats and projection techniques that are optimal for these use cases. (This is a similar problem to map projection, in that there's no "perfect" way to lay out the surface of a sphere on a flat plane.) Similarly, we've seen on the 2D web that various video players are not satisfied with the default video controls and will frequently provide their own. It's reasonable to expect that trend to continue with immersive video and it is not yet clear what the appropriate mechanism is for providing custom controls in that environment, whereas in WebXR it's implicitly the application's responsibility to render them.

By starting with an imperative API we give developers a lot more flexibility in how they store, transmit, display, and control their content which ideally will help inform future discussions around what knobs and levers are necessary to add to the `<video>` tag. (And even then WebXR will serve as a fallback if your content doesn't fit into one of the canonical formats.) We do expect, and already see, libraries built around the API to simplify video playback, and would anticipate that those libraries could trivially redirect their functionality to a `video` tag should support be added in the future.

> * Why does `navigator.xr.supportsSession()` not return a `Promise<boolean>` rather than rejecting in the case that the session type is not supported? That would seem like a better semantic match to the wording of the method, as well as not requiring the author to program a simple feature detection step in a try/catch style.

I've [opened an issue](https://github.com/immersive-web/webxr/issues/824) for further discussion on this topic, since it's one of the few potentially breaking changes you've brought up. It seems to me, though, like our usage here is in line with other similar methods that return a `Promise<void>` in APIs such as WebUSB and WebAudio. Are there guidelines regarding this type of use that we could refer to?
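
For illustration, a minimal sketch of how the current rejection-based shape reads in application code (the `showEnterVRButton()`/`hideEnterVRButton()` helpers are hypothetical):

```js
// Current shape: supportsSession() resolves with no value when the mode is
// supported and rejects when it isn't, so feature detection reads like this:
navigator.xr.supportsSession('immersive-vr')
  .then(() => showEnterVRButton())   // Hypothetical UI helper.
  .catch(() => hideEnterVRButton()); // Rejection means "not supported" here,
                                     // not necessarily an error condition.
```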

> * Could you elaborate on why `inline` is the default?

This was actually left in the explainer erroneously. There is no default mode, which is reflected in the rest of the explainer and the spec IDL. ([PR to fix](https://github.com/immersive-web/webxr/pull/825)) Historically `inline` was the default because it was the mode that required the least user consent.
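
In other words, callers always name the mode they want explicitly. A minimal sketch (inside an async function):

```js
// The session mode is always passed explicitly; there is no implied default.
const session = await navigator.xr.requestSession('inline');
```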

> * Naming: it seems like the `vr` in `immersive-vr` is both redundant and inaccurate (since it doesn't encompass AR). Could it just be `immersive`?

We intend to introduce an `immersive-ar` mode in a spec module soon after WebXR ships. In a previous iteration of the spec we specified the session mode as a dictionary, which treated "immersive" as a separate boolean and had a separate field for specifying that AR capabilities were desired like so:

```js
// Not compatible with the current spec!
navigator.xr.requestSession({
  immersive: true,
  ar: true
}).then(/*...*/);
```

The primary issue this introduced was that it implied that a non-immersive AR mode was a possibility, when we had no intent of ever supporting one. Additionally, every new mode added later would have had to define how it interacted with each of those booleans, even when they weren't applicable. The use of enums was eventually deemed the cleaner approach.
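
For contrast, the equivalent request under the current enum-based shape would look roughly like this (noting that `immersive-ar` is still pending that future module):

```js
// Enum-based equivalent of the dictionary example above. 'immersive-ar' is
// expected to arrive in a follow-on spec module rather than core WebXR.
navigator.xr.requestSession('immersive-ar').then(/*...*/);
```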

> * The explainer doesn't provide example code for avoiding or handling WebGL context loss. Is the author supposed to avoid it altogether by using `makeXRCompatible()`, or are there other factors to consider?

Issue filed to [ensure we demonstrate handling context loss](https://github.com/immersive-web/webxr/issues/826).

More generally, there are two routes to ensuring context compatibility. If the context is created with the `xrCompatible: true` context creation argument, then the returned context will be compatible with WebXR use and no context loss will be incurred for that reason. (The system may still lose the context for other reasons, such as reinstalling the graphics driver.) This is appropriate for pages whose primary purpose is to display WebXR content. For pages where immersive content is a secondary feature, making the context compatible from the start may introduce undesired side effects (such as causing the context to run on a discrete GPU instead of a more battery-friendly integrated GPU), so the compatibility bit can be set later using the `makeXRCompatible()` method. This may force a context loss on some devices if the context needs to be moved to a new adapter (while on others, such as those with only a single GPU, it can be a no-op).
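
As a rough, illustrative sketch of the two routes (the `canvas`/`otherCanvas` elements and the `reloadGraphicsResources()` helper are assumptions, not spec API):

```js
// Route 1: XR is the page's primary purpose, so request compatibility up
// front and no XR-related context loss will occur.
const gl = canvas.getContext('webgl', { xrCompatible: true });

// Route 2: XR is a secondary feature. Create the context normally and opt in
// later; on some multi-GPU devices this can force a context loss, so the page
// should already be handling the usual loss/restore events.
const gl2 = otherCanvas.getContext('webgl');
otherCanvas.addEventListener('webglcontextlost', (event) => {
  event.preventDefault(); // Signal that the context will be restored.
});
otherCanvas.addEventListener('webglcontextrestored', () => {
  reloadGraphicsResources(gl2); // Hypothetical helper: re-create GPU resources.
});
await gl2.makeXRCompatible();
```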

> * Similarly, a code example for handling XR device changes would be useful.

Issue filed to [add a `devicechange` event code sample](https://github.com/immersive-web/webxr/issues/827).
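
Until that lands, a minimal sketch of the expected shape (the `updateXRButtonState()` helper is hypothetical):

```js
// Re-run capability checks whenever XR hardware is connected or disconnected,
// so the page's "Enter VR" UI stays in sync with the available devices.
navigator.xr.addEventListener('devicechange', () => {
  updateXRButtonState(); // Hypothetical helper that re-runs supportsSession().
});
```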

> * Could you deep-link to the section in the Spatial Tracking Explainer which explains how to handle `getViewerPose()` failures?

I'm not sure exactly what this is asking for? Deep link from where?

> * Might it be helpful to provide a code example showing how to use the `transform` property of the `XRViewerPose` value?

Issue filed to [add more code samples for `XRRigidTransform` use](https://github.com/immersive-web/webxr/issues/828).
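
In the meantime, a rough sketch of the most common use of the pose's `transform` in a frame callback (`xrReferenceSpace` and `drawScene()` are assumed to exist in the app):

```js
function onXRFrame(time, frame) {
  const pose = frame.getViewerPose(xrReferenceSpace);
  if (pose) {
    for (const view of pose.views) {
      // XRRigidTransform exposes both a decomposed form (position/orientation)
      // and a Float32Array matrix; the inverse of the view's transform is the
      // view matrix most WebGL rendering code expects.
      const viewMatrix = view.transform.inverse.matrix;
      const projectionMatrix = view.projectionMatrix;
      drawScene(viewMatrix, projectionMatrix); // Hypothetical render helper.
    }
  }
  frame.session.requestAnimationFrame(onXRFrame);
}
```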

> * Could you expand on the key concept of a Layer?

A layer is simply an image that will be displayed on the XR hardware in some way. Right now the concept is pretty minimal: only a WebGL layer is exposed initially, and only one active layer is allowed at a time. But we have known features that we'd like to implement in the future that would expand the types of layers that can be used and give more flexibility to how they're presented. For example, when WebGPU ships we would introduce a new layer type that allows a WebGPU context to render to the headset, and in the shorter term we'd like to add a layer type that takes better advantage of WebGL 2 features.

Other examples of how we may use layers in the future:
 - Displaying encrypted video streams
 - Displaying DOM content
 - Higher quality 2D surfaces
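
For concreteness, a minimal sketch of how the single WebGL layer is wired up today (assuming `session` and an XR-compatible context `gl` already exist):

```js
// Create a layer backed by an XR-compatible WebGL context and make it the
// session's sole active layer.
const xrLayer = new XRWebGLLayer(session, gl);
session.updateRenderState({ baseLayer: xrLayer });
```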

> * What are the key differences between an `XRWebGLLayer` and a `<canvas>`?

Slightly oversimplifying here, but a `<canvas>` is for compositing on the page and an `XRWebGLLayer` is for compositing on the headset. Both may share a WebGL context, and in the end both are simply hosts for a framebuffer that WebGL binds and renders into. By making the `XRWebGLLayer` a distinct concept we have greater control over the framebuffer it exposes and can create it in a way that's optimal for XR.
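
A sketch of what "hosting a framebuffer" looks like in practice during the frame callback (`drawScene()` is a hypothetical helper):

```js
session.requestAnimationFrame((time, frame) => {
  const layer = session.renderState.baseLayer;
  // Unlike a <canvas>, the layer hands out an opaque framebuffer sized and
  // formatted by the UA for the headset; the page just binds and draws into it.
  gl.bindFramebuffer(gl.FRAMEBUFFER, layer.framebuffer);
  gl.viewport(0, 0, layer.framebufferWidth, layer.framebufferHeight);
  drawScene(frame); // Hypothetical render helper.
});
```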

It's worth noting that previously in WebVR we effectively used the `<canvas>` as the layer, but this caused several problems, all rooted in the fact that a web page and a headset are very different mediums that benefit from tailored approaches. A couple of simple examples:

 - We were requiring developers to resize the canvas element to a resolution that was appropriate for the headset, which is typically quite large. This was easy to get wrong and frequently resulted in either grainy imagery in the headset or significantly oversized canvases on the page.
 - Presenting to a headset typically required taking ownership of the framebuffer that was going to be displayed, which often required an expensive copy because we didn't know if the same buffer would be shown on the page as well.
 - The canvas may be constructed with options (such as `preserveDrawingBuffer: true`) that weren't appropriate for use with XR hardware and introduced even more unnecessary overhead.

> * When might a session be blurred?

Having a concrete example in the explainer of when this state might apply would be a good idea. `visible-blurred` indicates that the user can see the application and it should respond appropriately to head movement to avoid user discomfort, but the user cannot interact with the app because input is captured by the system/UA. The most common scenario for this mode today is that many immersive computing devices have a "dashboard" that can be pulled up without quitting the immersive application by pressing a dedicated button on the controller. Similarly, if it doesn't pose a privacy/security risk, the UA may choose to display some dialogs to the user without exiting the immersive app.

A quick visual aid, showing Oculus' dashboard system:
![oculus-dash-gif-small](https://user-images.githubusercontent.com/805273/64291950-264c1680-cf1e-11e9-84a9-38afc4ffe3f1.gif)

Not all platforms support this type of interaction, especially if power is limited, and in those cases we would expect the session to toggle directly between `visible` and `hidden`. Alternatively, the UA may take steps to reduce the app's quality (such as lowering its resolution) to improve performance while a dialog is up, which is allowed by the `visible-blurred` state.
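
A small sketch of how an app might react to these transitions (the throttling/pausing helpers are assumptions, not spec API):

```js
session.addEventListener('visibilitychange', () => {
  switch (session.visibilityState) {
    case 'visible':
      resumeFullQualityRendering(); // Hypothetical helper.
      break;
    case 'visible-blurred':
      // Keep rendering and tracking head movement, but pause gameplay and
      // ignore input, since the system/UA has captured it.
      pauseInputAndReduceQuality(); // Hypothetical helper.
      break;
    case 'hidden':
      pauseRenderingEntirely(); // Hypothetical helper.
      break;
  }
});
```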

> Obviously I'd like to see the question of [accessibility](https://www.w3.org/WAI/APA/task-forces/research-questions/wiki/Accessibility_of_Virtual_Reality) addressed sooner rather than later.

We definitely understand the importance of accessibility, and also want to ensure that immersive web content does not unnecessarily exclude users due to faulty assumptions on the part of developers about the user's abilities. This is a large topic, however, and one that we've been seeing more discussion on recently, and so I think it would be more productive for us to outline our current thinking about accessibility in a separate doc which we'll link here. Needless to say, it's a complicated problem made more difficult by the imperative nature of the rendering APIs we rely on, the relative newness of the VR ecosystem, and the type of content the device capabilities encourage.  It seems likely that our accessibility story will span API enhancements, working with tool and content developers to take advantage of existing accessibility features when appropriate, encouraging best practices around use of audio and haptics, and detailing UA-level accessibility features that can apply to all content.
