Re: Proposal Virtual Reality "View Lock" Spec

I think it could make sense to put stuff like this as an extension on top
of WebGL and WebAudio, as they are the only two current APIs close enough
to the bare metal (low latency, high performance) to give a decent
experience. Also, I seem to remember that some earlier-generation VR
glasses solved the game-support problem by providing their own GL and
joystick drivers (today that would probably be device orientation events),
so many games didn't have to bother (too much) with the integration.

In theory - we could:

 - extend WebGL (if needed at all) to provide stereo vision
 - hook up WebAudio as is (it supports audio objects, the Doppler effect,
etc., similar to OpenAL)
 - hook up DeviceOrientation/Motion in desktop browsers if a WiiMote, HMD,
or other such device is connected
 - hook up getUserMedia as is to the potential VR camera

...and make it possible to do low-latency paths/hooks between them if
needed (rough sketch below).
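
For illustration only, here is a rough sketch of how the existing pieces
could be wired together today; the "stereo" part is faked by rendering the
scene twice into two viewports, and drawScene() is a placeholder of mine,
not a real API:

// Drive a stereo WebGL render from DeviceOrientation data using only
// existing APIs; "stereo" here is just two viewports on one canvas.
var canvas = document.querySelector('canvas');
var gl = canvas.getContext('webgl');

var orientation = { alpha: 0, beta: 0, gamma: 0 };
window.addEventListener('deviceorientation', function (e) {
  orientation.alpha = e.alpha; // degrees, as DeviceOrientation defines them
  orientation.beta = e.beta;
  orientation.gamma = e.gamma;
});

function drawEye(viewportX, eyeOffsetMeters) {
  gl.viewport(viewportX, 0, canvas.width / 2, canvas.height);
  // drawScene() is hypothetical: build a view matrix from `orientation`,
  // shift it sideways by eyeOffsetMeters, and render the scene.
  drawScene(orientation, eyeOffsetMeters);
}

function frame() {
  drawEye(0, -0.032);               // left eye, ~32 mm left of center
  drawEye(canvas.width / 2, 0.032); // right eye
  requestAnimationFrame(frame);
}
requestAnimationFrame(frame);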

It seems that all (or at least most) of the components are already present,
but the proper hooks still need to be added, for desktop browsers at least
(AFAIK ... it's been a while ;))

- Lars


On Wed, Mar 26, 2014 at 7:18 PM, Brandon Jones <bajones@google.com> wrote:

> So there are a few things to consider regarding this. For one, I think your
> ViewEvent structure would need to look more like this:
>
> interface ViewEvent : UIEvent {
>     readonly attribute Quaternion orientation; // Where Quaternion is 4
> floats. Prevents gimbal lock.
>     readonly attribute float offsetX; // offset X from the calibrated
> center 0 in millimeters
>     readonly attribute float offsetY; // offset Y from the calibrated
> center 0 in millimeters
>     readonly attribute float offsetZ; // offset Z from the calibrated
> center 0 in millimeters
>     readonly attribute float accelerationX; // Acceleration along X axis
> in m/s^2
>     readonly attribute float accelerationY; // Acceleration along Y axis
> in m/s^2
>     readonly attribute float accelerationZ; // Acceleration along Z axis
> in m/s^2
> }
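>
> (Purely a sketch of how a page might consume such an event; the
> "viewchange" name, the [x, y, z, w] layout of the quaternion, and the
> glMatrix-style mat4 helpers are all assumptions on my part:)
>
> // Hypothetical consumption of the event above.
> window.addEventListener('viewchange', function (e) {
>   // Orientation arrives as a quaternion, so no gimbal lock to worry about.
>   var viewMatrix = mat4.create();
>   mat4.fromQuat(viewMatrix, e.orientation); // assumes [x, y, z, w]
>   // Position is in explicit units (millimeters); convert once to meters.
>   mat4.translate(viewMatrix, viewMatrix,
>                  [e.offsetX / 1000, e.offsetY / 1000, e.offsetZ / 1000]);
>   mat4.invert(viewMatrix, viewMatrix); // camera transform -> view transform
> });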
>
> You have to deal with explicit units for a case like this and not
> clamped/normalized values. What would a normalized offset of 1.0 mean? Am I
> slightly off center? At the other end of the room? It's meaningless without
> a frame of reference. Same goes for acceleration. You can argue that you
> can normalize to 1.0 == 9.8 m/s^2 but the accelerometers will happily
> report values outside that range, and at that point you might as well just
> report in a standard unit.
>
> As for things like eye position and such, you'd want to query that
> separately (no sense in sending it with every event), along with other
> information about the device capabilities (screen resolution, FOV, lens
> distortion factors, etc.). And you'll want to account for the scenario
> where more than one device is connected to the browser.
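>
> (To make "query separately" concrete, a purely hypothetical enumeration
> call; nothing like this exists today, and every name below is invented:)
>
> // Hypothetical: enumerate connected HMDs and read their static
> // capabilities once, instead of repeating them in every event.
> navigator.getVRDevices().then(function (devices) {
>   devices.forEach(function (device) {
>     console.log(device.deviceName,
>                 device.eyeSeparationMillimeters,    // e.g. ~64 mm IPD
>                 device.fieldOfViewDegrees,          // per-eye FOV
>                 device.renderTargetWidth,           // per-eye resolution
>                 device.lensDistortionCoefficients); // for warp correction
>   });
> });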
>
> Also, if this is going to be a high-quality experience, you'll want to be
> able to target rendering to the HMD directly and not rely on OS mirroring
> to render the image. This is a can of worms in and of itself: How do you
> reference the display? Can you manipulate a DOM tree on it, or is it
> limited to WebGL/Canvas2D? If you can render HTML there how do the
> appropriate distortions get applied, and how do things like depth get
> communicated? Does this new rendering surface share the same Javascript
> scope as the page that launched it? If the HMD refreshes at 90hz and your
> monitor refreshes at 60hz, when does requestAnimationFrame fire? These
> are not simple questions, and need to be considered carefully to make sure
> that any resulting API is useful.
>
> Finally, it's worth considering that for a VR experience to be effective
> it needs to be pretty low latency. Put bluntly: browsers suck at this.
> Optimizing for scrolling large pages of flat content, text, and images is
> very different from optimizing for realtime, super-low-latency I/O. If you
> were to take an Oculus Rift and plug it into one of the existing
> browser/Rift demos <https://github.com/Instrument/oculus-bridge> with
> Chrome, you'll probably find that in the best case the rendering lags
> behind your head movement by about 4 frames. Even if your code is rendering
> at a consistent 60 Hz, that means you're seeing ~67 ms of lag (4 frames at
> ~16.7 ms each), which will result in a motion-sickness-inducing "swimming"
> effect where the world is constantly catching up to your head position. And
> that's not even taking into account the question of how well
> JavaScript/WebGL can keep up with rendering two high-resolution views of a
> moderately complex scene, something that even modern gaming PCs can
> struggle with.
>
> That's an awful lot of work for technology that, right now, does not have
> a large user base and for which the standards and conventions are still
> being defined. I think that you'll have a hard time drumming up support for
> such an API until the technology becomes a little more widespread.
>
> (Disclaimer: I'm very enthusiastic about current VR research. If I sound
> negative it's because I'm being practical, not because I don't want to see
> this happen)
>
> --Brandon
>
>
> On Wed, Mar 26, 2014 at 12:34 AM, Brandon Andrews <
> warcraftthreeft@sbcglobal.net> wrote:
>
>> I searched, but I can't find anything relevant in the archives. Since
>> pointer lock is now well supported, I think it's time to begin thinking
>> about virtual reality APIs. Since this is a complex topic, I think any
>> spec should start simple. With that in mind, I'm proposing we have a
>> discussion on adding head tracking. This should be very generic, with just
>> position and orientation information, so that whether the data is coming
>> from a webcam, a VR headset, or a future pair of glasses with eye
>> tracking, the interface would be the same. The event would be similar to
>> mouse move but with a high sample rate (which is why head tracking and eye
>> tracking are combined in the same event, representing the user's total
>> view).
>>
>> interface ViewEvent : UIEvent {
>>     readonly attribute float roll; // radians, positive is slanting the
>> head to the right
>>     readonly attribute float pitch; // radians, positive is looking up
>>     readonly attribute float yaw; // radians, positive is looking to the
>> right
>>     readonly attribute float offsetX; // offset X from the calibrated
>> center 0 in the range -1 to 1
>>     readonly attribute float offsetY; // offset Y from the calibrated
>> center 0 in the range -1 to 1
>>     readonly attribute float offsetZ; // offset Z from the calibrated
>> center 0 in the range -1 to 1, and 0 if not supported
>>     readonly attribute float leftEyeX; // left eye X position in screen
>> coordinates from -1 to 1 (but not clamped) where 0 is the default if not
>> supported
>>     readonly attribute float leftEyeY; // left eye Y position in screen
>> coordinates from -1 to 1 (but not clamped) where 0 is the default if not
>> supported
>>     readonly attribute float rightEyeX; // right eye X position in screen
>> coordinates from -1 to 1 (but not clamped) where 0 is the default if not
>> supported
>>     readonly attribute float rightEyeY; // right eye Y position in screen
>> coordinates from -1 to 1 (but not clamped) where 0 is the default if not
>> supported
>> }
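>>
>> (To make the intended use concrete, here is a hypothetical listener; the
>> "viewchange" event name and the camera/highlight helpers are made up:)
>>
>> // Hypothetical usage of the event above.
>> document.addEventListener('viewchange', function (e) {
>>   // Orient the camera from the Euler angles in the event (radians).
>>   camera.setRotation(e.pitch, e.yaw, e.roll);   // made-up helper
>>   // Eye positions could drive gaze-based UI highlighting.
>>   highlightElementAt(e.leftEyeX, e.leftEyeY);   // made-up helper
>> });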
>>
>> Then, like the pointer lock spec, a page would be able to request view
>> lock to begin sampling head-tracking data from the selected source; there
>> would thus be a view lock change event. (It's not clear how the browser
>> would list the available sources for the user to choose from. If the
>> browser offered a webcam-based method and an Oculus Rift was also
>> connected, both would show up and the user would need to choose.)
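>>
>> (A sketch of what the request/change pair could look like if it mirrored
>> the shape of the Pointer Lock API; all of these names are invented:)
>>
>> // Invented names, mirroring the shape of the Pointer Lock API.
>> canvas.requestViewLock(); // user agent prompts for a tracking source
>>
>> document.addEventListener('viewlockchange', function () {
>>   if (document.viewLockElement === canvas) {
>>     // Locked: "viewchange" events now fire while the lock is held.
>>   } else {
>>     // The lock was released or the request was denied.
>>   }
>> });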
>>
>> Now for discussion. Are there any features missing from the proposed head
>> tracking API or features that VR headsets offer that need to be included
>> from the beginning? Also I'm not sure what it should be called. I like
>> "view lock", but it was my first thought so "head tracking" or something
>> else might fit the scope of the problem better.
>>
>> Some justifications: the offset and head orientation are self-explanatory
>> and are calibrated by the device. The eye offsets would be more for a UI
>> that selects or highlights things as the user moves their eyes around.
>> Examples would be a web-enabled HUD on VR glasses, or a laptop with a
>> precision webcam. The user calibrates with their device software, which
>> reports the range (-1, -1) to (1, 1) in screen space. The values are not
>> clamped, so the user can look beyond the calibrated ranges. Separate left
>> and right eye values enable precision and versatility, since most hardware
>> supporting eye tracking will have raw values for each eye.
>>
>>
>>
>

Received on Wednesday, 26 March 2014 18:58:52 UTC