RE: 3D Pointers from Jacob Rossi on 2013-03-22 (public-pointer-events@w3.org from January to March 2013)

From: Jacob Rossi <Jacob.Rossi@microsoft.com>
Date: Fri, 22 Mar 2013 21:06:06 +0000
To: Bill Fisher <fisherwebdev@gmail.com>
CC: Arthur Barstow <art.barstow@nokia.com>, "public-pointer-events@w3.org" <public-pointer-events@w3.org>, "jhull@leapmotion.com" <jhull@leapmotion.com>
Message-ID: <83dc13ca58584642913263189a9c5672@BY2PR03MB028.namprd03.prod.outlook.com>
Hi Bill,

Our group's decision to tackle additional device types in V2 is about focusing our efforts to get web developers an interoperable way to handle the devices that are already ubiquitous while laying a foundation for the next generation of pointing devices thereafter. Adding a new dimension to input is not a light task and will surely spark a lot of great discussion. While I don't want to get into the specifics of these here, the kinds of things I'd expect we'd discuss (amongst others) are:

- Specific use cases. We can't design a good API without first understanding how developers will use it. With mouse, pen, and touch, this is well understood. Programs like the Leap Motion Developer Community and the Kinect for Windows SDK are helping pave the way here, but we need to understand how this specifically plays out on the Web, as it can be a different medium.
- Coordinate space and units. I think there's more discussion to be had here. Considering just your proposal, there could be multiple CSS transform perspectives on a given page, which complicates things. But also, that kind of transform might not make sense for recognizing in-air gestures (where physical units might be more appropriate).
- Device independence. One of the things this group has worked on is ensuring all devices can provide at least some basic level of support for the various properties on PointerEvent. So we'd want to have a discussion about what the values are for pressure, tilt, etc. in 3D space.
- Support for multiple device vendors. We should try to avoid multiple 3D pointing models and understand how other devices, like Kinect, Google Glass, or Nintendo Wii, play into this model.

I'm sure there's more as well. Along the same lines of what you're describing, I do think adding support for these devices will be additive though.

As it stands now, we haven't made any substantial changes to the spec during the Last Call review period. So we're likely getting ready to move to Candidate Recommendation soon, which means we're on the cusp of getting the web an improved model for pointing input. If we try to design 3D input for the web in the V1 spec, then we'll just delay that realization for current devices (at a minimum, requiring another Last Call publication and review period). Also worth mentioning is the imminent requirement to demonstrate two interoperable implementations of the spec. We're well on our way to being able to do that for the current APIs, but adding 3D into the mix will certainly delay that for quite some time.

So my recommendation is to experiment, through script, with how the full API works end to end. Contribute to things like LeapJS or KinectJS and build up the right set of functionality, vetted by examples that use it. With that, we'll have a strong proposal that makes this easy to get into V2 and into the hands of web developers.
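
As one possible starting point for that kind of script-level experiment, here is a minimal sketch of the relative-unit idea from the quoted message below: a device reports depth as a value between -1 and 1, and script maps it into Z-axis pixels using a CSS perspective distance. The function name relativeDepthToPixels, its parameters, the clamping, and the linear mapping are all assumptions made for illustration. None of this is part of the Pointer Events spec, LeapJS, or any device SDK, and it assumes a single perspective per page, which the coordinate-space point above notes may not hold.

```typescript
// Hypothetical helper for experimentation only; not a spec or library API.
// Assumes: relativeZ is a driver-reported depth in [-1, 1], and perspectivePx
// is the page's single CSS perspective distance in CSS pixels.
function relativeDepthToPixels(relativeZ: number, perspectivePx: number): number {
  if (!isFinite(relativeZ) || !isFinite(perspectivePx) || perspectivePx <= 0) {
    throw new RangeError("relativeZ must be finite and perspectivePx must be positive");
  }
  // Clamp out-of-range driver values instead of extrapolating past the range.
  const clamped = Math.max(-1, Math.min(1, relativeZ));
  // Assumed linear mapping: +1 lands at the perspective distance in front of
  // the z = 0 plane, and -1 lands the same distance behind it.
  return clamped * perspectivePx;
}
```

With a perspective of 800px, a relative depth of 0.5 would map to 400 Z-axis pixels under this assumed mapping. A non-linear mapping or physical units might turn out to suit in-air gestures better; that is exactly the kind of thing worth finding out through experiments.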

-Jacob

PS - FWIW, I just finally got invited into the Leap dev program (coincidence?). So I'm looking forward to playing around with the device and LeapJS. :-)

On Thu, Mar 21, 2013 at 11:15 AM, Bill Fisher <fisherwebdev@gmail.com> wrote:
>
> Hi Jacob and Art and others,
>
> Thank you for following up on this, and for adding a reference to this thread within the notes on pointer event extensibility (http://www.w3.org/wiki/PointerEvents/UseCasesAndRequirements).
>
> The EdgeConf discussion (https://www.youtube.com/watch?v=zxuA-wyajhY&feature=player_detailpage#t=2454s) provides excellent context for this thread, but I find that I very respectfully disagree when Boris Smus argues against the introduction of the Z-coordinate, stating, "It's unclear what the units would be for all this stuff.  You're breaking the connection ... of mapping to a screen coordinate.  As soon as you're dealing with tracking real world stuff, it's in some different coordinate system.  If you can bring it back to screen space, you're doing it with some weird transform."
>
> I think a transformed unit is actually an ideal way of expressing the Z dimension on the Web, and its precedent exists in CSS transforms.  If we have a CSS perspective, then the browser should be able to map from the device driver data to the web developer's intended use of 3D space.  That is, all we really need is a *relative* unit to come from the device drivers, and the established perspective would provide a mapping into Z-axis pixels to dovetail with CSS transforms.  For example, if the spec asked for a floating point value from 1 to -1 to come from the device driver data, the browser could easily transform this data into something easy for the web developer to use, based on how the developer has set up the 3D space.  I'm no expert on WebGL, but I did discuss this with a WebGL developer, and I believe this relative/transformed approach would work there as well.
>
> The underlying issue here is probably an old debate at the W3C -- the question of whether the Web Platform should get out in front of emerging technology so as to compete effectively with native and proprietary platforms, or whether it is better to first "allow a thousand flowers to bloom" and wait to see what patterns emerge from the marketplace.  My own opinion is that for the Web to become the platform that we all want it to be, it needs to give both browsers and device manufacturers something to aim toward.  Otherwise we have a thousand Betamax flowers, no VHS, and the Web is left lagging behind with Super8.
>
> I think this discussion resonates, to a small degree, with the discussion on the "Last Call comments" thread about tiltX/Y vs. spherical coordinates.  That is, the spec appears to be aimed almost exclusively at mouse, touch and stylus input, and I would very much like to see the spec broaden its vision and scope.
>
> One change to the spec could achieve this: the introduction of a Z-coordinate.  If this would require too much processor power, I could see that as a reason to not implement this.  Otherwise, I think the spec shortchanges the Web by not including this kind of data.  If we have to wait for v2 to see this data, 3D motion detection on the Web will be totally fragmented until that happens, partially and significantly crippling the Web's ability to take advantage of this new input paradigm.
>
> - Bill
Received on Friday, 22 March 2013 21:08:25 UTC