Re: [w3ctag/design-reviews] WebXR Hand Input API Specification (#568)

> Regarding naming: I see that the [Unity hand tracking API](https://developer.oculus.com/documentation/unity/unity-handtracking/#understanding-bone-id), for example, doesn't use the medical names for bones. They use a number to indicate the joint number.

A problem with this is that it's not extensible: we're not exposing all of the hand joints that exist, only the ones that are typically used in VR hand tracking.

I find numbering to be more confusing because different platforms may choose to index differently: the indexing changes based on which carpals and metacarpals you include. For example, on Oculus/Unity only the thumb and pinky fingers have metacarpals, and the thumb also has a trapezium carpal bone. On the other hand (hah), OpenXR provides a metacarpal bone for every finger but no trapezium bone. So numbers don't carry a useful cross-platform meaning.

If you just want to iterate over all of the joints, you can do that without knowing the names, but if you're going to be detecting gestures, I find names and a diagram far easier to work with than plain numbers. Most humans have more than these 25 bones (+ tip "bones") in each hand; "index joint 0" doesn't tell me anything unless you show me a diagram.
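To make that concrete, here's a minimal sketch of both uses. It assumes the iterable, maplike `XRHand` shape with string joint names (the exact shape was still in flux when this was written), and that `frame`, `hand`, and `referenceSpace` come from a normal render loop:

```js
// Iterating over every joint needs no knowledge of the names at all.
function logAllJoints(frame, hand, referenceSpace) {
  for (const jointSpace of hand.values()) {
    const pose = frame.getJointPose(jointSpace, referenceSpace);
    if (pose) {
      console.log(jointSpace.jointName, pose.transform.position);
    }
  }
}

// Gesture logic, by contrast, wants specific, recognizable joints.
function indexFingerTipPose(frame, hand, referenceSpace) {
  return frame.getJointPose(hand.get('index-finger-tip'), referenceSpace);
}
```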


> I don't think using the bone name reduces the ambiguity either, since you're referring to a joint rather than the bone in any case.

It's both, really, since the orientation of that space is aligned with the named bone.

> I'm also trying to understand the relationship between `hand` as a member of `XRInputSource`, and the [primary action](https://immersive-web.github.io/webxr/#primary-action) concept. Does hand input provide a way of generating a primary action?

Yes, but that's not under the purview of this spec at all. Oculus Browser and HoloLens use pinch/grab gestures for the primary action; the precise gesture used is up to platform defaults.
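From the application's point of view this just surfaces through the existing `select` events; a minimal sketch (nothing here is specific to hands):

```js
// The UA maps its platform gesture (e.g. pinch) to the primary action, so the
// page listens for the same events it would for a controller trigger.
session.addEventListener('selectstart', (event) => {
  console.log('primary action started by', event.inputSource.handedness);
});
session.addEventListener('selectend', (event) => {
  console.log('primary action ended');
});
```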

> Actual code wouldn't use those structures. I think @Manishearth provided that to clarify how the mapping is done.

The first example does this because it is outdated: the hand is iterable now, so you don't need that array. The second example does need it.

I considered a structured approach in the past, but there are many different ways to slice this data depending on the gesture you need, so it made more sense to surface it as an indexable iterator and let people slice it themselves. Also, starting with a structured approach now may lock us out of handling hands with more or fewer than five fingers in the future.
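As a rough illustration of "slicing it yourself", here's a naive pinch check built from two named joints; the 0.02 m threshold is an arbitrary number picked for the example:

```js
// Naive pinch detection: distance between thumb tip and index finger tip.
function isPinching(frame, hand, referenceSpace) {
  const thumbTip = frame.getJointPose(hand.get('thumb-tip'), referenceSpace);
  const indexTip = frame.getJointPose(hand.get('index-finger-tip'), referenceSpace);
  if (!thumbTip || !indexTip) return false;
  const a = thumbTip.transform.position;
  const b = indexTip.transform.position;
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z) < 0.02; // metres
}
```

A different gesture (say, a thumbs-up) would slice the same joint data completely differently, which is why a single baked-in structure didn't seem worth it.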

I can update the explainer to use the iterator where possible!


> Also, can you give some background on how hand tracking works for people who are missing or unable to use one or more fingers on the hand(s) being used for hand tracking - how does this affect the data which is provided to the application?

As Rik said, https://github.com/immersive-web/webxr-hand-input/issues/11 covers this. At the moment this is entirely based on platform defaults: some platforms may emulate a missing finger, others may not detect the hand at all (unfortunate, but not something we can control here).

Currently all of the hand tracking platforms out there are all-or-nothing, AIUI, which means that they will always report all joints, and if some joints don't exist they'll either emulate them or refuse to surface a hand.

I want to make progress here, but I fear that doing so without platforms that support it is putting the cart before the horse. A likely solution would be an XR feature descriptor that lets applications opt in to joints being missing, as an indicator of "I can handle whatever configuration you throw at me". Polydactyl hands will also need a similar approach.
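Purely as a sketch of what that opt-in might look like (the `'hand-input-partial'` descriptor name is invented here for illustration; nothing like it is specced):

```js
// Hypothetical: opt in to receiving hands with missing joints.
// 'hand-tracking' is the real feature descriptor; 'hand-input-partial' is made up.
const session = await navigator.xr.requestSession('immersive-vr', {
  optionalFeatures: ['hand-tracking', 'hand-input-partial'],
});
// With such an opt-in granted, the UA could report a hand whose missing joints
// simply return null poses instead of being emulated or hiding the whole hand.
```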

https://github.com/w3ctag/design-reviews/issues/568#issuecomment-736898162
