Re: Accessibility at the W3C Workshop on Web Games - feedback and questions

Hi Matthew,

On 21/08/2019 16:28, Matthew Tylee Atkinson wrote:
> Hi Joshue, Jason,
>
> Sorry for my laggy reply; we've been on holiday :-). Thanks Jason for the link in your email to the WebAssembly WebIDL bindings proposal, though it seems they've re-organised the repo since, and the URL now appears to be <https://github.com/WebAssembly/interface-types/blob/master/proposals/interface-types/Explainer.md>.

Great, thanks for the updated URL.


> Thanks for your comment on the Level Description Language (LDL) paper; glad you enjoyed it. I'm working on making it run in a more user-friendly manner on contemporary systems [1]. Would love to see someone with better geometry skills apply the same approach on newer engines :-).

Do keep us updated.


> You had a few questions; I'll see what I can do to start answering them...
>
>> How can we know which runtime environment - rendering or VM environment - when used as a platform for gaming or XR applications, provides the best architecture for accessibility and is simpatico with existing AT?
> There are a few different levels of environment we might consider:
>
>   * Underlying platform (native app, native mobile app, console, browser).
>   * Engine/middleware (e.g. Unity, Unreal, ...).
>   * Game-specific code (such as UI code).
>
> In terms of platforms where there /is/ an accessibility layer provided by the system (e.g. UI Automation on Windows, or ARIA in browsers) I think those accessibility layers generally have similar semantics, so it seems to me that the goal is just making sure that bridge to the game itself is there.

Right, and from an architectural perspective, at some point (in the 
future) there may be AT available 'in game' that will provide the user 
with what they need. This is already happening with skins in games for 
low-vision users and will only progress. But we are not there yet...

There is also a performance implication - that's why the Luke 
Wagner/WASM proposal URI was so useful: they are improving memory 
management to streamline API calls, and this can potentially improve 
accessibility for current AT users.
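
To make that concrete, here's a rough TypeScript sketch of the kind of 
bridge I have in mind - the names and message shapes are entirely made 
up for illustration, not from any actual proposal - where the game 
(compiled to WASM) pushes batched UI updates across the boundary and a 
small JS shim mirrors them into ARIA, keeping the number of 
cross-boundary calls low:

  // Hypothetical shape of a batched UI update coming out of the game.
  interface UiUpdate {
    id: string;         // stable identifier for the game UI element
    role: string;       // e.g. 'button', 'menuitem'
    name: string;       // accessible name
    focused?: boolean;
  }

  // Mirror a batch of game UI updates into hidden DOM nodes so that
  // existing AT can pick them up via the browser's accessibility tree.
  function applyUiUpdates(updates: UiUpdate[], container: HTMLElement): void {
    for (const u of updates) {
      let el = container.querySelector<HTMLElement>(`[data-game-id="${u.id}"]`);
      if (!el) {
        el = document.createElement('div');
        el.dataset.gameId = u.id;
        el.tabIndex = -1;          // focusable from script only
        container.appendChild(el);
      }
      el.setAttribute('role', u.role);
      el.setAttribute('aria-label', u.name);
      if (u.focused) el.focus();
    }
  }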


> Existing accessibility layers are geared towards access to the UI, so they could help with CVAA (and similar) compliance and getting into a game. When in the game, it's much more game-dependent as to whether the use of different rendering types/effects might be enough to convey the game to players, or whether any further semantic info may be needed.

Interesting, and relates to 'modality muting' (see below).


> But (and I think this may be the central part of what you're asking) what about the games—are they coded in such a way that this model of object-oriented UI accessibility we're used to actually fits?

Right, and I guess I'm asking that (open) question: is a semantically 
correct object-oriented language within a compiled environment good 
for accessibility? My gut is that it is. When we have classes, and 
instances that can inherit from a parent - and be semantically useful 
- IMO we have a suitable landscape to support much-needed 
accessibility semantics, relationships, states and properties. 
Currently this kind of thing can be scripted, and JavaScript can be 
written in an OO way to support it - but that will require some 
skills/chops and carries a potentially massive overhead.
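
To sketch what I mean (the class and property names below are purely 
illustrative, not from any real engine API): if the game's UI classes 
carry role, name and state as first-class properties, a generic 
mapping layer can walk the instances and produce the equivalent ARIA 
without per-game heavy lifting. A rough TypeScript sketch:

  // Illustrative only: a base class that carries accessibility
  // semantics which subclasses inherit 'for free'.
  abstract class GameWidget {
    constructor(public name: string) {}
    abstract readonly role: string;           // maps onto an ARIA role
    state: Record<string, boolean> = {};      // e.g. { pressed: true }
    children: GameWidget[] = [];
  }

  class GameButton extends GameWidget {
    readonly role = 'button';
  }

  class GameMenu extends GameWidget {
    readonly role = 'menu';
  }

  // A generic mapping from the widget tree to ARIA-style attributes.
  function toAria(widget: GameWidget): Record<string, string> {
    const attrs: Record<string, string> = {
      role: widget.role,
      'aria-label': widget.name,
    };
    for (const [key, value] of Object.entries(widget.state)) {
      attrs[`aria-${key}`] = String(value);   // e.g. aria-pressed="true"
    }
    return attrs;
  }

  // Usage: toAria(new GameButton('Start game'))
  //   -> { role: 'button', 'aria-label': 'Start game' }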


> It's my expectation that UI code in games will have become more object-oriented over time, and probably even library code is provided by the engines/middleware out there.

Interesting!


>   Assuming that the UI code is relatively object-oriented, it could match up quite nicely with the DOM/ARIA approach. This is something I'm looking forward to exploring now I'm back at the keyboard.

I'm happy to kick this around *grin.


> How close that match is would determine how much work the accessibility code (whether added as a library to the game's codebase before compilation to WebAssembly, or in a JS library in the browser) will need to do. It doesn't feel like there's anything insurmountable here; I'll report back when I've gained some experience with this.

Aún aprendo - 'I am still learning'.


>> Which has the best potential for semantic support and communication with platform and browser APIs?
> I think I've strayed into this topic above; let me know if I've not answered it clearly.
>
>> You mention glTF, and I'm not totally sure how that fits into the stack? It doesn't seem to be a full browser runtime environment like WebAssembly, but enables the loading of 3D scenes and models. So my question is around your references to the benefits of 'machine-readable applications' and how this could be good for accessibility? Do glTF files have inherent support for object description or other meta-data can provide an accessibility architecture when loaded? I just don't know much about this.
> I believe you're right in that glTF prescribes a minimal environment, enough to display the model/scene, in a similar way to how the HTML <video> or <audio> elements work: the browser provides the UI and does the rendering.
>
> My understanding is that there are optional extra payloads that could be sent along with the textures and the model. These may include e.g. a JSON file that provides some semantic information on the model.


Right, this is what I was wondering, and it also brings up the issue 
of user agents parsing semantic descriptions in JSON - if this is 
possible, that would also be beneficial for WoT. So there is a 
'semantic accessibility payload' that can be delivered via glTF. That 
makes sense - but again - this is like adding on a suitable semantic 
structure after the fact. If the authoring environment provided even 
semantic primitives that could be presented at runtime, this would 
obviate the need for heavy lifting and tons of coding when applying 
the 'semantic accessibility payload' within glTF.

It sounds like it could be a large overhead on the author - rather 
like retrofitting ARIA to a canvas-type environment, on steroids. 
However, it's good to know that my understanding of glTF as a delivery 
mechanism for this kind of accessibility architecture doesn't seem to 
be off base *grin.
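
For what it's worth, glTF 2.0 does allow application-specific data in 
an 'extras' property on nodes; the shape of the semantic payload below 
is entirely my own invention, just to illustrate how such a payload 
might be carried and then read out (TypeScript sketch):

  // Minimal slice of a glTF node, plus a hypothetical semantic payload
  // riding in the spec's application-specific 'extras' field.
  interface GltfNode {
    name?: string;
    children?: number[];
    extras?: {
      accessibility?: { role: string; label: string; description?: string };
    };
  }

  // Pull any semantic annotations out of a parsed glTF document's nodes.
  function extractSemantics(nodes: GltfNode[]): { role: string; label: string }[] {
    const found: { role: string; label: string }[] = [];
    for (const node of nodes) {
      const a11y = node.extras?.accessibility;
      if (a11y) found.push({ role: a11y.role, label: a11y.label });
    }
    return found;
  }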


> I think that discussion is/will be ongoing between WebXR and Khronos on what information that might be.
>
> (I'm a little tight for time at the moment, so haven't looked up the spec, but if I can find any further info/clarification on this, I'll follow up later.)

Cool, please do!


>> And finally, ...any more info you have on 'semantic-scene graph' modelling would be really helpful *grin.
> Very essentially, the idea of a semantic scene graph would involve including semantic info (of course, convention would dictate the meaning, so I can't give any concrete examples yet) alongside the rendered scene, so it could perhaps be explored in a more mechanical manner. E.g. if a series of shapes together were intended to form a teapot, these could be grouped within the scene and labelled thusly. A-Frame has a similar approach of using the DOM to create the scene graph rather than having it as an opaque structure in memory.

OK, so it's a way to build an accessibility tree that describes the 
objects in the scene etc.? Got it. And then attaching that to the 
rendered scene. I wonder if that 'semantic scene graph' persists 
through all instances of a scene - if there is movement or motion of 
the object - so the user can interrogate and understand the object at 
all times? It looks like it can. I'm thinking of ascribing 
accessibility semantics to a 2D image - which can only be perceived in 
a particular place, when focussed in the DOM. With 3D, things will 
move around and be dynamically rendered, so any semantic architecture 
needs to 'follow' the object, if that makes sense.

It looks like that is the case - in this paper, they talk about 
'semantic scene graphs' being used to infer relationships etc.

https://www.sciencedirect.com/science/article/abs/pii/S0924271617300746
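
Something like the following is how I picture it (all names invented 
for illustration): the semantic labels live on the scene-graph nodes 
themselves, so they travel with the objects as their transforms 
change, and an accessibility description can be derived from the live 
scene at any time. A rough TypeScript sketch:

  // A toy scene-graph node carrying a persistent semantic label.
  interface SceneNode {
    label?: string;                                   // e.g. 'teapot'
    role?: string;                                    // e.g. 'group'
    transform: { x: number; y: number; z: number };   // changes every frame
    children: SceneNode[];
  }

  // Derive a flat, human-readable accessibility description from the
  // live scene graph; because the labels are attached to the nodes,
  // they 'follow' the objects wherever the renderer puts them.
  function describeScene(node: SceneNode, out: string[] = []): string[] {
    if (node.label) {
      const { x, y, z } = node.transform;
      out.push(`${node.role ?? 'object'}: ${node.label} at (${x}, ${y}, ${z})`);
    }
    for (const child of node.children) describeScene(child, out);
    return out;
  }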

> Luke Wagner also mentioned the idea of projecting imaginary lines from the player's perspective into the scene and, should the scene contain sufficient semantic info, this could be presented as the player explores. We actually did something very similar to provide AudioQuake players with information on their environment (there we used sounds to indicate certain environmental features).

Wow, that's really interesting! So like an accessibility 'line of 
sight'... I guess that could help with the overheads of rendering for 
particular modalities, and could be useful for the 'modality muting' 
model - which came up in a conversation in the RQTF - where unused 
modalities are turned off to save bandwidth, or rendering/processing 
cycles for the user.
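
A very rough sketch of how I imagine the 'accessibility line of sight' 
working together with modality muting (none of this is from Luke's 
proposal or from AudioQuake - the function and preference names are 
invented): cast a ray from the player, and only render the result in 
the modalities the user has left switched on. In TypeScript:

  // Hypothetical query the engine might expose: the nearest
  // semantically-labelled object along the player's line of sight.
  interface Hit { label: string; distance: number }
  type Raycast = (origin: number[], direction: number[]) => Hit | null;

  interface ModalityPrefs { speech: boolean; audioCues: boolean; visual: boolean }

  // 'Modality muting': only produce output for the channels the user
  // wants, saving rendering/processing work on the others.
  function announceLineOfSight(
    raycast: Raycast,
    origin: number[],
    direction: number[],
    prefs: ModalityPrefs,
  ): string[] {
    const hit = raycast(origin, direction);
    if (!hit) return [];
    const output: string[] = [];
    const metres = hit.distance.toFixed(1);
    if (prefs.speech) output.push(`speak: ${hit.label}, ${metres}m ahead`);
    if (prefs.audioCues) output.push(`earcon: object ahead at ${metres}m`);
    // The visual channel is left to the normal renderer when prefs.visual is true.
    return output;
  }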

> Hope this helps—do let me know if I can provide any further clarification or info.

Super helpful Matthew, thanks.

Josh


> best regards,
>
>
> Matthew
>
> [1] https://github.com/matatk/agrip

-- 
Emerging Web Technology Specialist/Accessibility (WAI/W3C)
