First review of APA WebRTC use cases - one by one

Hi Dom,

Thanks for taking the time to review the use cases.

For anyone wishing to comment extensively on any given use case, please 
create a separate thread, but to get the ball rolling, my comments are inline.

This is excellent feedback from Dom, and to avoid this mail getting 
longer than necessary, I've updated the use case doc with the relevant 
parts of his comments, added as notes for each one (which may make it 
easier for APA members to parse).

You can get that here:

https://www.w3.org/WAI/APA/wiki/WebRTC_next_version_use_cases

So after a certain point below, you will just find no more comments from 
me, but you can still read Dom's feedback/suggestions etc. in toto.


> More detailed feedback and analysis below.
>
> * Identify Caller
> There are two possible levels of identity mechanism in WebRTC-based services:
>   A- those entirely managed by the application itself
>   B- those managed by the browser via
> https://w3c.github.io/webrtc-identity/identity.html
>
> For case A, making the identity display and call notification accessible
> is a matter of making the app accessible.
>
> For case B, it would be a case of ensuring the browser itself is
> accessible (although in practice, there is currently very little
> adoption of that identity mechanism).
>
> Thus my impression is that this particular use case doesn't generate new
> requirements for the WebRTC specs per se.

That's good to know. Any cases that the group feels can be removed from 
this list due to situations like this will be removed before we present 
to the WebRTC group. I'm also curious whether we need to co-ordinate 
with another group that manages identity mechanisms in the browser, if 
doing so supports our overall use case.
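
For reference, here's a rough sketch of how an app would opt into the 
browser-managed mechanism (case B); the IdP domain is hypothetical, and 
as Dom notes, adoption of this mechanism is currently very limited:

    // Case B: browser-verified identity
    // (https://w3c.github.io/webrtc-identity/identity.html).
    // 'idp.example' is a hypothetical identity provider domain.
    const pc = new RTCPeerConnection();
    pc.setIdentityProvider('idp.example', { protocol: 'default' });

    // The remote peer's verified identity resolves once the browser has
    // validated the assertion; surfacing this (and the call notification)
    // accessibly is then down to the browser and app UI.
    pc.peerIdentity
      .then((identity) => console.log(`Verified caller: ${identity.name}`))
      .catch(() => console.log('Caller identity could not be verified'));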


> * Multiple Audio
>
> I think the Audio Output Devices API fulfills most of the capability
> aspects of the use case:
>    https://www.w3.org/TR/audio-output/
>
> One remaining aspect it doesn't handle reliably is how to authorize
> access to these additional output devices
> https://github.com/w3c/mediacapture-output/issues/2
>
> For the rest, it is (again) a matter for individual apps and services to
> make use of that capability to enable the use case.

Noted; we can discuss how access to output devices is authorised.
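
For context, a minimal sketch of what that API enables (note that device 
labels are only populated once the user has granted media permissions):

    // Route a call's audio to a specific output device, e.g. a headset,
    // while other page audio stays on the default device.
    async function routeToHeadset(audio: HTMLAudioElement): Promise<void> {
      const devices = await navigator.mediaDevices.enumerateDevices();
      const headset = devices.find(
        (d) => d.kind === 'audiooutput' && /headset/i.test(d.label)
      );
      if (headset) {
        // The authorisation question Dom links to sits behind this call.
        await audio.setSinkId(headset.deviceId);
      }
    }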

> * Communications control of alternate content
> I think most of this again should be doable with existing technologies,
> in particular (in the case where the displays are not under the control
> of a single OS) via the second screen APIs
> https://www.w3.org/2014/secondscreen/
>
> But this probably deserves a more detailed analysis to confirm this.

Noted, we can look into this.
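
As a starting point, a small sketch of how the Presentation API (part of 
the Second Screen work) could push alternate content to another display; 
the receiver page URL is hypothetical:

    // Offer the user a second display for alternate content, e.g. captions.
    // '/captions-receiver.html' is a hypothetical receiver page.
    const request = new PresentationRequest(['/captions-receiver.html']);

    async function presentAlternateContent(): Promise<void> {
      const connection = await request.start(); // browser prompts for a display
      connection.send(JSON.stringify({ type: 'caption', text: 'Hello' }));
    }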

> * Text communication data channel
> I don't understand enough how the various devices mentioned in this use
> case are linked to one another to tell how much of this is already
> doable; but it looks almost certain to me that this issue is not
> WebRTC-specific (i.e. it would apply to any other technology enabling
> text chat, e.g. Web Sockets)


I think you are right, and we need to identify what part WebRTC plays in 
this evolution, as my sense is that WebRTC could become a more prominent 
part of the stack for this (and for alternate content forms/transmission).
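
To make the comparison concrete, a bare-bones text channel over WebRTC 
looks like this (assuming an already-negotiated RTCPeerConnection); the 
same app-level work would sit just as easily on top of WebSockets:

    declare const pc: RTCPeerConnection; // an already-negotiated connection

    // A dedicated channel for text chat.
    const chat = pc.createDataChannel('chat');
    chat.onopen = () => chat.send('Hello');
    chat.onmessage = (event) => {
      // Rendering received text accessibly (live regions, sent/received
      // semantics) is where the a11y work sits, not in the transport.
      console.log('peer:', event.data);
    };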


> * Control relative volume and panning position for multiple audio
> WebRTC, Web Audio and HTML Audio all offer mechanisms to manage volume,
> and Web audio adds mechanisms to manage panning; but I think I would
> need to better understand which of the browser vs OS vs app would be
> configured by the user to understand if this would generate new
> requirements (for WebRTC, Web Audio or HTML).

Excellent. We will discuss. My sense is that when dealing with alternate 
forms/content in the browser, things like the relative volume of 
multiple streams and panning will be configured there somehow, but I am 
not at this point sure of the mechanism. It seems currently rather basic.
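
For what it's worth, here is a sketch of how an app could already do 
this per remote stream with Web Audio (assuming a stream taken from the 
connection's track event); the open question remains whether the user 
would configure this in the app, the browser or the OS:

    declare const remoteStream: MediaStream; // from the 'track' event

    const ctx = new AudioContext();

    // Give each participant their own volume and stereo position.
    function attachControls(stream: MediaStream) {
      const source = ctx.createMediaStreamSource(stream);
      const gain = ctx.createGain();            // relative volume
      const panner = ctx.createStereoPanner();  // pan: -1 (left) .. 1 (right)
      source.connect(gain).connect(panner).connect(ctx.destination);
      return { gain, panner };
    }

    const { gain, panner } = attachControls(remoteStream);
    gain.gain.value = 0.5; // halve this participant's volume
    panner.pan.value = -1; // place them hard left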


> * Live Transcription/Captioning
> I think all the browser APIs needed to implement this are available; but
> again, this is left to each service to decide, without a clear or easy
> way to integrate with third-party services (e.g. for sign language
> translation).


Right, so this highlights your point about some of these use cases 
potentially being facilitated by better service integration. I do think 
that WebRTC itself could become the platform for some of these services, 
and we need to explore that. Services like alternate channels or tracks 
can be added as third-party things or plug-ins, but it seems to me that 
WebRTC has a potent architecture to become the facilitator for these 
services itself.
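
One way to picture that: whatever produces the captions (an in-page 
recogniser or a third-party service), WebRTC could carry them on a 
dedicated channel so receiving apps can render them uniformly. A 
hypothetical sketch:

    declare const pc: RTCPeerConnection;

    // Captions travel on their own channel, separate from chat and media,
    // so the receiving side can route them to an accessible renderer.
    const captions = pc.createDataChannel('captions');

    // Called by whichever transcription service the app integrates.
    function publishCaption(text: string): void {
      captions.send(JSON.stringify({ text, at: performance.now() }));
    }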

> * Quality Synchronisation and playback
> Right now, there is no dedicated mechanism to transmit captions or audio
> descriptions in sync with WebRTC audio and video streams; the current
> assumption is that using data channels with an out-of-band
> synchronization mechanism gives good-enough results, but I don't know
> how well that has been demonstrated.


OK, this is then something that we need to look at, and it could 
represent a good a11y use case for the WebRTC group.
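
To illustrate the out-of-band synchronization assumption Dom describes, 
a receiver-side sketch (the cue format here is ours, not anything 
standardized):

    declare const video: HTMLVideoElement;  // plays the remote stream
    declare const captions: RTCDataChannel; // carries timestamped cues

    // Hold each cue until the video's playback clock reaches its timestamp.
    captions.onmessage = (event) => {
      const cue = JSON.parse(event.data) as { text: string; mediaTime: number };
      const delayMs = Math.max(0, (cue.mediaTime - video.currentTime) * 1000);
      setTimeout(() => showCaption(cue.text), delayMs);
    };

    function showCaption(text: string): void {
      // e.g. write the text into an ARIA live region
    }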


> There have been discussions about enabling a firmer way of synchronization
> based on the RTT standard from the IETF (RFC 4103) - it probably wouldn't be
> all that difficult from a technical perspective, but it hasn't been
> clear so far whether there was enough demand for it given the
> "good-enough" above.
> https://lists.w3.org/Archives/Public/public-webrtc/2018Jun/0140.html
>
> It would be useful if there were some research results we could reference
> that would show whether the current possibilities are indeed good enough or not.

> * Simultaneous Voice, Text & Signing
> I think I covered the relevant points on this in "Quality
> Synchronisation and playback" and "Live Transcription/Captioning"
>
> * Support for Real Time Text (RTT)
> Regarding RTT per se, see above. I don't know if there are any
> WebRTC-based systems that offer emergency-call integration today - it
> may be useful to research this since that may indeed bring up new
> requirements on WebRTC (beyond accessibility); I know some requirements
> of the existing specs were derived from such a scenario (e.g. disabling
> voice-activity-detection), but I don't know how thoroughly the scenario
> was analyzed.
> * Support for Video Relay Services (VRS) and Video Remote Interpretation (VRI)
> I think for this one, the main question is one of integration /
> interoperability with third-party services; I noticed the IETF has
> looked at standardizing a way to use SIP with VRS services:
>    https://tools.ietf.org/html/draft-rosen-rue-00
>
> One question that this raised was the impact of such services on
> end-to-end security (since these services are *by design* equivalent to
> man-in-the-middle attacks).
>
>
> * Distinguishing Sent and Received Text
> As mentioned in "Text communication data channel", there are many
> protocols and APIs that can be used to exchange text messages beyond
> WebRTC; so that one is definitely not WebRTC specific, and likely
> depends on applications doing the right thing (it may be useful to
> document what "the right thing" would be e.g. in terms of ARIA semantics).
>
>
> * Warning and recovery of lost data
> WebRTC Statistics (https://www.w3.org/TR/webrtc-stats/) should enable any
> WebRTC-based app to signal to its users when there are network issues;
> I'm not sure there can be more standardized mechanisms for signaling
> these issues or for recovering from them, since both would be very
> heavily application-dependent (it would probably be a bad idea to signal
> all network losses, since these are more or less guaranteed to happen
> very frequently).
>
>
> * Call status data and polling
> All of these features can be implemented with the current APIs - making
> the information available to screen-readers would be a matter of the
> application being developed correctly.
>
>
> * answered call / busy signal
> (this doesn't appear under a scenario heading - not sure if that's
> intentional?)
> Again, these are things that would need to be implemented by the
> application UI - detecting these states is made possible by WebRTC APIs,
> but representing them is up to each application.
>
>
> * Bandwidth for audio / Bandwidth for video / Quality of video
> resolution and frame rate
> WebRTC lets applications prioritize the bandwidth dedicated to audio /
> video / data streams; there is also some experimental work in signaling
> these needs to the network layer: https://w3c.github.io/webrtc-dscp-exp/
>
> There is also support for prioritizing framerate over resolution in case
> of congestion.
>
> Beyond that, WebRTC supports state-of-the-art codecs for both audio and
> video, which (if implemented correctly) should provide the best possible
> audio / video quality in the given networking conditions.
>
> (again, this relies on WebRTC services doing the right thing)
>
>
> * Assistance for Older Users or users with Cognitive disabilities
> This sounds very application specific.
>
>
> * Personalised Symbol sets for users with Cognitive disabilities
> Given that the WebRTC specs do not standardize any of the UI that WebRTC
> apps can build, I think this too would be application specific.

Thanks

Josh

>
>
> HTH,
>
> Dom
>
> 1.
> https://www.w3.org/Team/wiki/Joconnor/WebRTC_use_cases_APA_review#User_Needs_and_Scenarios

-- 
Emerging Web Technology Specialist/A11y (WAI/W3C)
