Re: First review of APA WebRTC use cases - one by one from Joshue O Connor on 2019-05-20 (public-apa@w3.org from May 2019)

From: Joshue O Connor <joconnor@w3.org>
Date: Mon, 20 May 2019 11:15:42 +0100
To: Shadi Abou-Zahra <shadi@w3.org>, Dominique Hazael-Massieux <dom@w3.org>
Cc: "public-apa@w3.org" <public-apa@w3.org>, "w3t-archive@w3.org" <w3t-archive@w3.org>
Message-ID: <a2741ca5-b095-5b87-5ac0-c35052c74f99@w3.org>
Hi Shadi,

On 19/05/2019 09:44, Shadi Abou-Zahra wrote:
> Hi Josh, Dom, APA,
>
> This is a very interesting exchange.

+1, Dom's input has been really useful.


> On a higher level, it seems to me that some (parts) of these use cases 
> need to be addressed on a spec-level (not only WebRTC but maybe also 
> other specs, such as media specs etc.) and others on an app-level. Yet 
> these use cases still remain important to address for the end-user.
>
> I think it is actually good to continue thinking about the use cases 
> from the end-user perspective in APA, then figuring out how different 
> parts of these user needs map to which specs. That is, maybe we should 
> not change these use cases to match the needs of WebRTC, but create a 
> mapping for WebRTC from these "master use cases" for accessibility.


Yes, something like that. I think we need to discuss further in APA and 
work out where the home for some of these use cases will be in the long 
term. In the short term, we need to do a pass to figure out what can 
be taken off the table, as something that WebRTC may not need to address 
directly at the moment. We can for sure look at the use cases and create 
some horizontal mapping between that and the various specs and 
technologies that are needed to support it.

> Maybe some of the app-level user needs need to go AGWG (possibly for 
> Silver considerations), so we should not lose them along the way.

Noted, thanks

Josh


>
> Best,
>   Shadi
>
>
> On 17/05/2019 15:27, Joshue O Connor wrote:
>> Hi Dom,
>>
>> Thanks for taking the time to review the use cases.
>>
>> For anyone wishing to comment extensively on any given use case, 
>> please create a separate thread, but to get the ball rolling, my 
>> comments are inline.
>>
>> This is excellent feedback from Dom, and to avoid this mail getting 
>> longer than necessary, I've updated the use case doc with relevant 
>> parts from his comments, added as notes for each one (which may 
>> make it easier for APA members to parse).
>>
>> You can get that here:
>>
>> https://www.w3.org/WAI/APA/wiki/WebRTC_next_version_use_cases
>>
>> So after a certain point below, you will just find no more comments 
>> from me but you can still read Dom's feedback/suggestions etc in toto.
>>
>>
>>> More detailed feedback and analysis below.
>>>
>>> * Identify Caller
>>> There are 2 possible levels of identity mechanisms in WebRTC-based 
>>> services:
>>>   A- those entirely managed by the application itself
>>>   B- those managed by the browser via
>>> https://w3c.github.io/webrtc-identity/identity.html
>>>
>>> For case A, making the identity display and call notification 
>>> accessible
>>> is a matter of making the app accessible.
>>>
>>> For case B, it would be a case of ensuring the browser itself is
>>> accessible (although in practice, there is currently very little
>>> adoption of that identity mechanism).
>>>
>>> Thus my impression is that this particular use case doesn't generate 
>>> new
>>> requirements for the WebRTC specs per se.
>>
>> That's good to know. Any cases that the group feels can be removed 
>> from this list due to situations like this will be removed before presenting 
>> to the WebRTC group. I'm curious if we need to co-ordinate with 
>> another group that manages identity mechanisms in the browser, if 
>> doing so supports our overall use case?
>>
>>
>>> * Multiple Audio
>>>
>>> I think the Audio Output Devices API fulfills most of the capabilities
>>> aspect of the use case:
>>>    https://www.w3.org/TR/audio-output/
>>>
>>> One remaining aspect it doesn't handle reliably is how to authorize
>>> access to these additional output devices
>>> https://github.com/w3c/mediacapture-output/issues/2
>>>
>>> For the rest, it is (again) a matter of individual apps and services to
>>> make use of that capability to enable the use case.
>>
>> Noted, we can discuss authorisation access to output devices.
>>
>>> * Communications control of alternate content
>>> I think most of this again should be doable with existing technologies,
>>> in particular (in the case where the displays are not under the control
>>> of a single OS) via the second screen APIs
>>> https://www.w3.org/2014/secondscreen/
>>>
>>> But this probably deserves a more detailed analysis to confirm this.
>>
>> Noted, we can look into this.
>>
>>> * Text communication data channel
>>> I don't understand enough how the various devices mentioned in this use
>>> case are linked one with another to tell how much of this is already
>>> doable; but it looks almost certain to me that this issue is not
>>> WebRTC-specific (i.e. it would apply to any other technology enabling
>>> text chat, e.g. Web Sockets)
>>
>>
>> I think you are right, and we need to identify what part WebRTC plays 
>> in this evolution, as my sense is that WebRTC could become a more 
>> prominent part of the stack for this (and alternate content 
>> forms/transmission).
>>
>>
>>> * Control relative volume and panning position for multiple audio
>>> WebRTC, Web Audio and HTML Audio all offer mechanisms to manage volume,
>>> and Web audio adds mechanisms to manage panning; but I think I would
>>> need to better understand which of the browser vs OS vs app would be
>>> configured by the user to understand if this would generate new
>>> requirements (for WebRTC, Web Audio or HTML).
>>
>> Excellent. We will discuss. My sense is that when dealing with 
>> alternate forms/content in the browser - things like relative volume 
>> of multiple streams, and panning will be configured there somehow, 
>> but am not at this point sure of the mechanism. It seems currently 
>> rather basic.
>>
>>
>>> * Live Transcription/Captioning
>>> I think all the browser APIs needed to implement this are available; 
>>> but
>>> again, this is left to a service-decision basis, without a clear or 
>>> easy
>>> way to integrate e.g. with third-party services (e.g. for sign language
>>> translation).
>>
>>
>> Right, so this highlights your point about some of these use cases 
>> potentially being facilitated by better service integration.  I do 
>> think that WebRTC itself could become the platform for some of these 
>> services, and we need to explore that. As services like alternate 
>> channels, or tracks can be added as third party things, or plug ins - 
>> but it seems to me that WebRTC has a potent architecture to become 
>> the facilitator for these services itself.
>>
>>> * Quality Synchronisation and playback
>>> Right now, there is no dedicated mechanism to transmit captions or 
>>> audio
>>> descriptions in sync with WebRTC audio and video streams; the current
>>> assumption is that using data channels with an out-of-band
>>> synchronization mechanism gives good-enough results, but I don't know
>>> how well that has been demonstrated.
>>
>>
>> Ok, this is then something that we need to look at and this could 
>> represent a good a11y use case for the WebRTC group.
>>
>>
>>> There have been discussions on enabling a firmer way for 
>>> synchronization
>>> based on the RTT standard from IETF (RFC 4103) - it probably 
>>> wouldn't be
>>> all that difficult from a technical perspective, but it hasn't been
>>> clear so far whether there was enough demand for it given the
>>> "good-enough" above.
>>> https://lists.w3.org/Archives/Public/public-webrtc/2018Jun/0140.html
>>>
>>> It would be useful if there was some research results we could 
>>> reference
>>> that would show if the current possibilities are indeed good enough 
>>> or not.
>>
>>> * Simultaneous Voice, Text & Signing
>>> I think I covered the relevant points on this in "Quality
>>> Synchronisation and playback" and "Live Transcription/Captioning"
>>>
>>> * Support for Real Time Text (RTT)
>>> Regarding RTT per se, see above. I don't know if there are any
>>> WebRTC-based systems that offer emergency-call integration today - it
>>> may be useful to research this since that may indeed bring up new
>>> requirements on WebRTC (beyond accessibility); I know some requirements
>>> of the existing specs were derived from such a scenario (e.g. disabling
>>> voice-activity-detection), but I don't know how thoroughly the scenario
>>> was analyzed.
>>> * Support for Video Relay Services (VRS) and Remote 
>>> Interpretation (VRI)
>>> I think for this one, the main question is one of integration /
>>> interoperability with third-party services; I noticed the IETF has
>>> looked at standardizing a way to use SIP with VRS services:
>>>    https://tools.ietf.org/html/draft-rosen-rue-00
>>>
>>> One question that this raised was the impact of such services on
>>> end-to-end security (since these services are *by design* equivalent to
>>> man-in-the-middle attacks).
>>>
>>>
>>> * Distinguishing Sent and Received Text
>>> As mentioned in "Text communication data channel", there are many
>>> protocols and APIs that can be used to exchange text messages beyond
>>> WebRTC; so that one is definitely not WebRTC specific, and likely
>>> depends on applications doing the right thing (it may be useful to
>>> document what "the right thing" would be e.g. in terms of ARIA 
>>> semantics).
>>>
>>>
>>> * Warning and recovery of lost data
>>> WebRTC Statistics https://www.w3.org/TR/webrtc-stats/ should enable any
>>> WebRTC-based app to signal to its users when there are network issues;
>>> I'm not sure there can be more standardized mechanisms for signaling
>>> these issues, or to recover from them since both would be very heavily
>>> application dependent (it would probably be a bad idea to signal all
>>> network losses since these are more or less guaranteed to happen very
>>> frequently).
>>>
>>>
>>> * Call status data and polling
>>> All of these features can be implemented with the current APIs - making
>>> the information available to screen-readers would be a matter of the
>>> application being developed correctly.
>>>
>>>
>>> * answered call / busy signal
>>> (this doesn't appear under a scenario heading - not sure if that's
>>> intentional?)
>>> Again, these are things that would need to be implemented by the
>>> application UI - detecting these states is made possible by WebRTC 
>>> APIs,
>>> but representing them is up to each application.
>>>
>>>
>>> * Bandwidth for audio / Bandwidth for video / Quality of video
>>> resolution and frame rate
>>> WebRTC lets applications prioritize the bandwidth dedicated to audio /
>>> video / data streams; there is also some experimental work in signaling
>>> these needs to the network layer: https://w3c.github.io/webrtc-dscp-exp/
>>>
>>> There is also support for prioritizing framerate over resolution in 
>>> case
>>> of congestion.
>>>
>>> Beyond that, WebRTC supports state-of-the-art codecs for both audio and
>>> video, which (if implemented correctly) should provide the best 
>>> possible
>>> audio / video quality in the given networking conditions.
>>>
>>> (again, this relies on WebRTC services doing the right thing)
>>>
>>>
>>> * Assistance for Older Users or users with Cognitive disabilities
>>> This sounds very application specific.
>>>
>>>
>>> * Personalised Symbol sets for users with Cognitive disabilities
>>> Given that the WebRTC specs do not standardize any of the UI that 
>>> WebRTC
>>> apps can build, I think this too would be application specific.
>>
>> Thanks
>>
>> Josh
>>
>>>
>>>
>>> HTH,
>>>
>>> Dom
>>>
>>> 1.
>>> https://www.w3.org/Team/wiki/Joconnor/WebRTC_use_cases_APA_review#User_Needs_and_Scenarios 
>>>
>>
>
-- 
Emerging Web Technology Specialist/A11y (WAI/W3C)
Received on Monday, 20 May 2019 10:15:48 UTC
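
The point made under "Warning and recovery of lost data" above (that an app can use WebRTC Statistics to warn users, but should not signal every loss) can be illustrated with a minimal sketch. It is not taken from the thread: the InboundSnapshot type, the function names and the 10% threshold are all hypothetical choices made for illustration. Only the field names packetsLost and packetsReceived mirror the "inbound-rtp" entries in WebRTC Statistics; in a browser they would presumably be read from RTCPeerConnection.getStats() at two points in time.

```typescript
// Hypothetical snapshot shape (an assumption for this sketch); the field
// names mirror packetsLost and packetsReceived from WebRTC Statistics.
interface InboundSnapshot {
  packetsLost: number;
  packetsReceived: number;
}

// Fraction of packets lost between two snapshots.
// Returns 0 when no packets were expected in the interval.
function lossFraction(prev: InboundSnapshot, curr: InboundSnapshot): number {
  // Clamp at 0: cumulative counters can be reset or move backwards.
  const lost = Math.max(0, curr.packetsLost - prev.packetsLost);
  const received = Math.max(0, curr.packetsReceived - prev.packetsReceived);
  const expected = lost + received;
  return expected === 0 ? 0 : lost / expected;
}

// Warn only when loss in the interval exceeds a threshold, because small
// losses happen frequently. The 0.1 default is an assumed value, not one
// given by any spec.
function shouldWarnUser(
  prev: InboundSnapshot,
  curr: InboundSnapshot,
  threshold: number = 0.1
): boolean {
  return lossFraction(prev, curr) > threshold;
}
```

An application could call shouldWarnUser on a timer and, when it returns true, announce the problem in an accessible way (for example through an ARIA live region); how and when to do that remains an application decision, as noted in the thread.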