- From: Cameron Cundiff <cameron@ckundo.com>
- Date: Sun, 17 Mar 2019 13:31:59 -0400
- To: Joseph K O'Connor <josephoconnor@mac.com>
- Cc: lw@tetralogical.com, Claudio Luis Vera <claudio@simple-theory.com>, public-voice-assistant@w3.org
Hi Joe, thanks for sharing that. I brought up AAC in my list of interests, exactly as in your example, in the context of a voice assistant. I joined the group based on my interest in the accessibility of voice interfaces, and AAC is a part of that. I'm less focused on the technical aspects of AAC itself, however, and more interested in how voice assistants can accommodate AAC.

Best,
Cameron

> On Mar 17, 2019, at 12:50 PM, Joseph K O'Connor <josephoconnor@mac.com> wrote:
>
> Thanks Léonie. There are others who are also interested in working toward AAC standards. I want to channel our energy. Finding or creating the right workgroup to discuss these matters is essential.
>
> Not to confuse matters, but my daughter asks Siri to play her music. Using a speech generating device (SGD) she says "Hey Siri, play the Beatles Yellow Submarine." and Siri does it. By controlling the voice assistant with her SGD she gains some small bit of independence.
>
> This is the essence of my mission: to promote independence.
>
> Joseph
>
>> On Mar 17, 2019, at 4:08 AM, Léonie Watson <lw@tetralogical.com> wrote:
>>
>> No-one is brushing anyone off here. I know Joe and have met his daughter, and understand his motivation very well.
>>
>> We're just exploring possibilities here. All I can speak to is the reason the CG was created in the first place. As I noted in an earlier email in this thread, though, it's up to the CG to decide where we want to take things, and whether indeed we need more than one CG.
>>
>> It may be that the Conversational UI CG is a better fit for AAC standardisation, it might not. It might be that it needs a CG of its own, it might not. I don't have the answers to these questions.
>>
>> Léonie.
>>
>>> On 16/03/2019 18:58, Claudio Luis Vera wrote:
>>> Joe brings up a field that most of us who work in web accessibility are barely aware of.
>>> AAC allows people who are non-verbal to communicate through synthesized speech, and it is typically the only means available to them for verbal communication. AAC developers have historically worked with bespoke solutions, and manufacturers have not been diligent about backward and forward compatibility. This has left communicators like Joe's daughter Siobhan virtually helpless when their AAC system eventually fails.
>>>
>>> I would hate to see Joe's concerns being brushed off as off-topic in this forum. Instead, I think we should take a more holistic approach to voice UI. Today's voice assistants are conversational interfaces that are primarily geared toward gathering voice input from a user in order to return content from a remote source through speech.
>>>
>>> AAC reverses this challenge: a user like Joe's daughter should have the most frictionless input means available, in order to select the words that will be output through synthesized speech. The best solutions would look at reducing friction and speeding up that process through any means possible (e.g. eye gaze, switches, autocomplete, AI, machine learning). Today's AAC systems typically don't take advantage of smart technologies yet.
>>>
>>> In addition, Joe brings up a huge interoperability and portability challenge. A standardized approach like package.json for capturing configuration settings and dependencies would take care of many of these issues. I can't fathom that the other data could not be ported through a typical data migration as a volunteer hacking project.
>>>
>>> We really should broaden our approach so that portability and forward compatibility are front and center, and so that AAC and voice output are also included.
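The package.json analogy above can be sketched concretely. The manifest below is purely illustrative: every field name (schemaVersion, grids, buttons, and so on) is a hypothetical invention for this sketch, not part of any existing AAC product or proposed standard.

```python
import json

# A hypothetical portable AAC profile manifest, loosely modeled on
# package.json. All field names here are illustrative only.
profile = {
    "schemaVersion": "1.0",  # version of the (hypothetical) manifest format
    "communicator": {"displayName": "Siobhan"},
    "grids": [
        {
            "id": "home",
            "buttons": [
                # Each button captures a label plus a vendor-neutral action.
                {"label": "Music", "action": {"type": "navigate", "target": "music"}},
                {"label": "Yes", "action": {"type": "speak", "text": "Yes"}},
            ],
        }
    ],
    # Referenced media travel with the profile, like files in an npm package.
    "assets": [{"id": "photo-mom", "type": "image", "path": "assets/mom.png"}],
}

# Serialize so the profile can be exported from one device and imported on another.
manifest = json.dumps(profile, indent=2)
print(manifest)
```

A vendor importing such a manifest would map these neutral fields onto its own grid and button model, much as an LMS maps a SCORM package onto its own course structure; the hard standardisation work would be agreeing on the field vocabulary, not the serialization format.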
>>> On Sat, Mar 16, 2019 at 7:53 AM Léonie Watson <lw@tetralogical.com <mailto:lw@tetralogical.com>> wrote:
>>> On 16/03/2019 12:53, Cameron Cundiff wrote:
>>>> From what I can tell, the original intention was to focus on the design of conversational interfaces with voice assistant platforms specifically, as opposed to voice as an input mechanism, or core text-to-speech and speech-to-text tech. Does that sound right, Léonie?
>>> More or less, yes. The idea was to look at whether we could come up with a way to code once and deploy across multiple platforms.
>>> Léonie.
>>>>
>>>> Best,
>>>> Cameron
>>>>
>>>>> On Mar 16, 2019, at 7:30 AM, Joseph K O'Connor <josephoconnor@mac.com <mailto:josephoconnor@mac.com>> wrote:
>>>>>
>>>>> Interoperability of databases is my first goal.
>>>>>
>>>>> Manufacturers of learning management systems (WebCT, Blackboard, and Desire2Learn are examples) have agreed to make courseware interoperable. The standard is SCORM, the Shareable Content Object Reference Model.
>>>>>
>>>>> At its core, SCORM allows content authors to distribute their content to a variety of Learning Management Systems (LMS) with the smallest headache possible, and allows an LMS to handle content from a variety of sources.
>>>>>
>>>>> In the same way, there is a need for users of AAC systems to load the databases they have created on one system onto another system.
>>>>>
>>>>> Told from the point of view of one communicator, here is some info about AAC systems and possible areas where standards will help:
>>>>>
>>>>> http://accessiblejoe.com/wizard/
>>>>>
>>>>> Joseph
>>>>>
>>>>>> On Mar 16, 2019, at 2:46 AM, Léonie Watson <lw@tetralogical.com <mailto:lw@tetralogical.com>> wrote:
>>>>>>
>>>>>> I don't know much about Augmentative and Alternative Communication (AAC) systems. Can you give us a simple description or point to some good descriptions elsewhere?
>>>>>>
>>>>>> Also, what would standardisation look like for an AAC system? What are the things that could be standardised?
>>>>>>
>>>>>> Léonie.
>>>>>>
>>>>>>> On 16/03/2019 02:42, Joseph K O'Connor wrote:
>>>>>>> I'm interested in talking about standards for AAC systems. For instance, databases are not interoperable, even between different devices by the same manufacturer. This has very serious effects. Each time my daughter has to switch devices we have to remake all the grids, buttons, button behaviors, and links between pages, find and upload pictures of people she interacts with, and deal with subtle changes introduced by the new software. Who will do this when we're gone? I fear for her future.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Joe
>>>>>>>
>>>>>>>> On Mar 15, 2019, at 8:34 AM, Cameron Cundiff <cameron@ckundo.com <mailto:cameron@ckundo.com>> wrote:
>>>>>>>>
>>>>>>>> Thanks Léonie. I’ll chime in with my interests too.
>>>>>>>>
>>>>>>>> I’m curious to find emergent practices in voice UI design, and to figure out how to document and influence them.
>>>>>>>>
>>>>>>>> Examples include: how to offer non-verbal alternatives to speech input for non-verbal users; expectations for accent support and internationalization; accommodations for AAC users and delayed speech; volume controls and defaults; and enabling and disabling speech input and playback. To name a few.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Cameron
>>>>>>>>
>>>>>>>>> On Mar 15, 2019, at 11:16 AM, Léonie Watson <lw@tetralogical.com <mailto:lw@tetralogical.com>> wrote:
>>>>>>>>>
>>>>>>>>> I think the original reason for this CG was to explore standardisation across the different voice assistants.
>>>>>>>>>
>>>>>>>>> This was in part an attempt to avoid the enduring problem already evident with native mobile development: cross-platform production is costly and complicated.
>>>>>>>>>
>>>>>>>>> There is also a counterpart in the UI that is far more common than it is for mobile: the burden of learning and swapping between assistants is high, but because of the significant differences in their capabilities, it's increasingly common to find households with devices from multiple providers.
>>>>>>>>>
>>>>>>>>> That doesn't mean the CG needs to continue along this path, though we might need a name change if we alter course!
>>>>>>>>>
>>>>>>>>> Phil, can you say more about the things you mentioned? I'm not quite sure I understood the sort of thing you'd like the CG to explore.
>>>>>>>>>
>>>>>>>>> Perhaps, with all the possibilities, it would help to throw out some suggestions as to the deliverables we might produce?
>>>>>>>>>
>>>>>>>>> Léonie.
>>>>>>>>>
>>>>>>>>>> On 15/03/2019 14:33, Phil Archer wrote:
>>>>>>>>>> I don't speak for others, but from my own POV we're not talking about established voice assistants like the ones you mention, no. My own interest - and I'm being led by Brian Subirana - is in talking to/about products ('cos GS1 is about commerce). Things like wake words that can be referenced - Brian might be able to jump in and say more.
>>>>>>>>>> But to come to your point - I'd certainly be interested in voice UI in general, not specifically voice assistants.
>>>>>>>>>> Phil
>>>>>>>>>>> On 15/03/2019 14:17, Cameron Cundiff wrote:
>>>>>>>>>>> Hi folks,
>>>>>>>>>>>
>>>>>>>>>>> Thinking about our focus on voice assistants and the limits of that.
>>>>>>>>>>>
>>>>>>>>>>> I think conversational interfaces are a narrow subset of voice UI, are platform-specific in implementation and design, and are limited modalities compared to generalized voice commands.
>>>>>>>>>>>
>>>>>>>>>>> It’d be easier, in my opinion, to talk about standards for voice UI than for assistants specifically, because these assistants operate with different mental models compared to one another.
>>>>>>>>>>>
>>>>>>>>>>> Is this CG exclusively focused on Alexa, Google Assistant, Siri, etc., or can it reach into general voice input for AR and VR, web, apps, etc.?
>>>>>>>>>>>
>>>>>>>>>>> Is it limited to conversational interfaces, or can it include single-turn commands, earcons, and speech playback?
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Cameron
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Phil Archer
>>>>>>>>>> Director, Web Solutions, GS1
>>>>>>>>>> https://www.gs1.org
>>>>>>>>>> http://philarcher.org
>>>>>>>>>> +44 (0)7887 767755
>>>>>>>>>> @philarcher1
>>>>>>>>>> Skype: philarcher
>>>>>>>>>> CONFIDENTIALITY / DISCLAIMER: The contents of this e-mail are confidential and are not to be regarded as a contractual offer or acceptance from GS1 (registered in Belgium).
>>>>>>>>>> If you are not the addressee, or if this has been copied or sent to you in error, you must not use data herein for any purpose, you must delete it, and should inform the sender.
>>>>>>>>>> GS1 disclaims liability for accuracy or completeness, and opinions expressed are those of the author alone.
>>>>>>>>>> GS1 may monitor communications.
>>>>>>>>>> Third party rights acknowledged.
>>>>>>>>>> (c) 2016.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> @TetraLogical TetraLogical.com
>>>
>>> --
>>> User Experience | Information Architecture | Accessibility
>>> simple-theory.com <https://simple-theory.com/>
>>> +1 954-417-4188
Received on Sunday, 17 March 2019 17:32:27 UTC