RE: Incremental recognition, Unobtrusive response

Hi David,

Thanks for your comments. 

This sounds like a great use case. EMMA 2.0 [1] provides some capability for incremental inputs and outputs, but I think that’s only a building block for the whole use case: even with incremental input and output, the system still has to figure out how to respond. The Web Speech API [2] also supports incremental output for speech recognition; again, that’s just a building block.
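To illustrate the "building block" point, here is a minimal sketch of the gap between incremental recognition output and a response policy. Note this does not use the real Web Speech API: `MockRecognizer`, `RecognitionResult`, and the replayed hypotheses are all hypothetical stand-ins, loosely modeled on the API's interim-results behavior (`interimResults`, `isFinal`). The response policy at the end is the part no current spec provides.

```typescript
// A stand-in shape for an incremental recognition hypothesis,
// loosely modeled on SpeechRecognitionResult's isFinal flag.
interface RecognitionResult {
  transcript: string;
  isFinal: boolean;
}

type ResultHandler = (r: RecognitionResult) => void;

// Hypothetical recognizer that replays a fixed sequence of
// hypotheses: two interim guesses, then a final result.
class MockRecognizer {
  private handlers: ResultHandler[] = [];
  onResult(h: ResultHandler): void {
    this.handlers.push(h);
  }
  start(): void {
    const results: RecognitionResult[] = [
      { transcript: "incre", isFinal: false },
      { transcript: "incremental recog", isFinal: false },
      { transcript: "incremental recognition", isFinal: true },
    ];
    for (const r of results) {
      this.handlers.forEach((h) => h(r));
    }
  }
}

// A toy response policy: backchannel on interim results
// ("unobtrusive response"), full reply only on the final one.
// This decision logic is what sits beyond the building blocks.
const log: string[] = [];
const rec = new MockRecognizer();
rec.onResult((r) => {
  if (r.isFinal) {
    log.push(`reply: heard "${r.transcript}"`);
  } else {
    log.push("backchannel: mm-hm");
  }
});
rec.start();
console.log(log.join("\n"));
```

The interesting design questions all live in that last handler: when to backchannel, when to stay silent, and when to commit to a full response given a still-changing hypothesis.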

It would be very interesting if you could post a more detailed description of this use case to the list, and if you have a proposal, that would be interesting too.

If you have links to SARA and MACH, that would also be helpful. 

Best,

Debbie

[1] https://www.w3.org/TR/emma20/

[2] https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html 


From: David Pautler [mailto:david@intentionperception.org] 
Sent: Monday, November 14, 2016 8:06 PM
To: public-voiceinteraction@w3.org
Subject: Incremental recognition, Unobtrusive response


There are several multimodal virtual agents, such as MACH and SARA, that provide partial interpretation of what the user is saying or expressing facially ("incremental recognition") as well as backchannel 'listener actions' ("unobtrusive response") based on those interpretations. This style of interaction is much more human-like than the strictly turn-based style of VoiceXML (and related W3C specs) and of all chatbot platforms I'm aware of.

Is this interaction style (which might be called "IRUR") among the use cases of any planned update to a W3C spec?

Cheers,
David

Received on Tuesday, 15 November 2016 03:18:01 UTC