RE: VoiceXML 2 and multimodal input from Kenneth Rehor on 2002-05-23 (www-voice@w3.org from April to June 2002)

From: Kenneth Rehor <ken@nuance.com>
Date: Thu, 23 May 2002 16:57:59 -0700
To: "Aldebaro Klautau" <a.klautau@ieee.org>, <www-voice@w3.org>
Message-ID: <CDC9B77B2FE039409212EF70BB97F5F40266BC1A@mpb1exch01.nuance.com>

> -----Original Message-----
> From: Aldebaro Klautau [mailto:aklautau@ucsd.edu]
> Sent: Thursday, May 23, 2002 4:08 PM
> To: www-voice@w3.org
> Subject: VoiceXML 2 and multimodal input
>
> Could someone please tell me if VoiceXML 2 supports multimodal input? 

Yes.

By design, VoiceXML supports "spoken or character input" (see Section 1.2.1).
"Character input" was never meant to be limited to DTMF keypad input, but
telephony-only VoiceXML implementations typically accept only DTMF: the
digits 0-9, *, and #, plus the four "extra" DTMF signals, usually designated
A, B, C, and D, that were originally available only on "AUTOVON" phones.

The <grammar> element currently supports the "voice" and "dtmf" input modes,
though others could be supported.
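
As a rough illustration only (not taken from the specification text), a
single field can carry one grammar for each input mode, so the caller may
either speak or key in the answer. The form id, field name, rule ids, and
vocabulary below are made up for the example, and the attribute names follow
my reading of the VoiceXML 2.0 and speech grammar drafts, so check them
against the version your platform implements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sketch: all ids, names, and vocabulary are illustrative. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="pick_drink">
    <field name="drink">
      <prompt>Say coffee or tea, or press 1 for coffee or 2 for tea.</prompt>

      <!-- Spoken input: matches the words "coffee" or "tea". -->
      <grammar mode="voice" version="1.0" root="drink_voice"
               xml:lang="en-US" type="application/srgs+xml">
        <rule id="drink_voice">
          <one-of>
            <item>coffee</item>
            <item>tea</item>
          </one-of>
        </rule>
      </grammar>

      <!-- Character input: matches the DTMF keys 1 or 2. -->
      <grammar mode="dtmf" version="1.0" root="drink_dtmf"
               type="application/srgs+xml">
        <rule id="drink_dtmf">
          <one-of>
            <item>1</item>
            <item>2</item>
          </one-of>
        </rule>
      </grammar>

      <filled>
        <prompt>Thank you.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Both grammars are active while the field is being collected, and whichever
one matches first fills the field. This sketch leaves out semantic
interpretation, so the application would still need to map "1" and "coffee"
to the same value.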

I think the larger question is one of application requirements and system 
design:  what are the components, how are they interconnected, and how 
do you program them?

> For example, a page would allow an Internet user to provide information by
> either typing and / or speaking (someone in favor of SALT told me that this
> is not possible with VoiceXML).
> 
> Thanks,
> Aldebaro


Several companies have demonstrated multimodal applications and systems 
that accept input via touchscreen and voice, with screen and audio output,
using VoiceXML to control the voice dialog in a variety of configurations.

Ken

Received on Thursday, 23 May 2002 19:59:26 UTC