RE: Barge-in types in VoiceXML

From: <Jesper.Olsen@nokia.com>
Date: Fri, 4 Jan 2002 21:39:52 +0200
Message-ID: <58E9549287153543B1D95B91C332BD73680607@esebe016.NOE.Nokia.com>
To: ranjansharma@lucent.com, www-voice@w3.org
Speech and "noise" (or DTMF) signals may well have the same energy,
but they can be discriminated by looking at the energy distribution
in different frequency subbands;
a speech signal has its own characteristic energy distribution.
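
A minimal sketch of the subband idea, not taken from any particular VoiceXML
platform: it builds a DTMF tone and a white-noise frame with the same RMS
energy, then compares how that energy is spread over a few frequency bands.
The sampling rate, frame length, band edges, and the helper names `rms` and
`band_energy_fractions` are all assumptions made for illustration. A real
speech detector would compare such a profile against one typical of speech.

```python
import cmath
import math
import random

SAMPLE_RATE = 8000   # Hz; assumed telephony sampling rate
FRAME_LEN = 256      # samples per analysis frame (assumed, 32 ms)
BAND_EDGES = [0, 500, 1000, 1500, 2000, 3000, 4000]  # Hz; illustrative subbands


def rms(frame):
    """Root-mean-square level: all that a pure energy detector looks at."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def band_energy_fractions(frame, edges_hz=BAND_EDGES, sample_rate=SAMPLE_RATE):
    """Fraction of the frame's spectral energy in each [lo, hi) subband."""
    n = len(frame)
    energies = [0.0] * (len(edges_hz) - 1)
    for k in range(1, n // 2):  # skip DC, stay below Nyquist
        freq = k * sample_rate / n
        # Naive DFT bin; fine for a sketch, an FFT would be used in practice.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        for b in range(len(energies)):
            if edges_hz[b] <= freq < edges_hz[b + 1]:
                energies[b] += abs(coeff) ** 2
                break
    total = sum(energies) or 1.0
    return [e / total for e in energies]


# DTMF digit "1": 697 Hz row tone plus 1209 Hz column tone.
dtmf_frame = [0.5 * math.sin(2 * math.pi * 697 * i / SAMPLE_RATE)
              + 0.5 * math.sin(2 * math.pi * 1209 * i / SAMPLE_RATE)
              for i in range(FRAME_LEN)]

# White noise, rescaled so its RMS energy matches the DTMF frame exactly.
rng = random.Random(0)
raw_noise = [rng.gauss(0.0, 1.0) for _ in range(FRAME_LEN)]
gain = rms(dtmf_frame) / rms(raw_noise)
noise_frame = [gain * x for x in raw_noise]

for name, frame in (("dtmf", dtmf_frame), ("noise", noise_frame)):
    fractions = [round(f, 2) for f in band_energy_fractions(frame)]
    print(name, "rms=%.3f" % rms(frame), fractions)
```

Both frames have the same energy, so an energy threshold alone cannot tell
them apart, but the DTMF frame piles nearly all of its energy into the two
bands holding its tones while the noise frame spreads it across all bands.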

The last barge-in category (recognition) lets the recogniser decide
whether or not a word has been spoken.
In principle this allows a more informed decision to be made,
but unfortunately not without a certain delay: often the barge-in
decision will arrive "too late", and the system will appear sluggish
to the user.

Cheers
Jesper

> -----Original Message-----
> From: ext Sharma, Ranjan (Ranjan) [mailto:ranjansharma@lucent.com]
> Sent: 04 January, 2002 21:23
> To: www-voice@w3.org
> Subject: Barge-in types in VoiceXML
> 
> 
> Hi,
>
> In the VoiceXML 2.0 specifications, the barge-in types enumerated are:
>
>       energy, speech and recognition.
>
> The description for both energy and speech reads the same:
> "The prompt will be stopped if speech or a DTMF tone is detected."
>
> I am not sure how the distinction would work. Conceptually, what is
> the difference between energy and speech in this context?
> 
>  Thanks,
>  Ranjan 
> 
Received on Friday, 4 January 2002 14:40:01 GMT
