Bargein as defined in VoiceXML 2.0 and 2.1 - questions and comments

Dear VBWG,

 

I came across this old thread again [Evidently someone is confused as to
what "barge in" means.] about bargein and bargeintype = "hotword", which
never received a proper answer from the Committee. Is there an answer
now that 2.1 is out and 3.0 is in progress?

 

What Dean expresses in his mail is exactly how I see barge-in, and it is
how we implemented it. My interpretation of bargein is: bargein happens
when the user actively gives input while barge-able media is playing,
and by doing so causes the media play to terminate.

My original thought was that "hotword" bargein differs from "speech"
bargein only in the definition of what is treated as input that causes
media play to terminate, and that it has nothing to do with the outcome
of the collection.


 

With the current definition in the VoiceXML 2.0 Recommendation, section
4.1.5.1, bargeintype "hotword" specifies collection handling more than
actual barge-in behavior. I think this should be clarified somehow.

 

Reading that section and the defined consequences of "hotword" easily
leads one to think that the outcome of the user entering incorrect input
during the timeout period is twofold (and by my definition no barge-in
even occurred): if bargeintype "hotword" was used, nothing happens and
noinput is thrown; if bargeintype "speech" was used, nomatch is thrown.

 

Still, it is defined for input that starting input during the timeout
period causes the timeout to be cancelled and the interdigit timeout or
termtimeout to be used. The only exception here is an exact match with
no termchar defined, which ends the collection immediately.
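
A minimal sketch of the timer selection described above, as I read it.
The function name next_timer and its parameters are my own illustration
and do not come from the specification.

```python
def next_timer(input_started, grammar_matched, more_input_possible, termchar):
    """Pick which timer governs the wait for the next DTMF key.

    Returns None when the collection ends immediately.
    """
    if not input_started:
        # Nothing entered yet: the noinput timeout is still running.
        return "timeout"
    if grammar_matched and not more_input_possible:
        # Exact match: wait for the terminator only if one is defined.
        return "termtimeout" if termchar else None
    # Input has started and more digits may follow.
    return "interdigittimeout"
```

With an empty termchar and an exact match the function returns None,
which corresponds to the one exception mentioned above.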

 

It would make sense to me if bargeintype "hotword" affected only those
collections that _start_ during prompt play (bargein) and _end_ while
the prompt is still playing or before the timeout period has elapsed. In
the case of non-bargeable prompt(s), the bargeintype property would make
no difference, since no prompt barge-in can occur and input starts at
the earliest at the beginning of the timeout period.
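
To make the proposal concrete, here is a rough sketch of the rule as I
would expect it to behave. The names Phase and collection_outcome, and
the result strings "match" and "discarded", are hypothetical and only
illustrate my reading; they are not taken from the specification.

```python
from enum import Enum


class Phase(Enum):
    PROMPT = 1         # barge-able prompt is still playing
    TIMEOUT = 2        # prompt finished, timeout period still running
    AFTER_TIMEOUT = 3  # the nominal timeout period has already passed


def collection_outcome(bargeintype, started, ended, matched):
    """Result of one finished collection under the proposed rule."""
    if ended.value < started.value:
        raise ValueError("collection cannot end before it starts")
    if matched:
        return "match"
    hotword_applies = (
        bargeintype == "hotword"
        and started is Phase.PROMPT
        and ended in (Phase.PROMPT, Phase.TIMEOUT)
    )
    # "discarded" means the input is ignored and the prompt/timeout keep
    # running; noinput follows only if nothing else arrives in time.
    return "discarded" if hotword_applies else "nomatch"
```

Under this sketch, non-matching input that starts during the prompt and
ends before the timeout has elapsed is silently discarded, while the
same input started during the timeout period, or carried on past it,
gives nomatch.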

 

I guess this is the original idea behind hotword, since in VoiceXML it
is quite easy to restart a collection in case of <nomatch>, but
bargeintype "hotword" is currently our only tool for preventing
incorrect input from interrupting prompt play.

 

DTMF input that does not match any grammar will cause the system to
collect more digits until the interdigit timeout has elapsed, and
eventually to throw nomatch. If bargeintype "hotword" is used, should
only the initial DTMF that put the system into this state be discarded,
or should the complete collection be discarded? This is not defined, but
by analogy with voice input collection, discarding the complete
collection sounds better to me.

 

Here are a few examples of the "hotword" barge-in case, where the user
may enter any number of DTMF "1"s.

 

Here is the timing sequence for the case where the caller keeps entering
DTMF-1 past the timeout period and then presses DTMF-2. (By the
definition we are no longer in the timeout period, so nomatch should be
thrown!)

 

NI      = NOINPUT
NM      = NOMATCH
--IDT-- = Interdigit timeout period

 

| PROMPT PLAY      | TIMEOUT    |
| Bargeintype="HW" |            |
--------------------------------------
    \--IDT-\--IDT-\--IDT-\--IDT-\--IDT-\--IDT-\--IDT--\
     DTMF-1 DTMF-1 DTMF-1 DTMF-1 DTMF-1 DTMF-1 DTMF-2  NOMATCH

 

Here is another sequence. The user starts entering an incorrect sequence
during prompt play. Since the bargein input was started during prompt
play and the timeout had not elapsed when the first input was completed,
the collection results in noinput.

 

| PROMPT PLAY      | TIMEOUT    |
| Bargeintype="HW" |            |
--------------------------------------
    \--IDT-\--IDT--\--IDT--\    \
     DTMF-1 DTMF-1 DTMF-2  NM   NI

 

 

Here is yet another sequence. Since input was started during the timeout
period, it should be treated as non-bargein input, follow normal input
collection, and result in nomatch.

           

| PROMPT PLAY      | TIMEOUT    |
| Bargeintype="HW" |            |
--------------------------------------
                     \--IDT-\--IDT--\--IDT--\
                      DTMF-1 DTMF-1 DTMF-2  NM

 

The DTMF timing diagrams in the VoiceXML specification do not cover any
of these hotword cases, nor do they cover any of the failing cases.
Defining those would clear up a lot.

 

Was this the idea you had in mind? Or do you really mean that with
"hotword" there is no such thing as nomatch (which limits the VoiceXML
developer quite a lot, since it removes some vital information about
user input), and that barging a prompt does not actually mean giving
input during prompt play?

 

Please keep these points in mind when you define 3.0. Currently the
working draft is such a skeleton of open ideas that commenting on it is
quite hard indeed.

 

BR

- Teemu

Received on Tuesday, 17 February 2009 13:47:34 UTC