Mismatches between semantic results and VoiceXML fields

Hi all,

I would like to suggest a change to the VoiceXML specification.
I refer to section 3.1.6.4, point 3: "Mismatches between semantic
results and VoiceXML fields" of the VoiceXML 2.0 specification [VXML2.0].

This concerns the following two cases of inconsistency between the
grammar slots of a form level grammar and the fields in the form:

a) the form level grammar allows utterances which fill only slots that
don't correspond to fields in the form.
b) the form level grammar allows utterances which don't fill any slots
at all.

(see below for an example.)

If such an utterance is recognized, no field is filled, but also no
nomatch handler is executed, since the grammar was matched.
According to [VXML2.0] Section 3.1.6.4 item 3, how to deal with this
case is still under discussion.

Currently the behaviour of the FIA indicated by [VXML2.0] is to just
ignore the caller's input and execute the same form element again. This
amounts to repeating the same prompt until the caller either answers
differently or hangs up.

I think that a mismatch as described above always indicates a developer's
error or at least sloppy programming. With the current platform
behaviour the developer has no way to log such a mismatch,
because no event handler is executed. There would be no indication of why
the caller hung up.

Therefore I recommend that the platform throw specific error events
when it detects a mismatch as described above.
They could be called, for example:
a) error.semantic.badutterance.infounhandled
b) error.semantic.badutterance.noinfo

This would allow the VoiceXML developer to log the mismatches and take
care to eliminate them by modifying either the grammar or the dialog.
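
To make the two cases concrete, here is a minimal sketch in Python of the check a platform would have to perform. It is only an illustration: the function name classify_result and the representation of the recognition result as a plain dict of slots are my assumptions, and the event names are the ones proposed above, not anything defined in [VXML2.0].

```python
# Proposed event names from this mail (not part of [VXML2.0]).
INFO_UNHANDLED = "error.semantic.badutterance.infounhandled"
NO_INFO = "error.semantic.badutterance.noinfo"


def classify_result(slots, field_names):
    """Return the proposed event for a matched utterance, or None.

    slots       -- dict of slot name -> value from the form level grammar
                   (hypothetical representation of the semantic result)
    field_names -- names of the fields declared in the form
    """
    if not slots:
        # Case b): the grammar matched, but no slot was filled at all.
        return NO_INFO
    if not any(name in field_names for name in slots):
        # Case a): slots were filled, but none corresponds to a field.
        return INFO_UNHANDLED
    # At least one field can be filled: no mismatch to report.
    return None


fields = {"format", "price", "resolution"}
print(classify_result({"focus": "manual"}, fields))    # case a)
print(classify_result({}, fields))                     # case b)
print(classify_result({"price": "200"}, fields))       # normal case
```

The check only needs the slot names of the result and the field names of the form, so it could run at the point where the platform maps the semantic result onto the form, before the same form item is selected again.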

To avoid the impression that the dialog loops, the event handler could
provide a special prompt like: "Your utterance contained no information
I can handle.", or it could just reprompt in a more specific way.
This will probably not lead to very elegant dialogs, but my main
point is not to hide sloppiness in the dialog design behind a good event
handler, but to make such mismatches detectable at all.

I don't know why the VoiceXML committee was so undecided about this point.
If anybody has a realistic example where the suggested events would be
a disadvantage to the VoiceXML developer, please let me know.

Best regards,

Claudia Daboul

------------------------------------------------------------------------
Example:

case a)
A dialog for ordering a digital camera might have fields for different
features, like format, price, resolution etc.

A mixed initiative form could start with the initial prompt:

"What features should your camera have?"

Allowed utterances might be

"I want a compact digital camera which costs less than 200 dollars."
"At least 5 megapixels."
etc.

Now suppose that the grammar allows a camera feature, say "manual 
focus", that the dialog doesn't handle, maybe because the shop's range 
of products doesn't contain a model with this feature.

Then the dialog could proceed like this:

Computer:  "What features should your camera have?"

Caller: "I want a camera with manual focus"

Computer (remaining in the same state as before): "What features should 
your camera have?"

Caller (if he thinks that the computer did not understand):  "manual
focus", or
Caller (if he thinks that the computer understood his first wish):  "it
should also have at least 5 megapixels"

In both cases the caller will in the end be frustrated with the dialog, 
and the operator of the dialog has no means of detecting the reason why 
the caller hangs up or gets confused.
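
The loop in case a) can be reproduced with a very simplified model of the form in Python. This is only a sketch under my own assumptions: select_item and apply_result are hypothetical names, the real FIA has more steps than this, and "initial" simply stands for the form's opening prompt.

```python
FIELDS = ("format", "price", "resolution")


def select_item(filled):
    """Pick the next form item: the initial prompt while nothing is
    filled, otherwise the first field that is still empty."""
    if not filled:
        return "initial"
    for name in FIELDS:
        if name not in filled:
            return name
    return None  # form complete


def apply_result(filled, slots):
    """Copy only slots that correspond to fields; everything else is
    silently dropped, as described for the current behaviour."""
    for name, value in slots.items():
        if name in FIELDS:
            filled[name] = value


filled = {}
before = select_item(filled)
apply_result(filled, {"focus": "manual"})  # "I want a camera with manual focus"
after = select_item(filled)
print(before, after)  # the same item is selected again
```

Nothing in the state changes between the two selections, so neither the dialog nor a log can tell this turn apart from one that never happened.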

case b) (no slots filled) is usually less dramatic.
For example, the grammar might allow an utterance like:
"I want a digital camera."
which doesn't fill any slot because the shop has only digital cameras
anyway.
In this case it might be sufficient to answer with a slightly different
prompt. Still, it would be a good idea to monitor the frequency of such
utterances to decide whether the dialog should be changed.

Received on Saturday, 2 April 2005 00:55:04 UTC