- From: Moshe Yudkowsky <speech@pobox.com>
- Date: Thu, 22 Jun 2006 13:20:49 -0500
- To: www-voice@w3.org
Enhanced Model for CCXML State Management
My name is Moshe Yudkowsky; among other speech-related work, I develop
VoiceXML and CCXML applications. I have a proposal to make that will
improve how CCXML manages state variables. In particular, I will propose a
radical -- yet incremental -- change to the conceptual model of CCXML state
management.
Here's a very simple use case: a user calls into an application. The
application then does two things simultaneously:
(a) the application answers the phone to say "Welcome to the Acme Call
System."
(b) the application uses ANI/caller ID to check the identity of the caller
and see if the caller is authorized for use of the system.
This use case isn't far-fetched or unusual; many applications require
authorization, customization, etc. based on Caller ID. Similar use cases of
multitasking in CCXML are easy to find. (If you'd like an example of this
particular use case, see the open-source "Voice Conference Manager" at
http://vcm.sourceforge.net.)
The interesting part of this use case is that the application must manage
two different activities. The first activity is to answer the phone and
speak the dialog to the user. The second activity is to send an HTTP
request to check for authorization. Once these two separate activities
complete, the rest of the application logic will proceed.
The easiest and most foolproof way to handle each of these two activities
is through subprocesses. I prefer to send the ANI/Caller ID to a separate
CCXML process which interacts with the database; this CCXML script sends
HTTP requests, waits for the result, handles error conditions, timeouts,
re-requests, etc. and sends the ultimate result back to the main CCXML
script. At the same time, the announcement to the user is handled via a
VoiceXML script, and when that script completes the result (event) is
returned to the main CCXML script.
Here's the problem: CCXML does not provide any facility to manage these two
simultaneous subprocesses. CCXML is an event-driven state machine, and it
only tracks one state at a time. However, the application has two states to
track: the state of the database lookup (did it return? what result did it
return?) and the state of the VoiceXML dialog (did it return?). And, of
course, there is the original state, namely the state of the overall main
script.
I solve this problem by using a state ("WAIT_FOR_ANSWERS") that has its own
private state variables ("DB_LOOKUP_STATE," "ANNOUNCE_STATE"). Instead of
using CCXML's facilities to manage my state variable, I have to build and
manage my own state information within the context of CCXML: When the
VoiceXML dialog finishes, I have to check to see if the authorization
lookup has finished; when the authorization lookup finishes, I have to
check if the VoiceXML dialog has finished. Please note that if there's a
significant gap between the end of dialog and the end of authorization, I
might even have to process a timeout and insert a "one moment please"
dialog for the user, a further complication.
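As a rough illustration of the bookkeeping described above, here is a
minimal sketch in Python rather than CCXML. The class name, the event names
("dialog.exit", "authorization.result", "timeout"), and the returned action
strings are all hypothetical, chosen only to show how each handler has to
cross-check the other activity by hand.

```python
# Sketch only: hand-rolled bookkeeping for the WAIT_FOR_ANSWERS state.
# Event names and action strings are hypothetical, not defined by CCXML.

class WaitForAnswers:
    def __init__(self):
        self.announce_state = "PENDING"   # private ANNOUNCE_STATE variable
        self.db_lookup_state = "PENDING"  # private DB_LOOKUP_STATE variable
        self.authorized = None

    def on_event(self, name, payload=None):
        """Record one event and return the action the main script should take."""
        if name == "dialog.exit":
            self.announce_state = "DONE"
        elif name == "authorization.result":
            self.db_lookup_state = "DONE"
            self.authorized = bool(payload)
        elif name == "timeout":
            # Dialog finished but the lookup has not: keep the caller informed.
            if self.announce_state == "DONE" and self.db_lookup_state == "PENDING":
                return "play_one_moment_please"
            return "wait"
        else:
            return "ignore"

        # Both completion handlers end with the same manual cross-check.
        if self.announce_state == "DONE" and self.db_lookup_state == "DONE":
            return "proceed" if self.authorized else "reject"
        return "wait"
```

Even in this small form, three handlers share two private variables, and
each new parallel activity would add another variable to every check.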
In my experience, private state management is both error-prone and
difficult. Most of all, it's annoying -- after all, CCXML provides
perfectly good state management!
In my opinion, CCXML-based applications are not really state machines: They
are actually Petri nets. Unlike the assumptions of the current CCXML model,
it's not always a single event that drives the state transition; several
events may need to occur to transition to the next state (such as this use
case, where the application has to wait for the end of dialog and the
result of authorization).
Here's a possible incremental change to the CCXML state management system.
CCXML would continue to manage just one "main" state variable but would
allow more complex event conditions, as seen in Petri nets. One way CCXML
can incorporate Petri-net capabilities is by adding logic operators to the
transition element's "event" attribute. For example, "a + b" would mean
"pend until both event A and event B have been received." We'd also need a
way to express the idea of "pend until A and B both arrive, but only as
long as no other event is received for this specific state." And certainly there
are other formulations ("A or B", "A and B but not C").
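The "a + b" idea can be sketched outside CCXML to show the intended
semantics. The following Python is an assumption-laden illustration, not a
proposed implementation: the class, the parsing of "+", and the choice to
reset the join when an unrelated event arrives in strict mode are all
hypothetical readings of the proposal.

```python
# Sketch only: a transition that pends until a set of events has all arrived.
# The "a + b" notation comes from the proposal; everything else is hypothetical.

class JoinTransition:
    def __init__(self, event_expr, target):
        # "a + b" -> {"a", "b"}
        self.required = {name.strip() for name in event_expr.split("+")}
        self.target = target
        self.received = set()

    def offer(self, event, strict=False):
        """Feed one event; return the target state once every required event arrived.

        With strict=True, an event outside the required set resets the join,
        one possible reading of "as long as no other event is received".
        """
        if event in self.required:
            self.received.add(event)
        elif strict:
            self.received.clear()
            return None
        if self.received == self.required:
            self.received.clear()
            return self.target
        return None
```

With something like this in the interpreter, the use case above would need
one transition on "dialog.exit + authorization.result" and no private
state variables in the script.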
An alternative non-Petri net solution would be to designate multiple
variables as state variables -- more than one state variable could exist in
any given script. Transitions would be extended to include combinations of
state variables ("state variable X is A and state variable Y is B while
state variable Z is not C"). I don't know if that's workable or desirable,
or what mathematical model would represent such a system.
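For comparison, the multiple-state-variable alternative might look like the
following Python sketch. The tuple layout (required values, forbidden
values, target) and the function name are hypothetical; X, Y, Z and A, B, C
are the placeholders used in the text.

```python
# Sketch only: transitions guarded by several state variables at once.
# The (required, forbidden, target) layout is a hypothetical encoding.

def first_enabled(transitions, state_vars):
    """Return the target of the first transition whose guard holds, else None."""
    for required, forbidden, target in transitions:
        matches = all(state_vars.get(k) == v for k, v in required.items())
        excluded = all(state_vars.get(k) != v for k, v in forbidden.items())
        if matches and excluded:
            return target
    return None


# "state variable X is A and state variable Y is B while Z is not C"
transitions = [({"X": "A", "Y": "B"}, {"Z": "C"}, "NEXT")]
```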
I don't pretend to be an expert on Petri nets, but I've found the basic
Petri net concept useful before in speech work. I am not insisting on a
full-blown implementation of a theoretical Petri net; I just want the parts
that are relevant for real-world CCXML applications. And I should probably
point out that colored Petri nets may provide the best overall solution.
Finally, a few words on SCXML. I haven't used SCXML in an application, of
course, but as far as I can tell it does offer advantages over CCXML, in
particular for managing some forms of parallel processes. I don't know if
SCXML is intended to handle the "A + B" logical combination of events the
way Petri nets would.
Some issues left over from the state-machine model of CCXML haven't
been addressed by SCXML. For example, the SCXML "cancel" element (6.2.2)
can easily generate "error.notallowed," but because events and state
changes are asynchronous, there's no simple way to know in advance when --
in what state -- the script will receive this error.notallowed. As a result
the script will
receive error.notallowed (probably in some catch-all transition) and cannot
determine whether to discard this particular "error.notallowed" as
irrelevant or instead to panic and exit. If the cancel event could be
"colored" (or perhaps "vectored" or "tagged" is the right description) and
sent only to the necessary states or automatically discarded by some states
-- and all that handled by the interpreter instead of custom-coded into the
script -- scripts would be far easier to write. With easier error handling,
scripts would become aware of errors, and scripts would therefore probably
become more reliable as well. In other words, even with SCXML, the notion
of colors from Petri nets may provide a conceptual framework for a solution.
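One way to picture "colored" or "tagged" events is the Python sketch below.
It is only an illustration under stated assumptions: the "tag" field on the
event, the tag value "cancel:timer1", and the per-state interest table are
hypothetical and do not correspond to anything in CCXML or SCXML.

```python
# Sketch only: interpreter-side routing of "colored" (tagged) events.
# The "tag" field and the interests table are hypothetical.

def route(event, current_state, interests):
    """Decide what the interpreter does with an event in the current state.

    interests maps a state name to the set of tags that state cares about.
    """
    tag = event.get("tag")
    if tag is None:
        return "deliver"   # untagged events behave as they do today
    if tag in interests.get(current_state, set()):
        return "deliver"   # this state asked to see this color
    return "discard"       # tagged, but irrelevant to this state


interests = {"CANCELLING": {"cancel:timer1"}}
late_error = {"name": "error.notallowed", "tag": "cancel:timer1"}
```

Here route(late_error, "CANCELLING", interests) delivers the error, while
the same event arriving in any other state is dropped by the interpreter
rather than landing in a catch-all transition.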
--
Moshe Yudkowsky
Disaggregate
2952 W Fargo
Chicago, IL 60645 USA
Work: www.Disaggregate.com
Book: www.PebbleAndAvalanche.com
speech@pobox.com
+1 773 764 8727
Received on Friday, 23 June 2006 04:49:20 UTC