- From: Moshe Yudkowsky <speech@pobox.com>
- Date: Thu, 22 Jun 2006 13:20:49 -0500
- To: www-voice@w3.org
Enhanced Model for CCXML State Management

My name is Moshe Yudkowsky; among other speech-related work, I develop VoiceXML and CCXML applications.

I have a proposal to make that will improve how CCXML manages state variables. In particular, I will propose a radical -- yet incremental -- change to the conceptual model of CCXML state management.

Here's a very simple use case: a user calls into an application. The application then does two things simultaneously:

(a) the application answers the phone to say "Welcome to the Acme Call System."

(b) the application uses ANI/caller ID to check the identity of the caller and see if the caller is authorized for use of the system.

This use case isn't far-fetched or unusual; many applications require authorization, customization, etc. based on Caller ID. Similar use cases of multitasking in CCXML are easy to find. (If you'd like an example of this particular use case, see the open-source "Voice Conference Manager" at http://vcm.sourceforge.net.)

The interesting part of this use case is that the application must manage two different activities. The first activity is to answer the phone and speak the dialog to the user. The second activity is to send an HTTP request to check for authorization. Once these two separate activities complete, the rest of the application logic will proceed.

The easiest and most foolproof way to handle each of these two activities is through subprocesses. I prefer to send the ANI/Caller ID to a separate CCXML process which interacts with the database; this CCXML script sends HTTP requests, waits for the result, handles error conditions, timeouts, re-requests, etc. and sends the ultimate result back to the main CCXML script. At the same time, the announcement to the user is handled via a VoiceXML script, and when that script completes the result (event) is returned to the main CCXML script.

Here's the problem: CCXML does not provide any facility to manage these two simultaneous subprocesses.
CCXML is an event-driven state machine, and it only tracks one state at a time. However, the application has two states to track: the state of the database lookup (did it return? what result did it return?) and the state of the VoiceXML dialog (did it return?). And, of course, there is the original state, namely the state of the overall main script.

I solve this problem by using a state ("WAIT_FOR_ANSWERS") that has its own private state variables ("DB_LOOKUP_STATE", "ANNOUNCE_STATE"). Instead of using CCXML's facilities to manage my state variables, I have to build and manage my own state information within the context of CCXML: when the VoiceXML dialog finishes, I have to check to see if the authorization lookup has finished; when the authorization lookup finishes, I have to check if the VoiceXML dialog has finished. Please note that if there's a significant gap between the end of dialog and the end of authorization, I might even have to process a timeout and insert a "one moment please" dialog for the user, a further complication.

In my experience, private state management is both error prone and difficult. Most of all, it's annoying -- after all, CCXML provides perfectly good state management!

In my opinion, CCXML-based applications are not really state machines: they are actually Petri nets. Unlike the assumptions of the current CCXML model, it's not always a single event that drives the state transition; several events may need to occur to transition to the next state (such as this use case, where the application has to wait for the end of dialog and the result of authorization).

Here's a possible incremental change to the CCXML state management system. CCXML would continue to manage just one "main" state variable but would allow more complex event conditions, as seen in Petri nets. One way CCXML can incorporate Petri-net capabilities is by adding logic operators to the transition element's "event" attribute.
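
To make the idea concrete, here is a minimal sketch in Python (not CCXML, and not a proposed syntax) of a transition that pends until every event in a set has arrived. The class name JoinTransition, the target state AUTHORIZED_MENU, and the event name auth.result are invented for illustration; dialog.exit is the usual CCXML dialog-completion event.

```python
class JoinTransition:
    """Fires once every required event has been seen (a Petri-net style join)."""

    def __init__(self, required, target):
        self.required = frozenset(required)  # events that must all arrive
        self.target = target                 # state to enter once they have
        self.seen = {}                       # event name -> payload

    def offer(self, name, payload=None):
        """Record an event; return the target state once the join is complete, else None."""
        if name in self.required:
            self.seen[name] = payload
        if self.required.issubset(self.seen):
            return self.target
        return None


# Hypothetical event names and target state for the use case in this message.
join = JoinTransition({"dialog.exit", "auth.result"}, target="AUTHORIZED_MENU")
first = join.offer("auth.result", {"authorized": True})  # announcement still playing
second = join.offer("dialog.exit")                        # both events have now arrived
```

In this sketch the script never has to ask which activity finished first; the set of events seen so far plays the role of the tokens in a Petri net, and the interpreter would own it rather than the application.
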
For example, an event attribute of "a + b" would mean "pend until both event A and event B have been received." We'd also need a way to express the idea of "pend until A and B both arrive, but only as long as no other event is received for this specific state." And certainly there are other formulations ("A or B", "A and B but not C").

An alternative non-Petri net solution would be to designate multiple variables as state variables -- more than one state variable could exist in any given script. Transitions would be extended to include combinations of state variables ("state variable X is A and state variable Y is B while state variable Z is not C"). I don't know if that's workable or desirable, or what mathematical model would represent such a system.

I don't pretend to be an expert on Petri nets, but I've found the basic Petri net concept useful before in speech work. I am not insisting on a full-blown implementation of a theoretical Petri net; I just want the parts that are relevant for real-world CCXML applications. And I should probably point out that colored Petri nets may provide the best overall solution.

Finally, a few words on SCXML. I haven't used SCXML in an application, of course, but as far as I can tell it does offer advantages over CCXML, in particular for managing some forms of parallel processes. I don't know if SCXML is intended to handle the "A + B" logical combination of events the way Petri nets would.

Some issues left over from the state-machine model of CCXML haven't been addressed by SCXML. For example, the SCXML "cancel" element (6.2.2) can easily generate "error.notallowed," but because events and state changes are asynchronous there's no simple way to know in advance when -- in what state -- the script will receive this error.notallowed.
As a result, the script will receive error.notallowed (probably in some catch-all transition) and cannot determine whether to discard this particular "error.notallowed" as irrelevant or instead to panic and exit.

If the cancel event could be "colored" (or perhaps "vectored" or "tagged" is the right description) and sent only to the necessary states or automatically discarded by some states -- and all that handled by the interpreter instead of custom-coded into the script -- scripts would be far easier to write. With easier error handling, scripts would become more aware of errors, and would therefore probably become more reliable as well.

In other words, even with SCXML, the notion of colors from Petri nets may provide a conceptual framework for a solution.

--
Moshe Yudkowsky
Disaggregate
2952 W Fargo
Chicago, IL 60645 USA
Work: www.Disaggregate.com
Book: www.PebbleAndAvalanche.com
speech@pobox.com
+1 773 764 8727
Received on Friday, 23 June 2006 04:49:20 UTC