RE: Enhanced Model for CCXML State Management

From: Ken Rehor <krehor@vocalocity.net>
Date: Fri, 23 Jun 2006 10:35:50 -0400
Message-ID: <92E86BBD06161E4299A56009516472EDD7D92E@gates.vcorp.vocalocity.net>
To: <jim@larson-tech.com>, "Moshe Yudkowsky" <speech@pobox.com>
Cc: <www-voice@w3.org>

1) ANI and other data about the far end are not necessarily reliable in all cases, but they can be very useful in many circumstances. They may be used for routing or for purposes other than identifying a specific caller.

2) Checking ANI is only an example here. Moshe's basic premise applies regardless.


-----Original Message-----
From: www-voice-request@w3.org on behalf of jim@larson-tech.com
Sent: Fri 6/23/2006 9:16 AM
To: Moshe Yudkowsky
Cc: www-voice@w3.org
Subject: Re: Enhanced Model for CCXML State Management


Check out http://www.w3.org/TR/2006/WD-scxml-20060124/, our new state
chart language.  It contains a "parallel" concept that could handle your
two parallel activities.  Also, Ken Rehor and Judith Markowitz are working
on Speaker Identification and Verification (SIV), which will include a variety
of techniques for verifying who the caller is.  While I personally think
that using ANI is a dumb way to identify the caller (someone else uses
your phone and pretends to be you), the forthcoming SIV techniques could
be used within VoiceXML and/or CCXML.



> Enhanced Model for CCXML State Management
> My name is Moshe Yudkowsky; among other speech-related work, I develop
> VoiceXML and CCXML applications. I have a proposal to make that will
> improve how CCXML manages state variables. In particular, I will propose a
> radical -- yet incremental -- change to the conceptual model of CCXML
> state management.
> Here's a very simple use case: a user calls into an application. The
> application then does two things simultaneously:
> (a) the application answers the phone to say "Welcome to the Acme Call
> System."
> (b) the application uses ANI/caller ID to check the identity of the caller
> and see if the caller is authorized for use of the system.
> This use case isn't far-fetched or unusual; many applications require
> authorization, customization, etc. based on Caller ID. Similar use cases
> of multitasking in CCXML are easy to find.  (If you'd like an example of this
> particular use case, see the open-source "Voice Conference Manager" at
> http://vcm.sourceforge.net.)
> The interesting part of this use case is that the application must manage
> two different activities. The first activity is to answer the phone and
> speak the dialog to the user. The second activity is to send an HTTP
> request to check for authorization. Once these two separate activities
> complete, the rest of the application logic will proceed.
> The easiest and most foolproof way to handle each of these two activities
> is through subprocesses. I prefer to send the ANI/Caller ID to a separate
> CCXML process which interacts with the database; this CCXML script sends
> HTTP requests, waits for the result, handles error conditions, timeouts,
> re-requests, etc. and sends the ultimate result back to the main CCXML
> script. At the same time, the announcement to the user is handled via a
> VoiceXML script, and when that script completes the result (event) is
> returned to the main CCXML script.
> Here's the problem: CCXML does not provide any facility to manage these
> two simultaneous subprocesses. CCXML is an event-driven state machine,
> and it only tracks one state at a time. However, the application has two
> states to track: the state of the database lookup (did it return? what
> result did it return?) and the state of the VoiceXML dialog (did it
> return?). And, of course, there is the original state, namely the state
> of the overall main script.
> I solve this problem by using a state ("WAIT_FOR_ANSWERS") that has its
> own private state variables ("DB_LOOKUP_STATE," "ANNOUNCE_STATE").
> Instead of using CCXML's facilities to manage my state variable, I have
> to build and manage my own state information within the context of
> CCXML: When the VoiceXML dialog finishes, I have to check to see if the
> authorization lookup has finished; when the authorization lookup
> finishes, I have to check if the VoiceXML dialog has finished. Please
> note that if there's a significant gap between the end of dialog and the
> end of authorization, I might even have to process a timeout and insert
> a "one moment please" dialog for the user, a further complication.
> In my experience, private state management is both error prone and
> difficult. Most of all, it's annoying -- after all, CCXML provides
> perfectly good state management!
> In my opinion, CCXML-based applications are not really state machines:
> They are actually Petri nets. Contrary to the assumptions of the current
> CCXML model, it's not always a single event that drives the state
> transition; several events may need to occur to transition to the next
> state (such as this use case, where the application has to wait for the
> end of dialog and the result of authorization).
> Here's a possible incremental change to the CCXML state management system.
> CCXML would continue to manage just one "main" state variable but would
> allow more complex event conditions, as seen in Petri nets. One way CCXML
> can incorporate Petri-net capabilities is by adding logic operators to the
> transition element's "event" attribute. For example, "a + b" would mean
> "pend until both event A and event B have been received." We'd also need a
> way to express the idea of "pend until A and B both arrive, but only as
> long as no other event is received for this specific state." And certainly
> there are other formulations ("A or B", "A and B but not C").
> An alternative non-Petri net solution would be to designate multiple
> variables as state variables -- more than one state variable could exist
> in any given script. Transitions would be extended to include
> combinations of state variables ("state variable X is A and state
> variable Y is B while state variable Z is not C"). I don't know if that's
> workable or desirable, or what mathematical model would represent such a
> system.
> I don't pretend to be an expert on Petri nets, but I've found the basic
> Petri net concept useful before in speech work. I am not insisting on a
> full-blown implementation of a theoretical Petri net; I just want the
> parts that are relevant for real-world CCXML applications. And I should
> probably point out that colored Petri nets may provide the best overall
> solution.
> Finally, a few words on SCXML. I haven't used SCXML in an application, of
> course, but as far as I can tell it does offer advantages over CCXML, in
> particular for managing some forms of parallel processes. I don't know if
> SCXML is intended to handle the "A + B" logical combination of events the
> way Petri nets would.
> Some issues left over from the state-machine model of CCXML haven't been
> addressed by SCXML. For example, the SCXML "cancel" element (6.2.2), due
> to the asynchronous nature of events, can easily generate
> "error.notallowed," but because events and state changes are asynchronous
> there's no simple way to know in advance when -- in what state -- the
> script will receive this error.notallowed. As a result the script will
> receive error.notallowed (probably in some catch-all transition) and
> cannot determine whether to discard this particular "error.notallowed" as
> irrelevant or instead to panic and exit. If the cancel event could be
> "colored" (or perhaps "vectored" or "tagged" is the right description) and
> sent only to the necessary states or automatically discarded by some
> states -- and all that handled by the interpreter instead of custom-coded
> into the script -- scripts would be far easier to write. With easier
> error handling, scripts would become aware of errors, and scripts would
> therefore probably become more reliable as well. In other words, even
> with SCXML, the notion of colors from Petri nets may provide a conceptual
> framework for a solution.
> --
>   Moshe Yudkowsky
>   Disaggregate
>   2952 W Fargo
>   Chicago, IL 60645 USA
>   Work: www.Disaggregate.com
>   Book: www.PebbleAndAvalanche.com
>   speech@pobox.com
>   +1 773 764 8727
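
The "a + b" condition proposed in the quoted message (pend until both event A and event B have been received) can be sketched outside CCXML as a small join object. This is only an illustrative Python sketch, not CCXML or SCXML syntax. The class name EventJoin and the event name "authorization.result" are hypothetical, and "dialog.exit" is used here simply as a label for the end of the VoiceXML dialog.

```python
class EventJoin:
    """Hypothetical helper: reports completion once every required event
    name has been received at least once (the "a + b" idea)."""

    def __init__(self, required):
        self.required = frozenset(required)
        self.seen = {}  # event name -> payload of the latest occurrence

    def receive(self, name, payload=None):
        """Record one event; return True when all required events have arrived."""
        if name in self.required:
            self.seen[name] = payload
        # Events outside the required set are ignored in this sketch.
        return self.required.issubset(self.seen)


# Assumed event names, for illustration only.
join = EventJoin(["dialog.exit", "authorization.result"])
after_dialog = join.receive("dialog.exit")
after_unrelated = join.receive("connection.signal")
after_auth = join.receive("authorization.result", {"authorized": True})
```

Here the join reports completion only after the second required event arrives, regardless of arrival order, so the script no longer needs hand-written cross-checks in each event handler. The stricter variant ("but only as long as no other event is received") would need receive() to reset or fail on an unexpected event, which this sketch does not attempt.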
Received on Friday, 23 June 2006 14:40:41 UTC
