
Re: [v3] Some v3 functionality suggestions and scenarios

From: Vibhu Garg <vibhugarg@gmail.com>
Date: Thu, 3 Aug 2006 11:00:57 +0530
Message-ID: <5027501c0608022230s7e225c9cv9abe7a9a82538fbd@mail.gmail.com>
To: "Shane Smith" <safarishane@gmail.com>
Cc: "Skip Cave" <Skip.Cave@intervoice.com>, www-voice@w3.org
Does the goto tag in VXML allow a transition to an item of another form in
the same document using the nextitem attribute?

Thanks in advance for a speedy response.

Regards
Vibhu


On 8/3/06, Shane Smith <safarishane@gmail.com> wrote:
>
> Skip,
>
>
> >>1- Grammars that stop the dialog thread, and return a semantic tag to
> >>affect the dialog flow
> done and done
>
>
> >>2- Grammars that do NOT affect the dialog flow at all, but produce
> >>asynchronous events to be handled by CCXML/scXML
> Using marktime, this could be accomplished by setting marktime upon an
> utterance, performing actions on the client side, and then jumping back into
> your prompt using your marktime as a reference.  With bargeintype set to
> hotword, I imagine this would be seamless to the caller.
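
The marktime approach described above can be sketched on the server side as follows. This is only an illustration, assuming the browser reports the name of the last mark reached and the milliseconds elapsed since that mark; `MarkResult`, `resume_offset_ms`, and the table of mark offsets are hypothetical names, not part of any VXML API.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class MarkResult:
    """What the browser is assumed to report when the caller speaks."""
    markname: str      # name of the last mark reached before the utterance
    marktime_ms: int   # milliseconds elapsed since that mark was reached


def resume_offset_ms(mark_offsets: Dict[str, int], result: MarkResult,
                     rewind_ms: int = 0) -> int:
    """Return the audio position at which playback should pick up again.

    mark_offsets maps each mark name to its position in the audio (ms).
    rewind_ms optionally backs up a little so the caller hears some context.
    """
    base = mark_offsets[result.markname]
    return max(0, base + result.marktime_ms - rewind_ms)


# Hypothetical prompt with two marks; the caller spoke 4.5 s after "part2".
marks = {"start": 0, "part2": 30_000}
print(resume_offset_ms(marks, MarkResult("part2", 4_500), rewind_ms=1_000))
```

The server would use the returned offset when generating the next page, so the prompt appears to continue from where the caller interrupted it.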
>
>
> >>3- Grammars that don't return semantic tags, but instead affect local
> >>parameters such as playback speed, loudness, audio file position, etc.
> Same, using marktime, though my guess would be a round trip to the
> server.  I can really see using marktime becoming ugly if we were to request
> audio volume changes and needed to handle that on the server for the
> upcoming http fetch of the audio file.  Possible, but ugly.
>
> If these changes are implemented in 3, from an IVR perspective I would
> still want to potentially provide an audio cue that the grammar was
> accepted and action taken.  Conversely, we would also potentially need an
> earcon to let the caller know they nomatched on their last spoken
> utterance.  Both of these audio cues would need to be played on top of the
> current audio stream playback, assuming these work similar to the
> bargeintype=hotword support today.  Does v3 support combining audio
> streams?  Would we be able to do this without stopping the stream playback
> as you suggest?  Otherwise, I'd end up using marktime to implement client
> side browser functionality on the server to work around those limitations v3
> is supposed to address.
>
> > As far as I can tell, there is no way for CCXML to gracefully stop a
> > running VXML script without killing the browser, let alone suspend it,
> > with the resume state context saved automatically. And of course, there
> > is no current way for CCXML to tell a VXML browser to resume a certain
> > state after it has been suspended.
> I see your point.  It could be argued that this functionality belongs in
> the application scope, simply causing the next fetch to spit out vxml that
> would make it seem as if we picked up right where we left off.  That leaves
> out client side events though, with ccxml trying to tell vxml it's time to
> pause.
>
> Cool, good info, thanks...
> -Shane Smith
>
>
>
> On 8/2/06, Skip Cave <Skip.Cave@intervoice.com> wrote:
> >
> >
> >
> >
> > Shane,
> >
> >
> >
> > My comments are interspersed with yours.
> >
> >
> >
> >  ________________________________
> >
> >
> > From: Shane Smith
> >
> > Sent: Wednesday, August 02, 2006 3:22 PM
> >  To: Skip Cave
> >  Subject: Re: [v3] Some v3 functionality suggestions and scenarios
> >
> >
> >
> >
> > Hello Skip,
> >
> >  Interesting read... had a couple of clarifications if you don't mind.
> > Are there any scenarios you envision that couldn't be handled with CCXML?
> >
> >
> > [SC] As far as I can tell, NONE of my scenarios could be implemented in
> > CCXML, though it is more a problem with VXML than CCXML. Take the
> > scenario where the VXML script is playing a long voicemail message & an
> > external asynchronous event occurs (presumably detected by CCXML or
> > scXML). (The external event could be a task completion, an inbound call,
> > a stock-sell threshold reached, whatever.) There currently isn't any way
> > for CCXML to suspend the current active VXML script, save the VXML
> > script context, and pause the voicemail play, to make way for the user
> > to handle the new event. We need a way for CCXML to suspend and resume
> > VXML scripts without losing context.
> >
> > Assuming the active VXML script could be suspended, then the application
> > needs to let the user deal with the issue - acknowledge the task
> > completion, handle the call, interact with a different VXML script to
> > deal with the stock sale, etc. This means that the CCXML/scXML process
> > may need to start up a second VXML script to let the user deal with the
> > asynchronous concurrent task, leaving the original script suspended on
> > the context stack. After dealing with the issue, we want to have CCXML
> > tell the VXML browser to pop the context stack and resume where it left
> > off originally, playing the long voicemail message in the voicemail VXML
> > script.
> >
> > As far as I can tell, there is no way for CCXML to gracefully stop a
> > running VXML script without killing the browser, let alone suspend it,
> > with the resume state context saved automatically. And of course, there
> > is no current way for CCXML to tell a VXML browser to resume a certain
> > state after it has been suspended.
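
The suspend/resume behaviour asked for above can be pictured as a small context stack. This is a minimal sketch of the requested semantics only, not of anything CCXML or a VXML browser provides today; `DialogContext`, `BrowserSession`, and the script names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DialogContext:
    script: str                 # name or URI of the VXML script (illustrative)
    position_ms: int = 0        # playback position when the dialog was suspended
    variables: Dict[str, str] = field(default_factory=dict)


class BrowserSession:
    """Toy model of a browser that can suspend and resume dialogs."""

    def __init__(self) -> None:
        self.active: Optional[DialogContext] = None
        self._suspended: List[DialogContext] = []

    def start(self, script: str) -> None:
        if self.active is not None:
            raise RuntimeError("suspend or finish the active dialog first")
        self.active = DialogContext(script)

    def suspend(self, position_ms: int) -> None:
        """Save the active dialog's context on the stack instead of killing it."""
        if self.active is None:
            raise RuntimeError("no active dialog to suspend")
        self.active.position_ms = position_ms
        self._suspended.append(self.active)
        self.active = None

    def finish(self) -> None:
        """End the active dialog without saving its context."""
        self.active = None

    def resume(self) -> DialogContext:
        """Pop the most recently suspended dialog and make it active again."""
        if not self._suspended:
            raise RuntimeError("no suspended dialog to resume")
        self.active = self._suspended.pop()
        return self.active


session = BrowserSession()
session.start("voicemail.vxml")
session.suspend(position_ms=42_000)   # external event arrives mid-message
session.start("stock-sale.vxml")      # second script handles the event
session.finish()
print(session.resume())               # back to the voicemail at 42 s
```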
> >
> > Another limitation of current VXML is that a user cannot spawn events
> > or commands during a play or recognize dialog state without killing the
> > ongoing dialog. For example - as before, a user is listening to his long
> > voicemail message. In the middle of the message from Joe, the user
> > decides he wants to call Joe (or send Joe an email, etc.). The user says
> > "Call Joe" or "Email Joe to call me", or some other command, and
> > continues listening to Joe's message. The system should take the command
> > "Call Joe", spawn a concurrent process to call Joe or send him an email,
> > but keep on playing Joe's voicemail message without stopping. This
> > scheme is impossible in VXML today. Again it's not CCXML's problem, it's
> > VXML's problem.
> >
> > A similar issue is when the user is listening to the long voicemail from
> > Joe, and he says commands like "back up 10 seconds", "skip to the last
> > 20 seconds", "louder", "play faster", or "slow down". All of these
> > commands should affect the playback of the voicemail message, but not
> > stop the playback. Currently, VXML doesn't do this. As a general rule
> > there need to be three different types of grammars in VXML:
> >
> > 1- Grammars that stop the dialog thread, and return a semantic tag to
> > affect the dialog flow
> >
> > 2- Grammars that do NOT affect the dialog flow at all, but produce
> > asynchronous events to be handled by CCXML/scXML
> >
> > 3- Grammars that don't return semantic tags, but instead affect local
> > parameters such as playback speed, loudness, audio file position, etc.
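
The three grammar types listed above can be sketched as a dispatcher that treats each recognized phrase differently. This is a toy model of the proposed behaviour, not of any existing VXML feature; `GrammarKind`, `GRAMMARS`, `Playback`, and the phrases and event names are all made up for illustration.

```python
from enum import Enum, auto


class GrammarKind(Enum):
    DIALOG = auto()   # type 1: stops the dialog and returns a semantic tag
    EVENT = auto()    # type 2: raises an asynchronous event, dialog continues
    LOCAL = auto()    # type 3: adjusts a local playback parameter


# Hypothetical phrase table: phrase -> (kind, payload)
GRAMMARS = {
    "delete message": (GrammarKind.DIALOG, "delete"),
    "call joe": (GrammarKind.EVENT, "call.request"),
    "louder": (GrammarKind.LOCAL, ("volume", 1)),
    "play faster": (GrammarKind.LOCAL, ("speed", 1)),
}


class Playback:
    def __init__(self):
        self.playing = True
        self.params = {"volume": 5, "speed": 5}
        self.events = []          # events queued for the CCXML/scXML layer
        self.semantic_tag = None  # set only when a type 1 grammar matches

    def on_utterance(self, text):
        """Handle one recognized phrase; return its kind, or None on nomatch."""
        match = GRAMMARS.get(text.lower())
        if match is None:
            return None
        kind, payload = match
        if kind is GrammarKind.DIALOG:
            self.playing = False          # only this kind stops playback
            self.semantic_tag = payload
        elif kind is GrammarKind.EVENT:
            self.events.append(payload)   # playback keeps going
        else:
            name, delta = payload
            self.params[name] += delta    # playback keeps going
        return kind


player = Playback()
player.on_utterance("louder")
player.on_utterance("call joe")
print(player.playing, player.params, player.events)
```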
> >
> > Now all of this is really a limitation of VXML, not CCXML, which is why
> > my message title was prefaced [v3] and not [CCXML].
> >
> > Looking at the VXML 3.0 spec at http://www.w3.org/Voice/Group/2005/V3/,
> > it is clear that it is planning to have more asynchronous capabilities
> > than VXML 2.1.
> >
> >
> > Under section 1.2.2.3 of the VXML 3 spec it says:
> >
> > More advanced interaction with the presentation is possible in the
> > (VXML 3) DFP framework than is currently permitted with VoiceXML 2.0/2.1.
> > Consequently, VoiceXML 3.0 may be enhanced with capabilities such as:
> >
> > - VoiceXML dialogs are cancellable
> > - VoiceXML dialogs can receive events from the flow layer during
> >   execution. These events are exposed in the presentation markup.
> > - VoiceXML dialogs can send events to the flow layer during execution.
> >   These events are specified in the presentation markup.
> >
> > This is a good start, but the suspend scenario I described is not
> > covered in the statement of new capabilities. One thing missing from
> > this is the capability to save the dialog state, and return to it later.
> > There needs to be a "suspend/resume" command besides the standard "start
> > dialog" command from CCXML.  Hopefully this functionality will get added
> > as the spec matures.
> >
> > Second, 3.0 needs the capability to accept user commands (touch tone,
> > voice, pen, whatever) during a play or recognize state, without stopping
> > the play or recognize state. These asynchronous commands should be able
> > to send events to CCXML/scXML without affecting the dialog thread. Or,
> > the command could affect how the current media is being handled, or have
> > other local effects such as "record the remainder of this call" or "mute
> > Joe on this conference".
> >
> >
> > As an ivr designer, I've used vxml primarily to drive the call, using it
> > as simply as possible, just like any other protocol.  I never really
> > 'write' vxml apps, I write web apps that shoot out vxml instead of html.
> > The first cardinal sin on any application under my direction is the
> > introduction of client side logic.  Though I've been working this way
> > for years, I've seen a tendency at several client sites to try and write
> > a client side application, instead of handling all logic on the server
> > side.  Time to implement, debug, maintain, and test are all shorter when
> > using existing web application test suites.  (current project uses
> > canoo, ugly but works)  Every bit of logic can be functionally tested
> > separate from the vui (kinda like mvc) and only when everything works do
> > we pick up the phone for a real test call.  I would hate to see the vxml
> > spec evolve to where it required more logic on the client vxml browser
> > than is necessary.  All the logic gates available today in vxml are
> > generally shunned.
> >
> >
> > [SC] I agree totally with you. Server side is the way to go. However,
> > with VXML, today's CCXML server doesn't have enough control over the
> > script execution. VXML 3.0 should try and fix these issues. It's not a
> > problem with CCXML.
> >
> >
> >  If you're familiar with osd/osdm or apache rdc's, what we do is similar
> > but with all event handling done by java, and nothing but a simple
> > javascript function to encode all data to be passed back to the server
> > in a single variable.  Is vxml3 still going to be accommodating to
> > develop in this fashion?
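
The single-variable idea above can be illustrated with a pair of encode/decode helpers. The thread does not say which encoding the author's javascript function uses; URL encoding is assumed here purely for illustration, and `encode_fields` and `decode_fields` are hypothetical names. Python is used for both sides so the round trip can be shown in one place.

```python
from typing import Dict
from urllib.parse import parse_qsl, urlencode


def encode_fields(fields: Dict[str, str]) -> str:
    """Pack every field into one URL-encoded string (the single variable)."""
    return urlencode(fields)


def decode_fields(packed: str) -> Dict[str, str]:
    """Server-side counterpart: unpack the single variable into fields."""
    return dict(parse_qsl(packed, keep_blank_values=True))


# Hypothetical data collected during one dialog turn.
turn = {"event": "nomatch", "field": "pin", "utterance": "one two"}
packed = encode_fields(turn)
print(packed)
print(decode_fields(packed) == turn)
```

Keeping the client side down to one packing step leaves all interpretation of the data on the server, which is the division of labour the message argues for.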
> >
> >
> >
> >
> > [SC] Something like what you suggest is feasible, but keep in mind that
> > asynchronous events will be happening on both the server side and on the
> > client side, at any time. Both entities (server & client) must be able
> > to handle these events. Whatever mechanism is finally used must
> > efficiently deal with this fact.
> >
> > Regards,
> >  Shane Smith
> >
> >
> >
>
Received on Thursday, 3 August 2006 05:34:01 GMT
