Re: [v3] Some v3 functionality suggestions and scenarios from Vibhu Garg on 2006-08-03 (www-voice@w3.org from July to September 2006)

From: Vibhu Garg <vibhugarg@gmail.com>
Date: Thu, 3 Aug 2006 11:00:57 +0530
To: "Shane Smith" <safarishane@gmail.com>
Cc: "Skip Cave" <Skip.Cave@intervoice.com>, www-voice@w3.org
Message-ID: <5027501c0608022230s7e225c9cv9abe7a9a82538fbd@mail.gmail.com>
Does the goto tag in VXML allow a transition to an item of another form in
the same document using the nextitem attribute?

Thanks in advance for a speedy response.

Regards
Vibhu


On 8/3/06, Shane Smith <safarishane@gmail.com> wrote:
>
> Skip,
>
>
> >>1- Grammars that stop the dialog thread, and return a semantic tag to
> >>affect the dialog flow
> done and done
>
>
> >>2- Grammars that do NOT affect the dialog flow at all, but produce
> >>asynchronous events to be handled by CCXML/scXML
> Using marktime, this could be accomplished by setting marktime upon an
> utterance, performing actions on the client side, and then jumping back into
> your prompt using your marktime as a reference.  With bargeintype set to
> hotword, I imagine this would be seamless to the caller.
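A minimal sketch of the marktime idea above, assuming the server knows where each named mark sits in the audio. The `mark_offsets` table, the function name, and the `rewind_ms` parameter are hypothetical. Only `markname` and `marktime` (milliseconds elapsed since the last mark) come from VoiceXML 2.1.

```python
from typing import Mapping


def resume_offset_ms(mark_offsets: Mapping[str, int], markname: str,
                     marktime_ms: int, rewind_ms: int = 0) -> int:
    """Return the position (ms from prompt start) at which to resume playback.

    mark_offsets maps each mark name to its offset in the audio (assumed to
    be known on the server); marktime_ms is the time elapsed since that mark.
    """
    if markname not in mark_offsets:
        raise KeyError(f"unknown mark: {markname!r}")
    position = mark_offsets[markname] + marktime_ms
    # Optionally back up a little so the caller hears some context again.
    return max(0, position - rewind_ms)


# Hypothetical marks placed every 10 seconds in a voicemail prompt.
marks = {"m0": 0, "m1": 10_000, "m2": 20_000}
print(resume_offset_ms(marks, "m1", 3_500))       # 13500
print(resume_offset_ms(marks, "m0", 400, 1_000))  # 0
```

The next fetch could then serve audio starting from the computed offset, which is the round trip to the server that this approach implies.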
>
>
> >>3- Grammars that don't return semantic tags, but instead affect local
> >>parameters such as playback speed, loudness, audio file position, etc.
> Same, using marktime, though my guess would be a round trip to the
> server.  I can really see using marktime becoming ugly if we were to request
> audio volume changes and needed to handle that on the server for the
> upcoming http fetch of the audio file.  Possible, but ugly.
>
> If these changes are implemented in v3, from an IVR perspective I would
> still want to potentially provide an audio cue that the grammar was
> accepted and action taken.  Conversely, we would also potentially need an
> earcon to let the caller know they nomatched on their last spoken
> utterance.  Both of these audio cues would need to be played on top of the
> current audio stream playback, assuming these work similar to the
> bargeintype=hotword support today.  Does v3 support combining audio
> streams?  Would we be able to do this without stopping the stream playback
> as you suggest?  Otherwise, I'd end up using marktime to implement client
> side browser functionality on the server to work around those limitations v3
> is supposed to address.
>
> > As far as I can tell, there is no way for CCXML to gracefully stop a running
> > VXML script without killing the browser, let alone suspend it, with the
> > resume state context saved automatically. And of course, there is no current
> > way for CCXML to tell a VXML browser to resume a certain state after it has
> > been suspended.
> I see your point.  It could be argued that this functionality belongs in
> the application scope, simply causing the next fetch to spit out vxml that
> would make it seem as if we picked up right where we left off.  That leaves
> out client side events though, with ccxml trying to tell vxml it's time to
> pause.
>
> Cool, good info, thanks...
> -Shane Smith
>
>
>
> On 8/2/06, Skip Cave <Skip.Cave@intervoice.com> wrote:
> >
> > Shane,
> >
> > My comments are interspersed with yours.
> >
> >  ________________________________
> >
> > From: Shane Smith
> > Sent: Wednesday, August 02, 2006 3:22 PM
> > To: Skip Cave
> > Subject: Re: [v3] Some v3 functionality suggestions and scenarios
> >
> > Hello Skip,
> >
> >  Interesting read... had a couple of clarifications if you don't mind.  Are
> > there any scenarios you envision that couldn't be handled with CCXML?
> >
> >
> > [SC] As far as I can tell, NONE of my scenarios could be implemented in
> > CCXML, though it is more a problem with VXML than CCXML. Take the scenario
> > where the VXML script is playing a long voicemail message & an external
> > asynchronous event occurs (presumably detected by CCXML or scXML). (The
> > external event could be a task completion, an inbound call, a stock-sell
> > threshold reached, whatever.) There currently isn't any way for CCXML to
> > suspend the current active VXML script, save the VXML script context, and
> > pause the voicemail play, to make way for the user to handle the new event.
> > We need a way for CCXML to suspend and resume VXML scripts without losing
> > context.
> >
> > Assuming the active VXML script could be suspended, then the application
> > needs to let the user deal with the issue - acknowledge the task completion,
> > handle the call, interact with a different VXML script to deal with the
> > stock sale, etc. This means that the CCXML/scXML process may need to start
> > up a second VXML script to let the user deal with the asynchronous
> > concurrent task, leaving the original script suspended on the context stack.
> > After dealing with the issue, we want to have CCXML tell the VXML browser to
> > resume back where it left off, popping the context stack, and continuing
> > where it left off originally, playing the long voicemail message in the
> > voicemail VXML script.
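The suspend/resume behaviour described above can be sketched as a small context stack. This is only an illustration of the idea, not anything CCXML or VXML provides today. The class names, fields, and script file names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DialogContext:
    """Saved state of one VXML script (all field names are hypothetical)."""
    script: str
    position_ms: int = 0
    variables: Dict[str, str] = field(default_factory=dict)


class DialogStack:
    """Tracks the active dialog plus a stack of suspended ones."""

    def __init__(self) -> None:
        self.active: Optional[DialogContext] = None
        self._suspended: List[DialogContext] = []

    def start(self, script: str) -> None:
        """Start a dialog, suspending whatever is currently active."""
        if self.active is not None:
            self._suspended.append(self.active)
        self.active = DialogContext(script)

    def finish(self) -> Optional[DialogContext]:
        """End the active dialog and resume the most recently suspended one."""
        self.active = self._suspended.pop() if self._suspended else None
        return self.active


stack = DialogStack()
stack.start("voicemail.vxml")
stack.active.position_ms = 42_000   # 42 s into the long message
stack.start("stock_alert.vxml")     # asynchronous event arrives
resumed = stack.finish()            # user has dealt with the alert
print(resumed.script, resumed.position_ms)  # voicemail.vxml 42000
```

The point of the sketch is that the suspended context keeps its playback position, so resuming needs no help from the application server.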
> >
> > As far as I can tell, there is no way for CCXML to gracefully stop a running
> > VXML script without killing the browser, let alone suspend it, with the
> > resume state context saved automatically. And of course, there is no current
> > way for CCXML to tell a VXML browser to resume a certain state after it has
> > been suspended.
> >
> > Another limitation of current VXML is the lack of any capability to let a
> > user spawn events or commands during a play or recognize dialog state
> > without killing the ongoing dialog. For example - as before, a user is
> > listening to his long voicemail message. In the middle of the message from
> > Joe, the user decides he wants to call Joe (or send Joe an email, etc.). The
> > user says "Call Joe" or "Email Joe to call me", or some other command, and
> > continues listening to Joe's message. The system should take the command
> > "Call Joe", spawn a concurrent process to call Joe or send him an email, but
> > keep on playing Joe's voicemail message without stopping. This scheme is
> > impossible in VXML today. Again, it's not CCXML's problem, it's VXML's
> > problem.
> >
> > A similar issue is when the user is listening to the long voicemail from
> > Joe, and he says commands like "back up 10 seconds", "skip to the last 20
> > seconds", "louder", "play faster", or "slow down". All of these commands
> > should affect the playback of the voicemail message, but not stop the
> > playback. Currently, VXML doesn't do this. As a general rule there need to
> > be three different types of grammars in VXML:
> >
> > 1- Grammars that stop the dialog thread, and return a semantic tag to
> > affect the dialog flow
> >
> > 2- Grammars that do NOT affect the dialog flow at all, but produce
> > asynchronous events to be handled by CCXML/scXML
> >
> > 3- Grammars that don't return semantic tags, but instead affect local
> > parameters such as playback speed, loudness, audio file position, etc.
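As a rough illustration of the three grammar types listed above, the sketch below dispatches a recognized utterance according to its type. Every name here (`GrammarKind`, `Player`, `handle`) is hypothetical, and the 1.25 speed step is an arbitrary choice. None of it is part of any VXML or CCXML spec.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional


class GrammarKind(Enum):
    DIALOG = auto()  # type 1: stops the dialog, returns a semantic tag
    EVENT = auto()   # type 2: raises an asynchronous event, dialog continues
    LOCAL = auto()   # type 3: adjusts local playback parameters only


@dataclass
class Player:
    playing: bool = True
    speed: float = 1.0
    position_s: float = 0.0


def handle(kind: GrammarKind, utterance: str, player: Player,
           events: List[str]) -> Optional[str]:
    """Apply one recognized utterance; return a semantic tag for type 1 only."""
    if kind is GrammarKind.DIALOG:
        player.playing = False       # playback stops, dialog flow changes
        return utterance
    if kind is GrammarKind.EVENT:
        events.append(utterance)     # handed off to CCXML/scXML; keep playing
        return None
    # LOCAL: change playback parameters without stopping playback.
    if utterance == "play faster":
        player.speed *= 1.25
    elif utterance == "slow down":
        player.speed /= 1.25
    elif utterance == "back up 10 seconds":
        player.position_s = max(0.0, player.position_s - 10.0)
    return None


player, events = Player(position_s=25.0), []
handle(GrammarKind.EVENT, "call joe", player, events)
handle(GrammarKind.LOCAL, "back up 10 seconds", player, events)
print(player.playing, player.position_s, events)  # True 15.0 ['call joe']
```

Only the first type interrupts playback. The other two leave `player.playing` untouched, which is the behaviour requested above.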
> >
> > Now all of this is really a limitation on VXML, not CCXML, which is why my
> > message title was prefaced [v3] and not [CCXML].
> >
> > Looking at the VXML 3.0 spec at http://www.w3.org/Voice/Group/2005/V3/, it
> > is clear that it is planning to have more asynchronous capabilities than
> > VXML 2.1.
> >
> > Under section 1.2.2.3 of the VXML 3 spec it says:
> >
> > More advanced interaction with the presentation is possible in the (VXML 3)
> > DFP framework than is currently permitted with VoiceXML 2.0/2.1.
> > Consequently, VoiceXML 3.0 may be enhanced with capabilities such as:
> >
> > * VoiceXML dialogs are cancellable
> > * VoiceXML dialogs can receive events from the flow layer during execution.
> >   These events are exposed in the presentation markup.
> > * VoiceXML dialogs can send events to the flow layer during execution. These
> >   events are specified in the presentation markup.
> >
> > This is a good start, but the suspend scenario I described is not covered in
> > the statement of new capabilities. One thing missing from this is the
> > capability to save the dialog state, and return back later. There needs to
> > be a "suspend/resume" command besides the standard "start dialog" command
> > from CCXML.  Hopefully this functionality will get added as the spec
> > matures.
> >
> > Second, 3.0 needs the capability to accept user commands (touch tone, voice,
> > pen, whatever) during a play or recognize state, without stopping the play
> > or recognize state. These asynchronous commands should be able to send
> > events to CCXML/scXML without affecting the dialog thread. Or, the command
> > could affect how the current media is being handled, or other local effects
> > such as "record the remainder of this call" or "mute Joe on this conference".
> >
> >
> > As an ivr designer, I've used vxml primarily to drive the call, using it as
> > simply as possible, just like any other protocol.  I never really 'write'
> > vxml apps, I write web apps that shoot out vxml instead of html.  First
> > cardinal sin on any application under my direction is the introduction of
> > client side logic.  Though I've been working this way for years, I've seen a
> > tendency at several client sites to try and write a client side application,
> > instead of handling all logic on the server side.  Time to implement, debug,
> > maintain, and test are all shorter when using existing web application test
> > suites.  (current project uses canoo, ugly but works)  Every bit of logic
> > can be functionally tested separate from the vui (kinda like mvc) and only
> > when everything works do we pick up the phone for a real test call.  I would
> > hate to see the vxml spec evolve to where it required more logic on the
> > client vxml browser than is necessary.  All the logic gates available today
> > in vxml are generally shunned.
> >
> >
> > [SC] I agree totally with you. Server side is the way to go. However, with
> > VXML, today's CCXML server doesn't have enough control over the script
> > execution. VXML 3.0 should try and fix these issues. It's not a problem
> > with CCXML.
> >
> >
> >  If you're familiar with osd/osdm or apache rdc's, what we do is similar but
> > with all event handling done by java, and nothing but a simple javascript
> > function to encode all data to be passed back to the server in a single
> > variable.  Is vxml3 still going to be accommodating to develop in this
> > fashion?
> >
> >
> >
> >
> > [SC] Something like what you suggest is feasible, but keep in mind that
> > asynchronous events will be happening on both the server side and on the
> > client side, at any time. Both entities (server & client) must be able to
> > handle these events. Whatever mechanism is finally used must deal with this
> > fact efficiently.
> >
> > Regards,
> >  Shane Smith
> >
>
Received on Thursday, 3 August 2006 05:34:01 UTC