RE: [v3] Some v3 functionality suggestions and scenarios from McGlashan, Scott on 2006-08-08 (www-voice@w3.org from July to September 2006)

From: McGlashan, Scott <scott.mcglashan@hp.com>
Date: Tue, 8 Aug 2006 16:04:04 +0200
To: "Vibhu Garg" <vibhugarg@gmail.com>, "Shane Smith" <safarishane@gmail.com>
Cc: "Skip Cave" <Skip.Cave@intervoice.com>, <www-voice@w3.org>
Message-ID: <990C7D7B4096DC49B7BDB5D4F56E5B9F02F64A55@sooexc02.emea.cpqcorp.net>
No, only the next item in the current form.
 

5.3.7 GOTO 

The name of the next form item to visit in the current form. 
 
 
Scott
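
A minimal VoiceXML 2.0 sketch of the distinction, with hypothetical names throughout (the forms main and other, the items city, confirm, dispatch, intro and details, the variable target, and the grammar file city.grxml). nextitem only reaches items in the current form; next can reach another form in the same document, but not a particular item inside it. One possible workaround, offered as an assumption rather than a recommended pattern, is to remember the wanted item in a document-scoped variable and let the first block of the target form jump to it with expritem.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Hypothetical document-scoped variable naming the item to enter in "other" -->
  <var name="target"/>

  <form id="main">
    <field name="city">
      <prompt>Which city?</prompt>
      <!-- Hypothetical external grammar -->
      <grammar src="city.grxml" type="application/srgs+xml"/>
      <filled>
        <!-- Allowed: nextitem names a form item in the current form -->
        <goto nextitem="confirm"/>
      </filled>
    </field>
    <block name="confirm">
      <prompt>You said <value expr="city"/>.</prompt>
      <!-- To reach an item in another form: remember it, then go to the form -->
      <assign name="target" expr="'details'"/>
      <goto next="#other"/>
    </block>
  </form>

  <form id="other">
    <block name="dispatch">
      <!-- First item of the form: jump to the remembered item, if any -->
      <if cond="target != undefined">
        <goto expritem="target"/>
      </if>
    </block>
    <!-- Guarded so it is not visited after a jump straight to "details" -->
    <block name="intro" cond="target == undefined">
      <prompt>Welcome to the second form.</prompt>
    </block>
    <block name="details">
      <prompt>This is the details item.</prompt>
    </block>
  </form>
</vxml>
```

Without the cond guard on intro, the form interpretation algorithm would return to it after details completes, since it is still an unvisited item of the form.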

 
________________________________

From: www-voice-request@w3.org [mailto:www-voice-request@w3.org] On
Behalf Of Vibhu Garg
Sent: Thursday, August 03, 2006 07:31
To: Shane Smith
Cc: Skip Cave; www-voice@w3.org
Subject: Re: [v3] Some v3 functionality suggestions and scenarios


Does the goto tag in VXML allow a transition to an item of another form
in the same document using the nextitem attribute?
 
Thanks in advance for a speedy response.
 
Regards
Vibhu

 
On 8/3/06, Shane Smith <safarishane@gmail.com> wrote: 

	Skip,
	
	
	>>1- Grammars that stop the dialog thread, and return a semantic
	>>tag to affect the dialog flow
	
	done and done
	
	
	>>2- Grammars that do NOT affect the dialog flow at all, but
	>>produce asynchronous events to be handled by CCXML/scXML
	
	Using marktime, this could be accomplished by setting marktime upon
	an utterance, performing actions on the client side, and then jumping
	back into your prompt using your marktime as a reference.  With
	bargeintype set to hotword, I imagine this would be seamless to the
	caller.
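
A minimal sketch of the marktime idea above, assuming a VoiceXML 2.1 browser (where a mark in the prompt populates markname and marktime on barge-in), hotword barge-in, and hypothetical URLs, command words and variable names. It only shows capturing the offset and handing it to the server; how the server resumes playback from that offset is left open, since nothing in this thread specifies it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="playback">
    <!-- Hotword barge-in: the prompt keeps playing until a grammar matches -->
    <property name="bargeintype" value="hotword"/>
    <field name="cmd">
      <!-- Hypothetical command words -->
      <grammar version="1.0" root="command" mode="voice" xml:lang="en-US">
        <rule id="command">
          <one-of>
            <item>louder</item>
            <item>slower</item>
          </one-of>
        </rule>
      </grammar>
      <prompt>
        <mark name="msg_start"/>
        <!-- Hypothetical audio URL -->
        <audio src="http://example.com/voicemail/msg1.wav"/>
      </prompt>
      <filled>
        <!-- Milliseconds elapsed since the last mark when the caller barged in -->
        <var name="offset" expr="application.lastresult$.marktime"/>
        <!-- Hypothetical endpoint; the server decides how to resume from offset -->
        <submit next="http://example.com/voicemail/command" namelist="cmd offset"/>
      </filled>
    </field>
  </form>
</vxml>
```

The submit is the server round trip mentioned in this thread: each accepted command ends the current prompt and fetches a new document.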
	
	
	>>3- Grammars that don't return semantic tags, but instead
	>>affect local parameters such as playback speed, loudness, audio file
	>>position, etc.
	
	Same approach, using marktime, though my guess is that it would need
	a round trip to the server.  I can really see using marktime becoming
	ugly if we were to request audio volume changes and needed to handle
	that on the server for the upcoming http fetch of the audio file.
	Possible, but ugly.
	
	If these changes are implemented in v3, from an IVR perspective I
	would still want to potentially provide an audio cue that the grammar
	was accepted and action taken.  Conversely, we would also potentially
	need an earcon to let the caller know they nomatched on their last
	spoken utterance.  Both of these audio cues would need to be played on
	top of the current audio stream playback, assuming these work similarly
	to the bargeintype=hotword support today.  Does v3 support combining
	audio streams?  Would we be able to do this without stopping the stream
	playback as you suggest?  Otherwise, I'd end up using marktime to
	implement client side browser functionality on the server to work
	around those limitations v3 is supposed to address.
	
	
	> As far as I can tell, there is no way for CCXML to gracefully stop a
	> running VXML script without killing the browser, let alone suspend
	> it, with the resume state context saved automatically. And of course,
	> there is no current way for CCXML to tell a VXML browser to resume a
	> certain state after it has been suspended.
	
	I see your point.  It could be argued that this functionality
	belongs in the application scope, simply causing the next fetch to spit
	out vxml that would make it seem as if we picked up right where we left
	off.  That leaves out client side events though, with ccxml trying to
	tell vxml it's time to pause.
	
	Cool, good info, thanks...
	
	-Shane Smith
	
	
	
	On 8/2/06, Skip Cave <Skip.Cave@intervoice.com> wrote:
	>  
	>  
	>  
	> 
	> Shane, 
	> 
	>   
	> 
	> My comments are interspersed with yours. 
	> 
	>   
	>  
	>  ________________________________ 
	>  
	> 
	> From: Shane Smith 
	> 
	> Sent: Wednesday, August 02, 2006 3:22 PM 
	>  To: Skip Cave
	>  Subject: Re: [v3] Some v3 functionality suggestions and scenarios
	>  
	> 
	>   
	> 
	> Hello Skip, 
	>  
	>  Interesting read... had a couple of clarifications if you don't
	> mind.  Are there any scenarios you envision that couldn't be handled
	> with CCXML?
	>   
	> 
	> [SC] As far as I can tell, NONE of my scenarios could be implemented
	> in CCXML, though it is more a problem with VXML than CCXML. Take the
	> scenario where the VXML script is playing a long voicemail message and
	> an external asynchronous event occurs (presumably detected by CCXML or
	> scXML). (The external event could be a task completion, an inbound
	> call, a stock-sell threshold reached, whatever.) There currently isn't
	> any way for CCXML to suspend the current active VXML script, save the
	> VXML script context, and pause the voicemail play, to make way for the
	> user to handle the new event. We need a way for CCXML to suspend and
	> resume VXML scripts without losing context.
	> 
	> Assuming the active VXML script could be suspended, the application
	> then needs to let the user deal with the issue: acknowledge the task
	> completion, handle the call, interact with a different VXML script to
	> deal with the stock sale, etc. This means that the CCXML/scXML process
	> may need to start up a second VXML script to let the user deal with
	> the asynchronous concurrent task, leaving the original script
	> suspended on the context stack. After dealing with the issue, we want
	> CCXML to tell the VXML browser to pop the context stack and resume
	> where it originally left off, playing the long voicemail message in
	> the voicemail VXML script.
	> 
	> As far as I can tell, there is no way for CCXML to gracefully stop a
	> running VXML script without killing the browser, let alone suspend
	> it, with the resume state context saved automatically. And of course,
	> there is no current way for CCXML to tell a VXML browser to resume a
	> certain state after it has been suspended.
	> 
	> Another limitation of current VXML is that it gives a user no way to
	> spawn events or commands during a play or recognize dialog state
	> without killing the ongoing dialog. For example: as before, a user is
	> listening to his long voicemail message. In the middle of the message
	> from Joe, the user decides he wants to call Joe (or send Joe an email,
	> etc.). The user says "Call Joe" or "Email Joe to call me", or some
	> other command, and continues listening to Joe's message. The system
	> should take the command "Call Joe", spawn a concurrent process to call
	> Joe or send him an email, but keep on playing Joe's voicemail message
	> without stopping. This scheme is impossible in VXML today. Again, it's
	> not CCXML's problem, it's VXML's problem.
	> 
	> A similar issue is when the user is listening to the long voicemail
	> from Joe, and he says commands like "back up 10 seconds", "skip to the
	> last 20 seconds", "louder", "play faster", or "slow down". All of
	> these commands should affect the playback of the voicemail message,
	> but not stop the playback. Currently, VXML doesn't do this. As a
	> general rule there need to be three different types of grammars in
	> VXML:
	> 
	> 1- Grammars that stop the dialog thread, and return a semantic tag to
	>    affect the dialog flow
	> 
	> 2- Grammars that do NOT affect the dialog flow at all, but produce
	>    asynchronous events to be handled by CCXML/scXML
	> 
	> 3- Grammars that don't return semantic tags, but instead affect local
	>    parameters such as playback speed, loudness, audio file position,
	>    etc.
	> 
	> Now all of this is really a limitation on VXML, not CCXML, which is
	> why my message title was prefaced [v3] and not [CCXML].
	> 
	> Looking at the VXML 3.0 spec at
	> http://www.w3.org/Voice/Group/2005/V3/, it is clear that it is
	> planning to have more asynchronous capabilities than VXML 2.1.
	> 
	>   
	> 
	> Under section 1.2.2.3 of the VXML 3 spec it says:
	> 
	> More advanced interaction with the presentation is possible in the
	> (VXML 3) DFP framework than is currently permitted with VoiceXML
	> 2.0/2.1. Consequently, VoiceXML 3.0 may be enhanced with capabilities
	> such as:
	>  
	> VoiceXML dialogs are cancellable
	> VoiceXML dialogs can receive events from the flow layer during
	>   execution. These events are exposed in the presentation markup.
	> VoiceXML dialogs can send events to the flow layer during execution.
	>   These events are specified in the presentation markup.
	> 
	> This is a good start, but the suspend scenario I described is not
	> covered in the statement of new capabilities. What is missing is the
	> capability to save the dialog state and return to it later. There
	> needs to be a "suspend/resume" command in addition to the standard
	> "start dialog" command from CCXML.  Hopefully this functionality will
	> get added as the spec matures.
	> 
	> Second, 3.0 needs the capability to accept user commands (touch tone,
	> voice, pen, whatever) during a play or recognize state, without
	> stopping the play or recognize state. These asynchronous commands
	> should be able to send events to CCXML/scXML without affecting the
	> dialog thread. Or, the command could affect how the current media is
	> being handled, or other local effects such as "record the remainder of
	> this call" or "mute Joe on this conference".
	>  
	> 
	> As an ivr designer, I've used vxml primarily to drive the call, using
	> it as simply as possible, just like any other protocol.  I never
	> really 'write' vxml apps, I write web apps that shoot out vxml instead
	> of html.  First cardinal sin on any application under my direction is
	> the introduction of client side logic.  Though I've been working this
	> way for years, I've seen a tendency at several client sites to try and
	> write a client side application, instead of handling all logic on the
	> server side.  Time to implement, debug, maintain, and test are all
	> shorter when using existing web application test suites.  (Current
	> project uses canoo, ugly but works.)  Every bit of logic can be
	> functionally tested separate from the vui (kinda like mvc) and only
	> when everything works do we pick up the phone for a real test call.  I
	> would hate to see the vxml spec evolve to where it required more logic
	> on the client vxml browser than is necessary.  All the logic gates
	> available today in vxml are generally shunned.
	>  
	> 
	> [SC] I agree totally with you. Server side is the way to go. However,
	> with VXML, today's CCXML server doesn't have enough control over the
	> script execution. VXML 3.0 should try and fix these issues. It's not a
	> problem with CCXML.
	> 
	>  
	>  If you're familiar with osd/osdm or apache rdc's, what we do is
	> similar but with all event handling done by java, and nothing but a
	> simple javascript function to encode all data to be passed back to the
	> server in a single variable.  Is vxml3 still going to be accommodating
	> to develop in this fashion?
	> 
	> 
	>  
	> 
	> [SC] Something like what you suggest is feasible, but keep in mind
	> that asynchronous events will be happening on both the server side and
	> on the client side, at any time. Both entities (server & client) must
	> be able to handle these events. Whatever mechanism is finally used
	> must efficiently deal with this fact.
	> 
	> Regards, 
	>  Shane Smith 
	>  
Received on Tuesday, 8 August 2006 14:04:21 UTC