VoiceXML 2.0 October 2001 questions from Brian Wyld on 2002-04-05 (www-voice@w3.org from April to June 2002)

From: Brian Wyld <brian.wyld@eloquant.com>
Date: Fri, 5 Apr 2002 10:01:14 +0200
To: <www-voice@w3.org>
Cc: "Brian Wyld" <brian.wyld@eloquant.com>, "Guillaume Berche" <guillaume.berche@eloquant.com>
Message-ID: <OCENLGFFCDHPOEENHEPFIEEKCJAA.brian.wyld@eloquant.com>
Hi,

I'm trying to understand the VoiceXML 2.0 spec (October 2001) with a view
to implementing a VoiceXML browser. This has led me to some questions which
I hope someone here will be able to answer for me. If they're stupid,
then please forgive me and point me towards the FAQ.

Questions:

1/ What is the timescale for the next draft or final version? Is there an
"updates" page or FAQ giving the delta with respect to the October 2001
version? (I noted some points in the mailing list that seemed to indicate
that some changes were already planned.)

2/ <vxml> tag - base attribute use.
 - if the attribute is missing, should the current document's URL be used as
a root for relative URLs?
 - if the attribute is missing in the current document, but is present in
the root application document, should this be used in preference to the
current document's URL?
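
The two readings in question 2 can be sketched as follows. This is only an
illustration under stated assumptions: the function name, its parameters and
the prefer_root_base switch are hypothetical, and which reading the spec
intends is exactly what is being asked. Only urllib.parse.urljoin is a real
library call.

```python
from urllib.parse import urljoin


def resolve(relative, document_url, document_base=None,
            root_base=None, prefer_root_base=False):
    """Resolve a relative URL under one of the candidate readings.

    All names here are hypothetical; this does not claim to be the
    behaviour required by the spec.
    """
    if document_base is not None:
        # The current document declares its own base: use it.
        base = document_base
    elif prefer_root_base and root_base is not None:
        # Reading B: fall back to the application root document's base.
        base = root_base
    else:
        # Reading A: fall back to the current document's own URL.
        base = document_url
    return urljoin(base, relative)


doc = "http://example.com/app/dialogs/menu.vxml"
root = "http://example.com/shared/"

reading_a = resolve("help.vxml", doc, root_base=root)
reading_b = resolve("help.vxml", doc, root_base=root, prefer_root_base=True)
```

Under reading A the relative URL lands next to the current document; under
reading B it lands under the root document's base, so the two readings can
fetch different resources for the same markup.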

3/ activation of <grammar>, <property> tags
 - is it correct to assume that all <property> and <grammar> tags in a
particular scope take effect immediately on entry into that scope, i.e.
they are not applied in document order?
 - if so, how should multiple <property> assignments to the same property be
handled? Does the last one in document order become the value for that
scope?
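
The interpretation proposed in question 3 could look like the sketch below.
It assumes, without confirmation from the spec, that every <property> in a
scope is in force on entry and that the last assignment in document order
wins. The function name and the data layout are hypothetical.

```python
def effective_properties(scope_chain):
    """Compute the property values visible in the innermost scope.

    scope_chain is a list of scopes, outermost first; each scope is a
    list of (name, value) pairs in document order.
    """
    effective = {}
    for scope in scope_chain:
        # Gather the whole scope first, so every <property> in it is
        # in force on entry rather than from its position onwards.
        scope_values = {}
        for name, value in scope:
            # Assumed rule: a later assignment in document order
            # overwrites an earlier one within the same scope.
            scope_values[name] = value
        # Inner scopes override values inherited from outer scopes.
        effective.update(scope_values)
    return effective


document_scope = [("timeout", "5s"), ("bargein", "true")]
form_scope = [("timeout", "3s"), ("timeout", "10s")]
props = effective_properties([document_scope, form_scope])
```

Here the form sees a timeout of "10s" (the later of its two assignments) and
inherits bargein from the document scope.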

4/ input received during a prompt with bargein=false
 - is input (DTMF or voice) during a non-bargeinable prompt saved until the
next bargeinable or silent input collection period, or discarded?
 - if playing a mix of bargeinable and non-bargeinable prompts, how should
the FIA handle this? The spec appears to say that all prompts are queued up
for playing at the start of the form, and input collection starts at this
point, which implies that input during a non-bargeinable prompt will be
saved and treated at the next possible time.
   -> this can lead to very irritated users, pressing keys N times during a
prompt only to have them taken into account multiple times at the end!
 - what about input presented when the browser is not currently collecting
input? Discarded or saved?
 - when, if at any time, should the input be flushed? E.g. if a DTMF input
matches a grammar, but more keys were pressed, is this "typeahead" for the
next form or to be discarded?
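
The difference between the "saved" and "discarded" behaviours in question 4
can be made concrete with a small sketch. The InputBuffer class and its
keep_during_non_bargein flag are hypothetical names invented for this
illustration; neither policy is claimed to be what the spec mandates.

```python
from collections import deque


class InputBuffer:
    """Hypothetical DTMF buffer illustrating two candidate policies."""

    def __init__(self, keep_during_non_bargein):
        self.keep_during_non_bargein = keep_during_non_bargein
        self.pending = deque()

    def key_pressed(self, key, prompt_bargeinable):
        # Keys pressed during a bargeinable prompt are always kept; keys
        # pressed during a non-bargeinable prompt depend on the policy.
        if prompt_bargeinable or self.keep_during_non_bargein:
            self.pending.append(key)

    def collect(self, max_keys):
        """Hand up to max_keys buffered keys to the recognizer."""
        taken = []
        while self.pending and len(taken) < max_keys:
            taken.append(self.pending.popleft())
        return "".join(taken)

    def flush(self):
        """Drop any remaining typeahead."""
        self.pending.clear()


saving = InputBuffer(keep_during_non_bargein=True)
discarding = InputBuffer(keep_during_non_bargein=False)
for buffer in (saving, discarding):
    # The user presses "1" three times during a non-bargeinable prompt.
    for key in "111":
        buffer.key_pressed(key, prompt_bargeinable=False)

saved_result = saving.collect(max_keys=1)
leftover = "".join(saving.pending)
discarded_result = discarding.collect(max_keys=1)
```

With the saving policy the first key matches and two more remain as
typeahead for whatever collects input next, unless flush() is called; with
the discarding policy nothing is delivered at all.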

5/ <prompt> timeout attribute
 - what is the scope of this attribute? The spec appears to indicate that it
sets the no-input timeout for "the next user input sequence". This could be
a long way away from the prompt setting the value (e.g. a prompt followed
by a goto).
 - or should the attribute setting only apply to the current scope? (i.e.
setting timeout via the attribute is equivalent to setting it in the current
scope via a <property> tag)

6/ <block>
 - Is it a correct interpretation that the <block> tag has no input
collection phase, i.e. once all its prompts have finished it passes to the
next element?
 - should the FIA wait until all the prompts in a <block> have finished
before passing to the next element, or can it proceed as soon as they have
been queued? (which then implies that an input may be collected during the
playing of a prompt that is in a <block>)

7/ FIA, appendix C
 - in the FIA, for the collection of active grammars when not modal, it says
that these include "elements up the <subdialog> call chain." This seems to
be in contradiction with the section on <subdialog> which says each
subdialog has a totally separate context from the caller, and
shares/inherits absolutely no elements with it.
   -> is it the FIA that is wrong?

8/ ASR Result semantic interpretation
 - is it any clearer what will actually be in the spec on this subject, and
particularly on the link between NLSML, tags in the grammar XML and the
mapping to the VoiceXML? There seems to be a lot of italic text marking
open issues in this area.

9/ Embedded XML (grammars and TTS text SSML tags)
 - it seems complex to have to either include the DTDs for GRXML and SSML
in the VoiceXML DTD, to be able to parse these sections, or to wrap them in
CDATA sections.
  -> Is there no way to mark them as "yes, it's XML, but don't parse it
because it's not in this DTD"? (CDATA doesn't differentiate between any
old text and XML)
  -> or to specify a different DTD for these sections?

Thanks for any help on these points...

Brian

[Brian Wyld] [brian.wyld@eloquant.com]
[Eloquant SA] [+33 476 77 46 92] [www.eloquant.com]
[advanced solutions for telecoms and IT services]
Received on Friday, 5 April 2002 03:00:19 UTC