VoiceXML 2.1 Feature Status, August 7, 2003

At the [REDMOND_F2F], after reviewing [FEASIBILITY], the Voice Browser Working Group (VBWG) voted to pursue a VoiceXML 2.1 specification on the W3C Recommendation Track as described in [W3C_PROCESS]. The first six features listed in the table below were presented at [REDMOND_F2F] as part of [FEASIBILITY]. The remaining 13 features were submitted subsequently by several VBWG members.

The columns in the table correspond to the criteria agreed upon at [REDMOND_F2F]:


Id | Feature | Impls | Doc | Use case | Orthogonal
1 | Reference grammars dynamically by adding an expr attribute to <grammar>. | 4 (BeVocal, Nuance, Tellme, VoiceGenie, VoxPilot) | Yes. See 3.1 in [FEASIBILITY]. | Yes. Enables parameterized, dynamic referencing of grammars without a server-side round trip. | Yes. Adds attribute to <grammar>.
2 | Detect where barge-in occurred in prompt playback using SSML <mark> and shadow variables. | 3 (BeVocal, Tellme, VoiceGenie) | Yes. See 3.2 in [FEASIBILITY]. | Yes. Enables programmatic access to barge-in data. | Yes. Leverages SSML <mark> and new properties on application.lastresult$.
3 | Reference scripts dynamically by adding an expr attribute to <script>. | 4 (BeVocal, Tellme, VoiceGenie, VoxPilot) | Yes. See 3.3 in [FEASIBILITY]. | Yes. Enables parameterized, dynamic referencing of scripts without a server-side round trip. | Yes. Adds attribute to <script>.
4 | Enable parameterized fetching of XML without requiring a dialog transition by adding <data>. | 2 (BeVocal, Tellme) | Yes. See 3.4 in [FEASIBILITY] and [DATA]. | Yes. Enables seamless integration with enterprise back-end XML data while maintaining independence from service logic. Improves page cacheability. | Yes. Adds <data> element. Clean programmatic access via ECMAScript.
5 | Allow dynamic prompt concatenation through automatic iteration over an array. | 2 (BeVocal, Nuance, Tellme) | Yes. See 3.5 in [FEASIBILITY]. | Yes. Enables developers to concatenate prompts dynamically without requiring a state transition or a round trip to the server for a complete VoiceXML document. Improves page cacheability. | Yes, if the second option (3.5.2) is selected from [FEASIBILITY].
6 | Record user utterances while attempting recognition. | 4 (BeVocal, Nuance, Tellme, Speechworks, VoxPilot) | Yes. See 3.6 of [FEASIBILITY]. | Yes. Enables conditional recording of an utterance while performing recognition. | Yes. Adds a standard name/value to the <property> element and three additional properties to application.lastresult$.
7 | Improve CCXML integration by adding namelist attribute to <disconnect>. | Yes. 3 (BeVocal, Loquendo, Vocalocity) | Yes. See attachment "V2.1 Loquendo Proposal1.html" in [LOQUENDO_PR1]. | Yes. Improves integration with [CCXML]. | Yes. Adds attribute to <disconnect>.
8 | Add type attribute to <transfer> to support transfer types other than 'bridged' and 'blind', including a supervised blind transfer. | Yes. 3+ (BeVocal, Loquendo, Nuance, VoiceGenie) | Yes. See attachment "Proposal4 Complete.zip" in [LOQUENDO_PR1]. | Yes. Supervised blind transfer would allow better 'error' handling and caller experience. | Yes. Adds attribute to <transfer>.
  Comments:
  • Concern that some telephony transport protocols may not support the feature.
9 | Allow a recording to be appended. | Unspecified. | Yes. See attachment "record.html" in [COMVERSE_PR0]. | No. The alternative is to submit each recording to an HTTP server and perform the append on the server side. | Yes. Adds attributes to <record>.
  Comments:
  • The feature requires the VoiceXML programmer to specify a local file as the destination.
  • Several security concerns were raised regarding this requirement ([0], [1], [2]).
10 | Allow an audio file to be played back beginning from an offset. | Yes. (Nuance, VoiceGenie, and Voxpilot) | Yes. See attachment "audio_offset.html" in [REHOR_PR0]. | No. The alternative is to use audio expr to reference a server-side script, passing the desired offset as a parameter. | No. The markup cannot be translated into valid [SSML].
  Comments:
  • Should not be exposed via ([3], [4], [5])
  • Should only apply to
  • Breaks streaming of content without user-perceived latency, a core requirement.
  • Cannot assume the feature would be used only with cached or prefetched content ([6], [7], [8]).
11 | Add cond attribute to <log>. | Yes. 2 (Loquendo, VoiceGenie) | Yes. See attachment "V2.1 Loquendo Proposal2.html" in [LOQUENDO_PR1]. | No. The alternative is to wrap the <log> in an <if>. [9] | Yes. Adds attribute to <log>.
  Comments:
  • Wrap <log> in <if>. [13]
12 | Add fetchaudio attribute to <object>. | No. 1 (Loquendo) | Yes. See attachment "V2.1 Loquendo Proposal3.html" in [LOQUENDO_PR1]. | Yes. For parity with other elements that fetch content (e.g. subdialog). | Yes. Adds attribute to <object>.
13 | Speaker verification/identification. | 1 (Nuance) | None submitted.¹ | Yes. See [NUANCE_PR0]. | Yes. Adds one element and some properties to application.lastresult$.
14 | Voice enrolled grammars. | 1 (Nuance) | None submitted.¹ | Yes. See [NUANCE_PR0]. | Yes. Adds three elements.
15 | Simple dynamic grammar generation. | 1 (Nuance) | None submitted.¹ | Yes. See [NUANCE_PR0]. | Maybe. Adds no new elements (just adds additional locations where an existing element can occur), but extends inline grammars.
16 | Re-recognition. | 1 (Nuance) | None submitted.¹ | Yes. See [NUANCE_PR0]. | Yes. Adds attribute to <field>.
17 | Task completion tags. | 1 (Nuance) | None submitted.¹ | Yes. See [NUANCE_PR0]. | Yes. Adds two elements.
18 | Source in properties. | 1 (Nuance) | None submitted.¹ | Yes. See [NUANCE_PR0]. | Yes. Adds attribute to .
19 | nbest. | 1 (Nuance) | None submitted.¹ | Yes. See [NUANCE_PR0]. | Maybe. Adds one new element within inline grammars.

1 Internal specifications and product documentation available from Nuance.
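
As an illustration of feature 11, the sketch below shows the alternative noted in the table, wrapping <log> in an <if>, which is already expressible in VoiceXML 2.0. The form id and the variable name debug are invented for the example. The proposed cond attribute appears only in a comment, because its exact syntax is an assumption based on the one-line description in the table and not on the proposal itself.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="demo">
    <!-- Hypothetical flag controlling whether diagnostics are logged -->
    <var name="debug" expr="true"/>
    <block>
      <!-- VoiceXML 2.0: conditional logging by wrapping <log> in <if> -->
      <if cond="debug">
        <log>Entered form demo</log>
      </if>

      <!-- Feature 11 as proposed (assumed syntax, not valid in 2.0):
           <log cond="debug">Entered form demo</log> -->
    </block>
  </form>
</vxml>
```

Both forms would log the same message under the same condition, which is why the table records the use case as "No": the attribute saves markup but adds no new capability.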

References
[CCXML] Voice Browser Call Control: CCXML Version 1.0 June 2003.
[COMVERSE_PR0] VoiceXML 2.1 Record Append Proposal June 2003.
[DATA] <data> proposal May 2001.
[FEASIBILITY] VoiceXML 2.1 Feasibility Study June 2003.
[REHOR_PR0] Audio Offset proposal for VoiceXML 2.1 June 2003.
[REDMOND_F2F] Voice Browser Working Group F2F, Redmond, WA June 2003.
[SSML] Speech Synthesis Markup Language December 2002.
[LOQUENDO_PR0] Loquendo VoiceXML 2.1 feature requests 0 June 2003.
[LOQUENDO_PR1] Loquendo VoiceXML 2.1 feature requests 1 July 2003.
[NUANCE_PR0] Nuance VoiceXML 2.1 feature requests July 2003.
[W3C_PROCESS] W3C Process Document June 2003.

[0] http://lists.w3.org/Archives/Member/w3c-voice-wg/2003Jun/0050.html June 2003.
[1] http://lists.w3.org/Archives/Member/w3c-voice-wg/2003Jun/0052.html June 2003.
[2] http://lists.w3.org/Archives/Member/w3c-voice-wg/2003Jun/0054.html June 2003.
[3] http://lists.w3.org/Archives/Member/w3c-voice-wg/2003Jun/0075.html June 2003.
[4] http://lists.w3.org/Archives/Member/w3c-voice-wg/2003Jun/0077.html June 2003.
[5] http://lists.w3.org/Archives/Member/w3c-voice-wg/2003Jun/0078.html June 2003.
[6] http://lists.w3.org/Archives/Member/w3c-voice-wg/2003Jun/0079.html June 2003.
[7] http://lists.w3.org/Archives/Member/w3c-voice-wg/2003Jun/0083.html June 2003.
[8] http://lists.w3.org/Archives/Member/w3c-voice-wg/2003Jun/0106.html June 2003.
[9] http://lists.w3.org/Archives/Member/w3c-voice-wg/2003Jun/0118.html June 2003.