VoiceXML 2.0 Status Update, August 7, 2003

VoiceXML 2.1 Feature Status, August 7, 2003

At the [REDMOND_F2F], as a result of reviewing [FEASIBILITY], the Voice Browser Working Group voted to pursue a VoiceXML 2.1 specification on the W3C Recommendation Track as described in [W3C_PROCESS]. The first 6 features listed in the table below were presented at [REDMOND_F2F] as part of [FEASIBILITY]. The additional 13 features in the table were submitted by several VBWG members subsequently.

The columns in the table correspond to the criteria agreed upon at [REDMOND_F2F]:

Must already be implemented by at least two companies (Implementation Report requirements).
Documentation must already exist (complete specification).
Important use-case that cannot be implemented using VoiceXML 2.0.
Orthogonal to VoiceXML 2.0.

Id	Feature	Impls	Doc	Use case	Orthogonal
1	Reference grammars dynamically by adding an expr attribute to <grammar>.	4 (BeVocal, Nuance, Tellme, VoiceGenie, VoxPilot)	Yes. See 3.1 in [FEASIBILITY].	Yes. Enables parameterized referencing of grammars dynamically w/o server-side round trip.	Yes. Adds attribute to <grammar>.
2	Detect where barge-in occurred in prompt playback using SSML <mark> and shadow variables.	3 (BeVocal, Tellme, VoiceGenie)	Yes. See 3.2 in [FEASIBILITY].	Yes. Enables programmatic access to barge-in data.	Yes. Leverages SSML <mark> and new properties on application.lastresult$.
3	Reference scripts dynamically by adding an expr attribute to <script>.	4 (BeVocal, Tellme, VoiceGenie, VoxPilot)	Yes. See 3.3 in [FEASIBILITY].	Yes. Enables parameterized referencing of scripts dynamically w/o server-side round trip.	Yes. Adds attribute to <script>.
4	Enable parameterized fetching of XML without requiring a dialog transition by adding <data>.	2 (BeVocal, Tellme)	Yes. See 3.4 in [FEASIBILITY] and [DATA].	Yes. Enables seamless integration with enterprise back-end XML data while maintaining independence from service logic. Improves page cacheability.	Yes. Adds <data> element. Clean programmatic access via ECMAScript.
5	Allow dynamic prompt concatenation through automatic iteration over an array.	2 (BeVocal, Nuance, Tellme)	Yes. See 3.5 in [FEASIBILITY].	Yes. Enables developers to naturally concatenate prompts dynamically without requiring a state transition or a round-trip to the server for a complete VoiceXML document. Improves page cacheability.	Yes, if the second option (3.5.2) is selected from [FEASIBILITY].
6	Record user utterances while attempting recognition.	4 (BeVocal, Nuance, Tellme, Speechworks, VoxPilot)	Yes. See 3.6 of [FEASIBILITY].	Yes. Enables conditional recording of an utterance while performing recognition.	Yes. Adds a standard name/value to the <property> element and three additional properties to application.lastresult$.
7	Improve CCXML integration by adding namelist attribute to <disconnect>.	Yes. 3 (BeVocal, Loquendo, Vocalocity)	Yes. See attachment "V2.1 Loquendo Proposal1.html" in [LOQUENDO_PR1].	Yes. Improve integration with [CCXML].	Yes. Add attribute to <disconnect>.
8	Add type attribute to <transfer> to support transfer types other than 'bridged' and 'blind' including a supervised blind transfer.	Yes. 3+ (BeVocal, Loquendo, Nuance, VoiceGenie)	Yes. See attachment "Proposal4 Complete.zip" in [LOQUENDO_PR1].	Yes. Supervised blind transfer would allow better 'error' handling / caller experience.	Yes. Add attribute to <transfer>.
	Comments: Concern that some telephony transport protocols may not support the feature.
9	Allow a recording to be appended.	Unspecified.	Yes. See attachment "record.html" in [COMVERSE_PR0]	No. The alternative is to submit each recording to an HTTP server and perform the append on the server side.	Yes. Add attributes to <record>.
	Comments: The feature requires the VoiceXML programmer to specify a destination to a local file. Several security concerns were raised regarding this requirement: ([0], [1], [2]).
10	Allow an audio file to be played back beginning from an offset.	Yes. (Nuance, VoiceGenie, and Voxpilot)	Yes. See attachment "audio_offset.html" in [REHOR_PR0].	No. The alternative is to use audio expr to reference a server-side script passing the desired offset as a parameter.	No. The markup cannot be translated into valid [SSML].
	Comments: Should not be exposed via ([3],[4], [5]) Should only apply to Breaks streaming of content without user perceived latency, a core requirement. Can't assume use of feature would only apply to cached or prefetched content. ([6], [7], [8])
11	Add cond attribute to <log>.	Yes. 2 (Loquendo, VoiceGenie)	Yes. See attachment "V2.1 Loquendo Proposal2.html" in [LOQUENDO_PR1].	No. The alternative is to wrap the <log> in an <if>. [9]	Yes. Add attribute to <log>.
	Comments: Wrap <log> in <if>. [13]
12	Add fetchaudio attribute to <object>.	No. 1 (Loquendo)	Yes. See attachment "V2.1 Loquendo Proposal3.html" in [LOQUENDO_PR1].	Yes. For parity with other elements that fetch content (e.g. subdialog)	Yes. Add attribute to <object>.
13	Speaker verification/identification.	1 (Nuance)	None submitted.₁	Yes. See [NUANCE_PR0].	Yes. Adds one element and some properties to application.lastresult$.
14	Voice enrolled grammars.	1 (Nuance)	None submitted.₁	Yes. See [NUANCE_PR0].	Yes. Adds three elements.
15	Simple dynamic grammar generation.	1 (Nuance)	None submitted.₁	Yes. See [NUANCE_PR0].	Maybe. Adds no new elements (just adds additional locations where an existing element can occur), but extends inline grammars.
16	Re-recognition	1 (Nuance)	None submitted.₁	Yes. See [NUANCE_PR0].	Yes. Adds attribute to <field>
17	Task completion tags	1 (Nuance)	None submitted.₁	Yes. See [NUANCE_PR0].	Yes. Adds two elements.
18	Source in properties.	1 (Nuance)	None submitted.₁	Yes. See [NUANCE_PR0].	Yes. Adds attribute to .
19	nbest	1 (Nuance)	None submitted.₁	Yes. See [NUANCE_PR0].	Maybe. Adds one new element within inline grammars.
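
To make two of the rows above concrete, the following is a minimal sketch only. It shows the proposed dynamic grammar reference of feature 1 next to the VoiceXML 2.0 workaround noted for feature 11 (wrapping <log> in <if>). The attribute name expr on <grammar> follows the wording of the table and is an assumption about the eventual syntax, so the document would not validate against the VoiceXML 2.0 DTD. The variables city and debug and the example.com grammar URI are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Hypothetical variables, used only for this illustration -->
  <var name="city" expr="'boston'"/>
  <var name="debug" expr="true"/>

  <form id="streets">
    <field name="street">
      <!-- Feature 1 (proposed, not VoiceXML 2.0): the expression is evaluated
           at run time and yields the grammar URI, so no server-side round
           trip is needed to select a per-city grammar. The attribute name
           "expr" is taken from the table's wording and is an assumption. -->
      <grammar type="application/srgs+xml"
               expr="'http://example.com/grammars/' + city + '.grxml'"/>
      <prompt>Which street?</prompt>
      <filled>
        <!-- Feature 11 alternative (valid VoiceXML 2.0): conditional logging
             by wrapping <log> in <if> instead of a cond attribute on <log> -->
        <if cond="debug">
          <log>Recognized street: <value expr="street"/></log>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```

In plain VoiceXML 2.0 the grammar URI would have to be fixed in the page or written into it by a server-side script, which is the round trip that feature 1 is meant to avoid.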