- From: Daniel O'Sullivan <dan@voicexl.com>
- Date: Wed, 21 Jul 2004 15:04:12 -0500
- To: <www-voice@w3.org>
- Message-ID: <BD24366C.15F6%dan@voicexl.com>
Dear Committee Members:

I understand from Jim Larsen that work is to begin in the near future on a specification for VoiceXML Version 3. I further understand that the Committee is interested in incorporating new features for the VoiceXML standard in that specification.

I strongly urge the Committee to select audio playback speed adjustment as one of those new features for the VoiceXML Version 3 specification, for the following reasons:

1. While the SSML and SALT specifications both support dynamic playback speed adjustment of TTS messages, there is currently no equivalent in VoiceXML. Further, not all TTS engines provide smooth playback output at variable playback rates. Dynamic adjustment of existing pre-recorded audio files, on the other hand, provides a very smooth, high-quality, pitch-adjusted output.

2. The feature is already supported at the hardware and API level by the dominant voice board manufacturers, including Intel-Dialogic and NMSS.

3. The feature allows for convenient tuning of an application's voice files in real time, even long after the application has been tested in production for many years.

4. The feature would make it easier for the VoiceXML developer community to implement our real-time adaptive algorithm, which has been field tested and proven to enhance the caller interface, shorten call duration, and encourage the use of IVR resources. A white paper on our technology as currently implemented for VoiceXML platforms is attached to this email.

The new feature could be added as a simple tag for audio play events. The tag would specify whether the message segment is to be played at normal speed or at some positive (increase) or negative (decrease) value with respect to the static, recorded playback speed of the segment.

There are a variety of alternative technologies to support this feature, all of which are available royalty-free in the public domain.
These include:

a) Adding "power user" menus to allow callers to select the level of instruction they receive from the IVR application.

b) Off-line editing of voice files to reduce silence at the beginning and end of each message segment.

c) Redesign of the application call script to maximize efficiency and reduce ambiguities.

I would be happy to answer any questions the Committee members may have regarding this feature and how it will benefit the VoiceXML community as a whole. Thank you for considering this proposal.

Sincerely,

Daniel O'Sullivan
President/CEO
Interactive Digital, Inc.
dan@voicexl.com
www.voicexl.com
(631) 724-2323 direct
(631) 680-4307 mobile
Attachments
- application/octet-stream attachment: VoiceXLVXMLWhitePaper.pdf
Received on Wednesday, 21 July 2004 15:05:21 UTC