TTS & Asynchronous Events....!

From: Sheth Raxit <raxit@phonologies.com>
Date: Wed, 14 Sep 2005 15:13:34 +0530
Message-Id: <200509140943.j8E9hY1l023804@mail25.atl.registeredsite.com>
To: <www-voice@w3.org>


Do current TTS engines (and SSML) support asynchronous events?

Example scenario: listening to a long document over the phone
(for example, a document with 100 statements in 10 paragraphs).

The system mainly involves a VoiceXML engine, TTS, ASR, and other telephony components.

1. The TTS engine receives the whole document in SSML form.
2. The TTS engine starts rendering the document.
3. The user listens to the document on a phone (or any other voice/speech device).
4. After five statements, the user feels the voice output is too fast and wants to slow it down.
5. The user gives a command such as "SLOW DOWN".
6. The ASR/DTMF engine informs the VoiceXML engine.
7. The VoiceXML engine analyzes the command.
8. The VoiceXML engine generates an appropriate command to send to the TTS engine (this command should be in XML and conform to some standard).
9. The TTS engine decreases the speed.
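
To make the flow in steps 1 to 9 concrete, here is a minimal Python sketch. It is only a simulation: SimulatedTts, post(), and the command names SLOW_DOWN and SPEED_UP are hypothetical and do not correspond to any real TTS engine or VoiceXML API. The engine checks a queue of control commands between statements, so a command posted during playback affects the statements that follow.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class SimulatedTts:
    """Toy stand-in for a TTS engine; no real engine API is implied."""
    rate: float = 1.0  # 1.0 means the normal speaking rate
    commands: "queue.Queue[str]" = field(default_factory=queue.Queue)
    log: list = field(default_factory=list)

    def post(self, command: str) -> None:
        """Called by the (hypothetical) VoiceXML engine, steps 6 to 8."""
        self.commands.put(command)

    def _drain_commands(self) -> None:
        # Apply every pending control command before the next statement.
        while True:
            try:
                command = self.commands.get_nowait()
            except queue.Empty:
                return
            if command == "SLOW_DOWN":
                self.rate = max(0.5, self.rate - 0.25)
            elif command == "SPEED_UP":
                self.rate = min(2.0, self.rate + 0.25)

    def render(self, statements, on_statement_done=None) -> None:
        for index, _text in enumerate(statements):
            self._drain_commands()
            # "Speak" the statement by recording the rate that was used.
            self.log.append((index, self.rate))
            if on_statement_done:
                on_statement_done(index)


tts = SimulatedTts()


def user_listens(index: int) -> None:
    # Steps 4 and 5: after the fifth statement the user says "SLOW DOWN".
    if index == 4:
        tts.post("SLOW_DOWN")


tts.render([f"Statement {n}" for n in range(1, 11)], user_listens)
```

In this sketch the first five statements are logged at rate 1.0 and the remaining five at 0.75. A real engine would also need a way to accept such a command in the middle of a statement, which is the open question of this message.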

Although SSML supports (and TTS engines implement) tags to set the speed and similar properties, it is difficult to change the speed once the dialog has started.
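
As an illustration of why this is awkward, the sketch below builds an SSML document in Python with the standard xml.etree module. The rate is fixed in the prosody element when the document is generated, so one possible workaround (an assumption on my part, not something SSML prescribes) is to regenerate the remaining statements with a new rate. build_ssml() and the statement text are hypothetical.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_ssml(statements, rate="medium"):
    """Wrap statements in <speak><prosody rate=...> and return a string."""
    ET.register_namespace("", SSML_NS)  # serialize SSML as the default namespace
    speak = ET.Element(f"{{{SSML_NS}}}speak", {"version": "1.0", XML_LANG: "en-US"})
    prosody = ET.SubElement(speak, f"{{{SSML_NS}}}prosody", {"rate": rate})
    for text in statements:
        sentence = ET.SubElement(prosody, f"{{{SSML_NS}}}s")
        sentence.text = text
    return ET.tostring(speak, encoding="unicode")


statements = [f"Statement {n}." for n in range(1, 11)]
initial = build_ssml(statements)                # sent before playback starts
remainder = build_ssml(statements[5:], "slow")  # regenerated after "SLOW DOWN"
```

Here the application, not the TTS engine, has to remember how far the user has listened and rebuild the rest of the document.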

The important point is that, just as every person's speech has its own set of properties, which vary from person to person and are strongly affected by the local environment (a noisy environment, the user's mood, etc.), each person's listening habits also differ and are strongly affected by the local environment.

Users often want to change the speed, repeat a few words or lines (voice scrolling), skip some words or lines, hold the line for some time, or change the volume or other related properties of the voice.
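
The kinds of requests listed above could be modelled as a small set of commands acting on a playback state. The Python sketch below is purely illustrative: PlaybackState, apply_command(), the command names, and the step sizes are all assumptions and are not taken from SSML, VoiceXML, or any TTS product.

```python
from dataclasses import dataclass


@dataclass
class PlaybackState:
    """Hypothetical listener-controlled state; names are illustrative only."""
    position: int = 0    # index of the next statement to speak
    rate: float = 1.0
    volume: float = 1.0
    paused: bool = False


def apply_command(state: PlaybackState, command: str, total: int) -> PlaybackState:
    """Update the state in place for one spoken or DTMF command."""
    if command == "SLOW_DOWN":
        state.rate = max(0.5, state.rate - 0.25)
    elif command == "REPEAT":
        state.position = max(0, state.position - 1)
    elif command == "SKIP":
        state.position = min(total, state.position + 1)
    elif command == "HOLD":
        state.paused = True
    elif command == "RESUME":
        state.paused = False
    elif command == "LOUDER":
        state.volume = min(2.0, state.volume + 0.25)
    elif command == "SOFTER":
        state.volume = max(0.0, state.volume - 0.25)
    else:
        raise ValueError(f"unknown command: {command}")
    return state


state = PlaybackState(position=5)
for spoken in ["REPEAT", "SLOW_DOWN", "HOLD"]:
    apply_command(state, spoken, total=100)
```

After these three commands the state points back at statement index 4, the rate is 0.75, and playback is on hold.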

Currently, some or all of the above cases are handled mainly by dynamically generated VoiceXML, or are not handled at all, which leaves the application in control rather than the user.

If the TTS engine were instead responsible for handling requests like these, the collaboration between the VoiceXML engine, ASR, and TTS would seem more natural, and the conversation between user and system would be more user-driven and user-friendly.

This change might also propagate into SSML, by adding some tags or modifying existing ones.

I look forward to your responses.

Thanks & Regards,
Raxit Sheth
Phonologies India Pvt Ltd.
Tel: +91-22-22029732
Fax: +91-22-22029728

Sheth Raxit
Systems Software Engineer


Phonologies (India) Private Limited
G-46 Dhanraj Mahal, Chh Shivaji Marg, Mumbai 39. INDIA.
Ph:+91-22-22029732  Fax:+91-22-22029728 mail@phonologies.com

Received on Wednesday, 14 September 2005 12:45:04 UTC
