Re: R9. Web application author provided synthesis feedback from Daniel Weck on 2010-12-08 (public-xg-htmlspeech@w3.org from December 2010)

From: Daniel Weck <daniel.weck@gmail.com>
Date: Wed, 8 Dec 2010 13:06:31 +0000
To: Bjorn Bringert <bringert@google.com>
Cc: Olli@pettay.fi, public-xg-htmlspeech@w3.org
Message-Id: <1445888C-18EB-4E04-82FE-49AB8E20DDD3@gmail.com>

On 8 Dec 2010, at 11:59, Bjorn Bringert wrote:
> I propose that we do the same for TTS, and replace R9 with the  
> following:
>
> 1. The web app should be notified when TTS playback starts.
> 2. The web app should be notified when TTS playback finishes.
> 3. The web app should be notified when the audio corresponding to a
> TTS <mark> element is played back.
>
> Are any other TTS events needed?

Hi, I have only been following the discussions on this list for a
couple of weeks, so I hope I am not totally off the mark :) Here's my
input:

With the Microsoft Speech API (SAPI), a programmer can register
interest in events such as a change of voice, word and sentence
boundaries, volume levels, speech rate, etc. The Java Speech API
provides some access to the currently executing queue of processing
events (e.g. phonemes within an audio stream). I am not familiar with
Apple's NSSpeechSynthesizer, but I assume applications can listen to
speech events too.

Given how SSML and CSS-Speech may influence parts of the
text-to-speech synthesis, R9 should probably reflect the need for a
web app to be notified of a greater range of events than the three
proposed ones (voice and rate changes strike me as useful
information). I can also see that the concept of "playback start"
needs to be refined to account for the latency between a call to
speak(TXT) and the actual delivery of bytes to the output device
("start" usually means the moment when bytes are effectively sent to
the output device).
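
To make the distinction concrete, here is a rough TypeScript sketch of
what a wider event set could look like. Nothing in it is a proposed or
existing API: TtsEventSource, simulateSpeak, the "queued",
"voicechange" and "ratechange" event names, and all timings are
hypothetical and only serve to illustrate the point. The "start",
"end" and "mark" events mirror the three notifications proposed above.

```typescript
// Hypothetical event names. "start", "end" and "mark" mirror the three
// proposed notifications; the others are illustrative additions.
type TtsEventType =
  | "queued"       // speak() was called, no audio delivered yet
  | "start"        // first bytes reached the output device
  | "end"
  | "mark"
  | "voicechange"
  | "ratechange";

interface TtsEvent {
  type: TtsEventType;
  timeMs: number;   // time on an arbitrary clock, made up for this sketch
  detail?: string;  // e.g. mark name or new voice name
}

type TtsListener = (event: TtsEvent) => void;

// Minimal listener registry; not modelled on any real speech API.
class TtsEventSource {
  private listeners: Partial<Record<TtsEventType, TtsListener[]>> = {};

  on(type: TtsEventType, listener: TtsListener): void {
    const list = this.listeners[type] ?? [];
    list.push(listener);
    this.listeners[type] = list;
  }

  emit(event: TtsEvent): void {
    for (const listener of this.listeners[event.type] ?? []) {
      listener(event);
    }
  }
}

// Simulates one utterance: "queued" fires at the speak() call, and
// "start" only after the given latency, when audio would reach the device.
function simulateSpeak(source: TtsEventSource, latencyMs: number): void {
  source.emit({ type: "queued", timeMs: 0 });
  source.emit({ type: "start", timeMs: latencyMs });
  source.emit({ type: "voicechange", timeMs: latencyMs + 100, detail: "voice-b" });
  source.emit({ type: "mark", timeMs: latencyMs + 200, detail: "m1" });
  source.emit({ type: "end", timeMs: latencyMs + 300 });
}

// Usage: record every event in the order it is delivered.
const source = new TtsEventSource();
const log: string[] = [];
const allTypes: TtsEventType[] = [
  "queued", "start", "end", "mark", "voicechange", "ratechange",
];
for (const type of allTypes) {
  source.on(type, (e) => log.push(`${e.type}@${e.timeMs}`));
}
simulateSpeak(source, 250);
```

Keeping "queued" separate from "start" lets the web app measure the
synthesis latency itself instead of guessing which of the two moments
a single "start" notification refers to.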

Just a thought. I'll be following the discussions to learn more about  
this group's design goals. In the meantime, feel free to correct me if  
I'm wrong ;)

Cheers, Daniel

References:

http://msdn.microsoft.com/en-us/library/ms717254(v=VS.85).aspx

http://developer.apple.com/library/mac/#documentation/Cocoa/Reference/ApplicationKit/Classes/NSSpeechSynthesizer_Class/Reference/Reference.html

http://download.oracle.com/docs/cd/E17802_01/products/products/java-media/speech/forDevelopers/jsapi-doc/javax/speech/synthesis/package-summary.html

Received on Wednesday, 8 December 2010 13:07:09 UTC