Re: Proposal to add start, stop, and update events to TTS

From: Dominic Mazzoni <dmazzoni@google.com>
Date: Tue, 18 Sep 2012 15:38:03 -0700
Message-ID: <CAFz-FYxbYyCAcGPemb66_+SoBi2T_NFRg=91QMa_AuHTVFMiZQ@mail.gmail.com>
To: Glen Shires <gshires@google.com>
Cc: public-speech-api@w3.org

Some clarification:

1. In other specs I saw that callback functions had explicitly
specified types, even if they take no arguments. But I have no
preference - if it's sufficient to just say Function onstart, then I'm
happy.

2. Rather than separate callbacks for word, sentence, character,
marker, etc. I suggested a single 'onupdate' callback with arguments
for marker, start time, charIndex, etc. - the reason is that different
speech engines support different types of updates, and trying to force
them all to have the same definition of "word", etc. is problematic.
This allows for future extensions to the API and also makes it easier
for clients to adapt to different speech engines.
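
To illustrate point 2, here is a rough TypeScript sketch of how a client
might use a single 'onupdate' callback. The names UtteranceSketch and
speakWithWordUpdates are made up for this example, the fake "engine"
reports word boundaries only, and the 0.25 s per word timing is invented.
It is not the spec text, just one way the proposed callback shape could
be used.

```typescript
// Hypothetical shapes mirroring the proposed callbacks; not a shipped API.
type UpdateCallback = (
  markerName: string,
  charIndex: number,
  elapsedTime: number
) => void;

interface UtteranceSketch {
  text: string;
  onstart?: () => void;
  onend?: () => void;
  onupdate?: UpdateCallback; // optional: an engine may never call it
}

// Stand-in for a speech engine that only reports word boundaries.
// A different engine could report sentences or named markers instead,
// through the same onupdate callback.
function speakWithWordUpdates(u: UtteranceSketch): void {
  u.onstart?.();
  const word = /\S+/g;
  let match: RegExpExecArray | null;
  let elapsed = 0;
  while ((match = word.exec(u.text)) !== null) {
    // Empty marker name: this update is a plain word boundary.
    u.onupdate?.("", match.index, elapsed);
    elapsed += 0.25; // invented duration per word, in seconds
  }
  u.onend?.();
}

const log: string[] = [];
const utterance: UtteranceSketch = {
  text: "Hello brave world",
  onstart: () => log.push("start"),
  onend: () => log.push("end"),
  onupdate: (markerName, charIndex, elapsedTime) => {
    // The client decides what each update means from its arguments.
    log.push(
      markerName !== ""
        ? `marker:${markerName}`
        : `char:${charIndex}@${elapsedTime}`
    );
  },
};
speakWithWordUpdates(utterance);
```

The client code does not need to know whether the engine's unit is a
word, a sentence, or a marker. It only looks at the arguments it gets.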

- Dominic

On Tue, Sep 11, 2012 at 4:06 PM, Dominic Mazzoni <dmazzoni@google.com> wrote:
> Sorry for being unclear. What I meant was to just extend the onstart
> and onend events already in the spec.
>
> - Dominic
>
> On Tue, Sep 11, 2012 at 2:48 PM, Glen Shires <gshires@google.com> wrote:
>> Is this preferable to:
>>
>> interface SpeechSynthesisUtterance {
>>       ...
>>       attribute Function onstart;
>>       attribute Function onend;
>>
>>
>> On Tue, Sep 11, 2012 at 1:19 AM, Dominic Mazzoni <dmazzoni@google.com>
>> wrote:
>>>
>>> I propose adding a mechanism by which JavaScript clients can attach
>>> event listeners to an utterance to get notified when it starts
>>> speaking, when it finishes, and optionally with updates of the
>>> progress.
>>>
>>> Here's what could be added to the spec:
>>>
>>> callback SpeechSynthesisStartCallback = void();
>>> callback SpeechSynthesisEndCallback = void();
>>> callback SpeechSynthesisUpdateCallback = void(DOMString markerName,
>>> unsigned long charIndex, float elapsedTime);
>>>
>>> interface SpeechSynthesisUtterance {
>>>   ...
>>>   attribute SpeechSynthesisStartCallback onstart;
>>>   attribute SpeechSynthesisEndCallback onend;
>>>   attribute SpeechSynthesisUpdateCallback onupdate;
>>> }
>>>
>>> I propose that "onstart" and "onend" must be supported for an
>>> implementation to be fully compliant. There are too many applications
>>> that can't be implemented without these.
>>>
>>> I think "onupdate" should be optional because it depends on what's
>>> possible in the speech engine. If the speech engine provides
>>> word-level information, for example, then every time there's a break
>>> between words it'd call onupdate() with the character index (into the
>>> original utterance string) and the elapsed time since speech began. An
>>> engine might also notify when certain named "markers" (e.g. in SSML)
>>> are reached.
>>>
>>> Other ideas - in the Chrome TTS extension API I also implemented these
>>> events - what do you think?
>>>
>>> * Error
>>> * Cancelled (stopAndFlushQueue called before it ever started playing)
>>> * Interrupted (stop called while it was in the middle of speaking)
>>>
>>> - Dominic
>>>
>>
Received on Tuesday, 18 September 2012 22:38:30 UTC
