Re: Speech synthesis events

Aaron,

The Web Speech API is implementation-independent. Each speech engine has
its own concept of a phoneme, and many of them don't expose any
information about phoneme boundaries or phoneme events at all. The same
goes for "word": each engine breaks text into words differently. For
example, some engines consider a hyphenated word like
"implementation-independent" to be two words, while others consider it
one. Also, some fire events at the start of a word, some at the end, and
some at both. To abstract over those differences, the Web Speech API
gives you a single boundary event.
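
For reference, here's a minimal sketch of listening for those boundary
events in a browser today (event.name, charIndex, and elapsedTime are
attributes from the current spec; exactly which boundaries fire, and
when, still varies per engine):

    var utterance = new SpeechSynthesisUtterance(
        'Hello, implementation-independent world.');

    // Fired at engine-defined boundaries; event.name is "word" or
    // "sentence" in the current spec.
    utterance.onboundary = function(event) {
      // event.charIndex is the character offset into the utterance
      // text; event.elapsedTime is the time since speech started
      // (in seconds, per the spec).
      console.log(event.name, event.charIndex, event.elapsedTime);
    };

    speechSynthesis.speak(utterance);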

It makes sense to support boundary events for phonemes when the speech
engine supports them.
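
If we did, one hypothetical shape (my assumption, not anything the spec
defines today) would be to reuse the same boundary event with a new
name value rather than adding a separate "onphoneme" handler:

    utterance.onboundary = function(event) {
      // Hypothetical: "phoneme" is not a name value the current spec
      // defines; only word and sentence boundaries exist today.
      if (event.name === 'phoneme') {
        // An engine that exposes phoneme timing could report it here,
        // which would be enough to drive lip-syncing.
        console.log('phoneme boundary at', event.elapsedTime);
      }
    };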

- Dominic



On Thu, Oct 1, 2015 at 12:22 AM Aaron Brewer <spaceribs@gmail.com> wrote:

> Hi Everyone,
>
> I played around with the speech synthesis API implementation in Chrome
> last night; it’s pretty great stuff. The lack of SSML support is
> troubling and I hope that gets resolved soon, and getVoices() seems to
> be handled differently in every browser I’ve tested, but it’s getting
> there.
>
> I wanted to put in a request for a new event called “onphoneme” and/or
> “onword”. It would be excellent to have this feature for lip-syncing,
> as hacking “onboundary” can only get you so far. The data structure
> passed for these events could be modeled on phonemenon (
> https://github.com/jimkang/phonemenon), which gives the phoneme or
> array of phonemes and their stress levels. I know we’re still in the
> early days of this API, but I feel like having an “onboundary” event
> and not having an event for when words themselves are spoken is
> counterintuitive.
>
> - Aaron
>

Received on Thursday, 1 October 2015 19:06:10 UTC