- From: Dominic Mazzoni <dmazzoni@google.com>
- Date: Thu, 01 Oct 2015 19:05:32 +0000
- To: Aaron Brewer <spaceribs@gmail.com>, public-speech-api@w3.org
- Message-ID: <CAFz-FYyiAZRhEMJh9qGFLCRYNqPVeMSwFo+9qrtU_0+FJ7n+_Q@mail.gmail.com>
Aaron,

The web speech API is implementation-independent. Each speech engine has its own concept of a phoneme, and many of them don't actually expose any information about phoneme boundaries or phoneme events. Similarly for "word": each engine breaks up text into words differently; for example, some engines consider a hyphenated word like implementation-independent to be two words, while others consider it to be one word. Also, some fire events at the start of a word, some at the end of a word, and some at both. In order to abstract over those differences, the web speech API gives you a single boundary event. It makes sense to support boundary events for phonemes when the speech engine supports it.

- Dominic

On Thu, Oct 1, 2015 at 12:22 AM Aaron Brewer <spaceribs@gmail.com> wrote:

> Hi Everyone,
>
> I played around with the speech synthesis API implementation in Chrome last night; it’s pretty great stuff. The lack of SSML support is troubling, and I hope that gets resolved soon, and it seems like getVoices() is handled differently in every browser I’ve tested, but it’s getting there.
>
> I wanted to put in a request for a new event called “onphoneme” and/or “onword”. It would be excellent to have this feature for lip-syncing, as hacking “onboundary” can only get you so far. The data structure I imagine would be passed for these events could be modeled on phonemenon (https://github.com/jimkang/phonemenon), which gives the phoneme or array of phonemes and their stress level. I know we’re still in the early days of this API, but I feel like having an “onboundary” event and not having an event for when words themselves are spoken is counterintuitive.
>
> - Aaron
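For context, a minimal sketch (browser TypeScript) of how the single boundary event discussed above is consumed today, and where a phoneme-level boundary of the kind Aaron requests might slot in. Only onboundary, the "word"/"sentence" name values, charIndex, getVoices(), and the voiceschanged event exist in the current API; the "phoneme" branch is purely an illustration of the proposal, not something any engine exposes.

```typescript
// Sketch only: consuming today's boundary event, plus a hypothetical
// phoneme-level branch illustrating Aaron's request (not in the spec).

const utterance = new SpeechSynthesisUtterance(
  "The web speech API is implementation-independent."
);

// Engines differ on whether they fire at word start, word end, or both,
// so consumers key off the event's name and character offset rather than
// assuming a fixed cadence.
utterance.onboundary = (event: SpeechSynthesisEvent) => {
  if (event.name === "word") {
    // charIndex points into the utterance text at the boundary.
    console.log(`word boundary at char ${event.charIndex}`);
  } else if (event.name === "sentence") {
    console.log(`sentence boundary at char ${event.charIndex}`);
  }
  // Hypothetical: if phoneme boundaries were ever exposed, a lip-sync
  // consumer could branch on a new name value here (e.g. "phoneme").
};

// getVoices() is populated asynchronously in some browsers (notably Chrome),
// so listening for "voiceschanged" is the portable way to enumerate voices.
speechSynthesis.addEventListener("voiceschanged", () => {
  console.log(speechSynthesis.getVoices().map((v) => v.name));
});

speechSynthesis.speak(utterance);
```

This also shows the abstraction Dominic describes: the consumer sees one boundary event and inspects its name, rather than relying on engine-specific word or phoneme callbacks.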
Received on Thursday, 1 October 2015 19:06:10 UTC