- From: Charles Pritchard <chuck@jumis.com>
- Date: Wed, 25 Apr 2012 12:35:00 -0700
- To: Dominic Mazzoni <dmazzoni@google.com>
- CC: Jim Barnett <Jim.Barnett@genesyslab.com>, Hans Wennborg <hwennborg@google.com>, public-speech-api@w3.org
- Message-ID: <4F9851E4.1080007@jumis.com>
On 4/25/2012 7:57 AM, Dominic Mazzoni wrote:
> On Wed, Apr 25, 2012 at 7:32 AM, Jim Barnett
> <Jim.Barnett@genesyslab.com> wrote:
>
> > I could imagine a situation in which a page invoked multiple
> > distinct TTS engines (expertise in different languages being one
> > common use case), so I wouldn't want the TTS object to be unique,
> > but I think it would make sense to have a single TTS object for
> > each engine and then have a method like 'addUtterance' with the
> > kind of behavior that Dominic mentioned (queue vs abort, plus the
> > possibility for different voices/parameters for each utterance.)
>
> I agree about wanting to use multiple engines, but why not just make
> that a parameter? Unless you wanted two engines talking *at the same
> time*, I don't see any reason you need a separate instance per engine.
>
> I can see it working where there's a single global TTS object and
> everything is done via method calls. That's what we did for the Chrome
> TTS extension API. I can also see it working to create one object per
> utterance, because a typed JavaScript object is a convenient container
> for state. But somewhere in-between (multiple TTS objects per engine,
> but not one object per utterance) seems overcomplicated.

I'd like to pursue this from a different perspective.
Let's think about speakers (as agents) instead of "text-to-speech".

    var a = new Speaker({id: 1, title: 'Alice'});
    var b = new Speaker({id: 1, title: 'Bob'});
    var c = new Speaker({id: 2, title: 'Chuck'});

Now we've got utterance groups, and titles.

In many cases, the speaker won't have a name or group, because it's just
a simple notification service between the app and the user. But, we've
got the other side of things, where it could be a chat room or a complex
service.

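To make the shape of such an object concrete, here is a minimal sketch, assuming a plain in-memory queue in place of a real speech engine. Nothing here is an existing browser API: the `Speaker` class body, the shared `utteranceQueue`, the `speak` and `onword` members, and the `drain` helper are all hypothetical names chosen for illustration.

```javascript
// Hypothetical sketch, not a real browser API.
// A shared array stands in for the speech engine's queue.
const utteranceQueue = [];

class Speaker {
  constructor({ id, title } = {}) {
    this.id = id;       // utterance group (assumed meaning of `id`)
    this.title = title; // display name, e.g. 'Alice'
    this.onword = null; // callback lives on the object, not in an argument
  }

  // Queue an utterance; per-utterance options (e.g. lang) travel with it.
  speak(text, options = {}) {
    utteranceQueue.push({ speaker: this, text, options });
  }
}

// Drain the queue in order, firing the speaker's onword once per word.
// Returns a transcript so the result can be inspected without audio.
function drain() {
  const spoken = [];
  while (utteranceQueue.length > 0) {
    const { speaker, text } = utteranceQueue.shift();
    for (const word of text.trim().split(/\s+/)) {
      spoken.push(`${speaker.title}: ${word}`);
      if (typeof speaker.onword === 'function') {
        speaker.onword({ data: word });
      }
    }
  }
  return spoken;
}
```

With this sketch, two speakers in the same group simply share the one queue; interruption and per-group queues are deliberately left out, since those semantics are still open.
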
Consider the following pseudo-code:

    a.speak("Hello everyone");
    b.speak("Hi!");
    c.speak("Hello");
    a.speak("And");
    a.speak("bonjour", {lang: 'fr'});
    a.speak("to our audience");

    // Bob cuts in as soon as Alice reaches the word "audience".
    a.onword = function(e) {
      if (e.data === 'audience')
        b.speak("Perhaps you mean captives", {instant: true});
    };

    c.speak("I can not speak because you cut my mic.");

    // Chuck drops the rest of his own utterance at the word "because".
    c.onword = function(e) {
      if (e.data === 'because') c.clear();
    };

Chuck can't interrupt anyone; Bob interrupts Alice while she's speaking
the word "audience".

The "queue" concept is still a little flaky in this example.

Having multiple objects is closer to the evolution of other APIs.
The BlobBuilder API has been moved over to an array literal-based "Blob"
object instantiation. So that's where I base some of the concept.
Callbacks on the object are much preferred over callbacks passed via
argument.

-Charles
Received on Wednesday, 25 April 2012 19:35:25 UTC