- From: Robert Brown <Robert.Brown@microsoft.com>
- Date: Wed, 14 Sep 2011 01:01:26 +0000
- To: Satish S <satish@google.com>, "olli@pettay.fi" <olli@pettay.fi>, "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
- Message-ID: <113BCF28740AF44989BE7D3F84AE18DD1B2565E4@TK5EX14MBXC112.redmond.corp.microsoft.>
I'm also struggling with sections 3 & 4 - SpeechService and SpeechServiceQuery. Sorry for not chiming in earlier. While I can see the direction this is going, it just feels way too complicated to me. I think it will take a lot more work to iron out the details, and in the end it still won't make for an API that's easy to use. Personally I'd prefer to take a simplified approach. Something like this...

Firstly, make it as easy as possible to use the built-in speech capabilities of a UA just by creating the SpeechInputRequest and SpeechOutputRequest objects, without any messing about with services and criteria and queries. Something like this:

function simplestCase() {
  // Just give me the default recognizer and synthesizer:
  simplestSR = new SpeechInputRequest();
  simplestTTS = new SpeechOutputRequest();
}

Secondly, for cases where the UA has access to a variety of different speech engines, rather than create a Query API and a Criteria API, just provide mandatory parameters and optional parameters as strings in the constructors for SpeechInputRequest and SpeechOutputRequest. The constructor pattern would be something like this:

[Constructor(DOMString? mandatoryparams, optional DOMString? optionalparams)]

The usage would be something like this (a sketch of how a UA might parse these parameter strings appears at the end of this message):

function aLittleBitFussy() {
  // Give me a recognizer for Australian or British English,
  // with grammars for dictation and datetime.
  // It should preferably model a child's vocal tract, but doesn't need to.
  fussySR = new SpeechInputRequest(
    "language=en-AU|en-GB;grammars=<builtin:dictation>,<builtin:datetime>",
    "age=child");

  // Give me a synthesizer. It must be Swedish.
  // If the voice named "Kiana" is installed, please use it.
  // Otherwise, I'd prefer a voice that at least sounds like a woman
  // in her thirties, if you have one.
  fussyTTS = new SpeechOutputRequest(
    "language=sv-SE",
    "name=Kiana;gender=female;age=30-40");
}

Thirdly, only use a SpeechService object for actual services that aren't built in to the UA. In this case we should model existing WebSockets and XHR patterns to initialize the service, and then use the service object as a parameter to the constructors for SpeechInputRequest and SpeechOutputRequest. And drop the Query object entirely. Usage would be something like this:

var ssvc;

function initService() {
  // Open a new service.
  ssvc = new SpeechService("https://myspeechservice/?account=a84e-2198-4e60-00f3");
  ssvc.onopen = function () {
    // Check that it has the characteristics we expected...
    // Will it recognize en-AU or en-GB, and speak Swedish?
    if ((ssvc.getSupportedLanguages("recognition", "en-AU,en-GB") == '')
        || (ssvc.getSupportedLanguages("synthesis", "sv-SE") == '')
        // Does it have the right grammars?
        || (ssvc.getSupportedGrammars("<builtin:dictation>,<builtin:us-cities>") == '')) {
      // No? Okay, close it - we don't want it.
      ssvc.close();
      ssvc.onclose = function () { ssvc = null; };
      return;
    }
  };

  // Get SR and TTS request objects using the service:
  serviceSR = new SpeechInputRequest(ssvc);
  serviceTTS = new SpeechOutputRequest(ssvc);
}
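For what it's worth, here's a rough sketch of how a UA might parse those parameter strings, assuming the format the examples above imply: ";" separates key=value pairs, "|" separates alternative values, and "," separates items in a list. The function name and the result shape are purely illustrative, not part of the proposal:

function parseSpeechParams(params) {
  var result = {};
  if (!params) return result;
  var pairs = params.split(';');
  for (var i = 0; i < pairs.length; i++) {
    var eq = pairs[i].indexOf('=');
    if (eq < 0) continue; // ignore malformed pairs
    var key = pairs[i].substring(0, eq);
    var value = pairs[i].substring(eq + 1);
    if (value.indexOf('|') >= 0) {
      // "language=en-AU|en-GB" - any one of these is acceptable.
      result[key] = { anyOf: value.split('|') };
    } else if (value.indexOf(',') >= 0) {
      // "grammars=<builtin:dictation>,<builtin:datetime>" - all of these.
      result[key] = value.split(',');
    } else {
      result[key] = value;
    }
  }
  return result;
}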
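One more thought: if we model WebSockets, the service object would presumably also fire onerror. That's an assumption on my part - it isn't defined anywhere above - but it would let a page fall back to the UA's built-in engines when the remote service can't be reached:

function initServiceWithFallback() {
  var svc = new SpeechService("https://myspeechservice/?account=a84e-2198-4e60-00f3");
  svc.onopen = function () {
    // The remote service is up; use it for both requests.
    serviceSR = new SpeechInputRequest(svc);
    serviceTTS = new SpeechOutputRequest(svc);
  };
  svc.onerror = function () {
    // onerror is assumed here, mirroring WebSocket.
    // Couldn't reach the service; fall back to the built-in engines.
    serviceSR = new SpeechInputRequest();
    serviceTTS = new SpeechOutputRequest();
  };
}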
Received on Wednesday, 14 September 2011 01:01:58 UTC