- From: Robert Brown <Robert.Brown@microsoft.com>
- Date: Thu, 15 Sep 2011 00:34:01 +0000
- To: Satish S <satish@google.com>, Bjorn Bringert <bringert@google.com>
- CC: "olli@pettay.fi" <olli@pettay.fi>, "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
- Message-ID: <113BCF28740AF44989BE7D3F84AE18DD1B257858@TK5EX14MBXC112.redmond.corp.microsoft.>
>> how does the app find out whether the SpeechInput/OutputRequest is usable (speech service exists and supports the mandatory parameters)

Good question. The constructor could throw an exception if it can't fulfill the mandatory parameters?

try {
    fussySR = new SpeechInputRequest(someparams);
} catch (e) {
    alert("don't talk to me, I'm not listening because " + e.message);
}

If we like this, perhaps we could also define some error codes. E.g.

try {
    fussySR = new SpeechInputRequest(someparams);
} catch (e) {
    switch (e.number) {
        case 100: // unsupported language code
            alert("sorry, I don't speak your language");
            break;
        case 200: // unsupported grammar
            alert("sorry, I don't understand what this app wants me to listen for");
            break;
        default:
            alert("don't talk to me, I'm not listening because " + e.description);
    }
}

>> Not sure about the use of structured strings for parameters
>> turn all the grammar/language strings into array parameters and array properties in true JS form.

Yeah, the strings are ugly. I like the array idea. We just need to find a way to express optional and mandatory params for different things without producing a constructor with a zillion parameters, i.e. not this:

[Constructor(DOMString[] mandatorylangs,
             DOMString[] optionallangs,
             DOMString[] mandatorygrammars,
             DOMString[] optionalgrammars,
             DOMString[] blahblah,
             DOMString[] optionalyadayada)]

From: public-xg-htmlspeech-request@w3.org [mailto:public-xg-htmlspeech-request@w3.org] On Behalf Of Satish S
Sent: Wednesday, September 14, 2011 1:53 AM
To: Robert Brown
Cc: olli@pettay.fi; public-xg-htmlspeech@w3.org
Subject: Re: A few high level thoughts about the web api sections 3 and 4.

Thanks Robert, that does look simpler. One change I'd suggest is to turn all the grammar/language strings into array parameters and array properties, in true JS form. So:

function aLittleBitFussy() {
    // Give me a recognizer for Australian or British English,
    // with grammars for dictation and datetime.
    // It should preferably model a child's vocal tract, but doesn't need to.
    fussySR = new SpeechInputRequest(["en-AU", "en-GB"],
                                     ["<builtin:dictation>", "<builtin:datetime>"],
                                     ["age=child"]);

    // Give me a synthesizer. It must be Swedish.
    // If the voice named "Kiana" is installed, please use it.
    // Otherwise, I'd prefer a voice that at least sounds like a woman in her thirties, if you have one.
    fussyTTS = new SpeechOutputRequest("sv-SE", false, 35, ["name=Kiana"]);
}

------

// will it recognize en-AU or en-GB, and speak Swedish?
if (ssvc.recognitionLanguages.indexOf("en-AU") == -1 ||
    ssvc.recognitionLanguages.indexOf("en-GB") == -1 ||
    ssvc.synthesisLanguages.indexOf("sv-SE") == -1 ||
    // does it have the right grammars?
    ssvc.grammars.indexOf("<builtin:dictation>") == -1 ||
    ssvc.grammars.indexOf("<builtin:us-cities>") == -1) {
    // no? okay, close it - we don't want it.
    ...
}

Cheers
Satish

On Wed, Sep 14, 2011 at 2:01 AM, Robert Brown <Robert.Brown@microsoft.com> wrote:

I'm also struggling with sections 3 & 4 - SpeechService and SpeechServiceQuery. Sorry for not chiming in earlier.

While I can see the direction this is going, it just feels way too complicated to me. I think it will be a lot more work to iron out the details, but in the end it won't make for an API that's easy to use.

Personally I'd prefer to take a simplified approach. Something like this...
Firstly, make it as easy as possible to use the built-in speech capabilities of a UA just by creating the SpeechInputRequest and SpeechOutputRequest objects, without any messing about with services and criteria and queries. Something like this:

function simplestCase() {
    // just give me the default recognizer and synthesizer:
    simplestSR = new SpeechInputRequest();
    simplestTTS = new SpeechOutputRequest();
}

Secondly, for cases where the UA has access to a variety of different speech engines, rather than create a Query API and a Criteria API, just provide mandatory parameters and optional parameters as strings in the constructors for SpeechInputRequest and SpeechOutputRequest. The constructor pattern would be something like this:

[Constructor(DOMString? mandatoryparams, optional DOMString? optionalparams)]

The usage would be something like this:

function aLittleBitFussy() {
    // Give me a recognizer for Australian or British English,
    // with grammars for dictation and datetime.
    // It should preferably model a child's vocal tract, but doesn't need to.
    fussySR = new SpeechInputRequest("language=en-AU|en-GB;grammars=<builtin:dictation>,<builtin:datetime>",
                                     "age=child");

    // Give me a synthesizer. It must be Swedish.
    // If the voice named "Kiana" is installed, please use it.
    // Otherwise, I'd prefer a voice that at least sounds like a woman in her thirties, if you have one.
    fussyTTS = new SpeechOutputRequest("language=sv-SE", "name=Kiana;gender=female;age=30-40");
}

Thirdly, only use a SpeechService object for actual services that aren't built into the UA. In this case we should model existing WebSockets and XHR patterns to initialize the service, and then use the service object as a parameter to the constructors for SpeechInputRequest and SpeechOutputRequest. And drop the Query object entirely. Usage would be something like this:

var ssvc;

function initService() {
    // open a new service
    ssvc = new SpeechService("https://myspeechservice/?account=a84e-2198-4e60-00f3");

    ssvc.onopen = function () {
        // check that it has the characteristics we expected...
        // will it recognize en-AU or en-GB, and speak Swedish?
        if ((ssvc.getSupportedLanguages("recognition", "en-AU,en-GB") == '') ||
            (ssvc.getSupportedLanguages("synthesis", "sv-SE") == '') ||
            // does it have the right grammars?
            (ssvc.getSupportedGrammars("<builtin:dictation>,<builtin:us-cities>") == '')) {
            // no? okay, close it - we don't want it
            ssvc.close();
            ssvc.onclose = function () { ssvc = null; };
            return;
        }
    };

    // get SR and TTS request objects using the service:
    serviceSR = new SpeechInputRequest(ssvc);
    serviceTTS = new SpeechOutputRequest(ssvc);
}
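
Purely as an illustration of how the mandatory/optional split discussed earlier in this thread could avoid a long positional constructor (this is not something either message proposes), the criteria could also be grouped into a single settings object. The object layout and the "required"/"preferred" property names below are hypothetical:

// Hypothetical sketch only: assumes SpeechInputRequest could accept one settings
// object, with "required" entries that must be satisfied (otherwise the
// constructor throws) and "preferred" entries that are best-effort hints.
function fussyWithSettingsObject() {
    try {
        var fussySR = new SpeechInputRequest({
            required: {
                languages: ["en-AU", "en-GB"],
                grammars: ["<builtin:dictation>", "<builtin:datetime>"]
            },
            preferred: {
                age: "child"
            }
        });
    } catch (e) {
        // same error-handling pattern as the try/catch examples earlier in the thread
        alert("don't talk to me, I'm not listening because " + e.message);
    }
}

A shape like this would keep the constructor signature stable as more criteria are added, at the cost of having to specify which property names a UA must recognize.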
Received on Thursday, 15 September 2011 00:34:34 UTC