- From: Robert Brown <Robert.Brown@microsoft.com>
- Date: Wed, 14 Sep 2011 01:01:26 +0000
- To: Satish S <satish@google.com>, "olli@pettay.fi" <olli@pettay.fi>, "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
- Message-ID: <113BCF28740AF44989BE7D3F84AE18DD1B2565E4@TK5EX14MBXC112.redmond.corp.microsoft.>
I'm also struggling with sections 3 & 4 - SpeechService and SpeechServiceQuery.
Sorry for not chiming in earlier. While I can see the direction this is going, it just feels way too complicated to me. I think it will take a lot more work to iron out the details, and in the end it still won't make for an API that's easy to use.
Personally I'd prefer to take a simplified approach. Something like this...
Firstly, make it as easy as possible to use the built-in speech capabilities of a UA just by creating the SpeechInputRequest and SpeechOutputRequest objects, without any messing about with services and criteria and queries. Something like this:
function simplestCase() {
    // Just give me the default recognizer and synthesizer:
    var simplestSR = new SpeechInputRequest();
    var simplestTTS = new SpeechOutputRequest();
}
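Just to show how little ceremony that would involve, here's how a page might wire the two together. To be clear, onresult, start(), play() and transcript below are purely hypothetical placeholders to make the example concrete; we haven't settled those surfaces yet:
function simplestUsage() {
    var sr = new SpeechInputRequest();
    var tts = new SpeechOutputRequest();
    // Hypothetical onresult event and play() method, for illustration only:
    sr.onresult = function (e) {
        // Speak the top recognition hypothesis back to the user.
        tts.play(e.result.transcript); // transcript is also hypothetical
    };
    sr.start(); // hypothetical start() method
}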
Secondly, for cases where the UA has access to a variety of different speech engines, rather than creating a Query API and a Criteria API, just provide mandatory parameters and optional parameters as strings in the constructors for SpeechInputRequest and SpeechOutputRequest.
The constructor pattern would be something like this:
[Constructor(DOMString? mandatoryparams, optional DOMString? optionalparams)]
The usage would be something like this:
function aLittleBitFussy() {
    // Give me a recognizer for Australian or British English,
    // with grammars for dictation and datetime.
    // It should preferably model a child's vocal tract, but doesn't need to.
    var fussySR = new SpeechInputRequest(
        "language=en-AU|en-GB;grammars=<builtin:dictation>,<builtin:datetime>",
        "age=child");

    // Give me a synthesizer. It must be Swedish.
    // If the voice named "Kiana" is installed, please use it.
    // Otherwise, I'd prefer a voice that at least sounds like a woman in her thirties, if you have one.
    var fussyTTS = new SpeechOutputRequest(
        "language=sv-SE",
        "name=Kiana;gender=female;age=30-40");
}
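The string format I have in mind is simple enough to describe in a few lines of code: semicolon-separated key=value pairs, with '|' separating acceptable alternatives and ',' separating list items. A sketch of how a UA might parse it (not normative, and parseSpeechParams is just a name I made up for illustration):
// Parses "language=en-AU|en-GB;grammars=<builtin:dictation>,<builtin:datetime>"
// into { language: ["en-AU", "en-GB"],
//        grammars: ["<builtin:dictation>", "<builtin:datetime>"] }.
function parseSpeechParams(params) {
    var result = {};
    if (!params) return result;
    var pairs = params.split(';');
    for (var i = 0; i < pairs.length; i++) {
        var eq = pairs[i].indexOf('=');
        var key = pairs[i].substring(0, eq);
        var value = pairs[i].substring(eq + 1);
        // '|' marks alternatives; otherwise ',' marks list items.
        result[key] = value.split(value.indexOf('|') >= 0 ? '|' : ',');
    }
    return result;
}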
Thirdly, only use a SpeechService object for actual services that aren't built into the UA. In that case we should model the existing WebSocket and XHR patterns to initialize the service, and then pass the service object as a parameter to the constructors for SpeechInputRequest and SpeechOutputRequest. And drop the Query object entirely.
Usage would be something like this:
var ssvc, serviceSR, serviceTTS;

function initService() {
    // Open a new service:
    ssvc = new SpeechService("https://myspeechservice/?account=a84e-2198-4e60-00f3");
    ssvc.onopen = function () {
        // Check that it has the characteristics we expected...
        // Will it recognize en-AU or en-GB, and speak Swedish?
        if ((ssvc.getSupportedLanguages("recognition", "en-AU,en-GB") === '')
            || (ssvc.getSupportedLanguages("synthesis", "sv-SE") === '')
            // Does it have the right grammars?
            || (ssvc.getSupportedGrammars("<builtin:dictation>,<builtin:us-cities>") === '')) {
            // No? Okay, close it - we don't want it.
            ssvc.onclose = function () {
                ssvc = null;
            };
            ssvc.close();
            return;
        }
        // It checks out, so get SR and TTS request objects using the service:
        serviceSR = new SpeechInputRequest(ssvc);
        serviceTTS = new SpeechOutputRequest(ssvc);
    };
}
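And to carry the WebSocket analogy one step further: if the connection fails, a hypothetical onerror event (my assumption by analogy with WebSocket, not anything we've specified) would let the page fall back to the built-in engines:
    // Hypothetical, by analogy with WebSocket's onerror - fall back to the
    // UA's built-in engines if the remote service can't be reached:
    ssvc.onerror = function () {
        ssvc = null;
        serviceSR = new SpeechInputRequest();
        serviceTTS = new SpeechOutputRequest();
    };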
Received on Wednesday, 14 September 2011 01:01:58 UTC