
Re: A few high level thoughts about the web api sections 3 and 4.

From: Satish S <satish@google.com>
Date: Wed, 14 Sep 2011 09:52:45 +0100
Message-ID: <CAHZf7Rky25-qQtMGcz+Ju+BT9erLUtRDjDhnQH15LQyPAorgaw@mail.gmail.com>
To: Robert Brown <Robert.Brown@microsoft.com>
Cc: "olli@pettay.fi" <olli@pettay.fi>, "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
Thanks Robert, that does look simpler. One change I'd suggest is to turn all
the grammar/language strings into array parameters and array properties in
true JS form. So

function aLittleBitFussy() {
    // Give me a recognizer for Australian or British English,
    // with grammars for dictation and datetime.
    // It should preferably model a child's vocal tract, but doesn't need to.
    fussySR = new SpeechInputRequest(["en-AU", "en-GB"],
        ["<builtin:dictation>", "<builtin:datetime>"], ["age=child"]);

    // Give me a synthesizer. It must be Swedish.
    // If the voice named "Kiana" is installed, please use it.
    // Otherwise, I'd prefer a voice that at least sounds like a woman in
    // her thirties, if you have one.
    fussyTTS = new SpeechOutputRequest("sv-SE", false, 35, ["name=Kiana"]);
}
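
To illustrate what the array form saves, here is a rough sketch of the parsing a page or UA would otherwise need for the string form quoted below. The helper name parseParams and its separator rules (";" between keys, "|" or "," between values) are assumptions inferred from the quoted examples, not part of any proposal.

```javascript
// Hypothetical helper, not part of any proposed API: splits a string such as
// "key=v1|v2;other=v3,v4" into an object that maps each key to an array.
function parseParams(params) {
  const result = {};
  for (const pair of params.split(";")) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue; // skip empty or malformed entries
    const key = pair.slice(0, eq).trim();
    // Assumption: both "|" and "," separate alternative values.
    const values = pair
      .slice(eq + 1)
      .split(/[|,]/)
      .map((v) => v.trim())
      .filter((v) => v !== "");
    result[key] = values;
  }
  return result;
}

const parsed = parseParams(
  "language=en-AU|en-GB;grammars=<builtin:dictation>,<builtin:datetime>");
// parsed.language and parsed.grammars are now plain arrays, which is the
// shape the array-based constructor would accept directly.
```

With array parameters none of this is needed, and there is no ambiguity about which characters may appear inside a grammar URI or a language tag.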

------

// will it recognize en-AU or en-GB, and speak Swedish?
if ((ssvc.recognitionLanguages.indexOf("en-AU") == -1 &&
     ssvc.recognitionLanguages.indexOf("en-GB") == -1) ||
    ssvc.synthesisLanguages.indexOf("sv-SE") == -1 ||
    // does it have the right grammars?
    ssvc.grammars.indexOf("<builtin:dictation>") == -1 ||
    ssvc.grammars.indexOf("<builtin:us-cities>") == -1) {
  // no? okay, close it - we don't want it.
  ...
}
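
A minimal sketch of the same check with the "any of" and "all of" cases split into helpers, so the intent of each test is explicit. The helper names (supportsAny, supportsAll, isAcceptable) and the mockService object are made up for illustration; the property names are the array properties suggested above, not a specified API.

```javascript
// True if at least one wanted value appears in the supported list.
function supportsAny(supported, wanted) {
  return wanted.some((w) => supported.indexOf(w) !== -1);
}

// True only if every wanted value appears in the supported list.
function supportsAll(supported, wanted) {
  return wanted.every((w) => supported.indexOf(w) !== -1);
}

// Stand-in for a service object; the values are invented for this example.
const mockService = {
  recognitionLanguages: ["en-GB", "en-US"],
  synthesisLanguages: ["sv-SE"],
  grammars: ["<builtin:dictation>", "<builtin:us-cities>"],
};

// Accept the service if it recognizes en-AU or en-GB, speaks Swedish,
// and offers both grammars.
function isAcceptable(svc) {
  return supportsAny(svc.recognitionLanguages, ["en-AU", "en-GB"]) &&
         supportsAll(svc.synthesisLanguages, ["sv-SE"]) &&
         supportsAll(svc.grammars,
                     ["<builtin:dictation>", "<builtin:us-cities>"]);
}

const acceptable = isAcceptable(mockService);
```

A page would close the service when isAcceptable returns false.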


Cheers
Satish


On Wed, Sep 14, 2011 at 2:01 AM, Robert Brown <Robert.Brown@microsoft.com> wrote:

> I'm also struggling with sections 3 & 4 - SpeechService and
> SpeechServiceQuery.
>
> Sorry for not chiming in earlier. While I can see the direction this is
> going, it just feels way too complicated to me. I think it will be a lot
> more work to iron out the details, but in the end won't make for an API
> that's easy to use.
>
> Personally I'd prefer to take a simplified approach. Something like this...
>
> Firstly, make it as easy as possible to use the built-in speech
> capabilities of a UA just by creating the SpeechInputRequest and
> SpeechOutputRequest objects, without any messing about with services and
> criteria and queries. Something like this:
>
>     function simplestCase() {
>         // just give me the default recognizer and synthesizer:
>         simplestSR = new SpeechInputRequest();
>         simplestTTS = new SpeechOutputRequest();
>     }
>
> Secondly, for cases where the UA has access to a variety of different speech
> engines, rather than create a Query API and a Criteria API, just provide
> mandatory parameters and optional parameters as strings in the constructors
> for SpeechInputRequest and SpeechOutputRequest.
>
> The constructor pattern would be something like this:
>
> [Constructor(DOMString? mandatoryparams, optional DOMString? optionalparams)]
>
> The usage would be something like this:
>
>     function aLittleBitFussy() {
>         // Give me a recognizer for Australian or British English,
>         // with grammars for dictation and datetime.
>         // It should preferably model a child's vocal tract, but doesn't
>         // need to.
>         fussySR = new SpeechInputRequest(
>             "language=en-AU|en-GB;grammars=<builtin:dictation>,<builtin:datetime>",
>             "age=child");
>
>         // Give me a synthesizer. It must be Swedish.
>         // If the voice named "Kiana" is installed, please use it.
>         // Otherwise, I'd prefer a voice that at least sounds like a woman
>         // in her thirties, if you have one.
>         fussyTTS = new SpeechOutputRequest("language=sv-SE",
>             "name=Kiana;gender=female;age=30-40");
>     }
>
> Thirdly, only use a SpeechService object for actual services that aren't
> built in to the UA. In this case we should model existing WebSockets and XHR
> patterns to initialize the service, and then use the service object as a
> parameter to the constructors for SpeechInputRequest and
> SpeechOutputRequest. And drop the Query object entirely.
>
> Usage would be something like this:
>
>     var ssvc;
>
>     function initService() {
>         // open a new service
>         ssvc = new SpeechService(
>             "https://myspeechservice/?account=a84e-2198-4e60-00f3");
>
>         ssvc.onopen = function () {
>             // check that it has the characteristics we expected...
>
>             // will it recognize en-AU or en-GB, and speak Swedish?
>             if ((ssvc.getSupportedLanguages("recognition", "en-AU,en-GB") == '')
>             || (ssvc.getSupportedLanguages("synthesis", "sv-SE") == '')
>             // does it have the right grammars?
>             || (ssvc.getSupportedGrammars(
>                     "<builtin:dictation>,<builtin:us-cities>") == '')) {
>                 // no? okay, close it - we don't want it
>                 ssvc.close();
>                 ssvc.onclose = function () {
>                     ssvc = null;
>                 };
>                 return;
>             }
>         };
>
>         // get SR and TTS request objects using the service:
>         serviceSR = new SpeechInputRequest(ssvc);
>         serviceTTS = new SpeechOutputRequest(ssvc);
>     }
>
Received on Wednesday, 14 September 2011 08:53:12 GMT
