- From: Satish S <satish@google.com>
- Date: Wed, 21 Sep 2011 00:15:30 +0100
- To: public-xg-htmlspeech@w3.org
- Message-ID: <CAHZf7R=aj7jHByWUbPMgq9RDp-0edUSLm_AL1vP+CLP9ZfWbsw@mail.gmail.com>
As discussed in last week's call, I have drafted an IDL for SpeechInputRequest and some examples. Please review them below.

Some key differences from what was discussed:

1. Since we wanted .start() to automatically call .init() if that has not been done already, the .init() call must not take any parameters, so that it can be invoked behind the scenes.

2. We talked about similarities with XHR, so I looked up the latest XHR2 draft. There is a preference to do away with a single onreadystatechange-style callback and split it into separate events. This also matches the above requirement of a .init() call with no parameters, so all event handlers are attributes in the IDL.

3. I added a 'continuous' boolean attribute, as it seemed to be missing from the draft doc and there was no way specified to request one-shot or continuous recognition.

4. I added a 'filterOffensiveWords' boolean attribute, as it came across as a necessary feature in real-world applications (when we tested voice search on the Google homepage).

Some questions:

1. The 'saveWaveformURI' and 'inputWaveformURI' attributes are a bit troubling. They will probably require us to specify which codecs to support, whether the UA should transcode when the input waveform doesn't match what the speech service accepts, same-origin policies, and so on. Given the few weeks we have remaining, is this a strong use case for us to look into, or can we remove it?

2. The 'saveForRereco' usage and API are unclear at the moment. Has anyone given more thought to it?

IDL:

    [Constructor]
    interface SpeechInputRequest : EventTarget {
        // attributes related to connection with speech service
        attribute DOMString uri;
        attribute DOMString saveWaveformURI;   // usage? codecs and same origin policies?
        attribute DOMString inputWaveformURI;  // again, codecs? should UA reencode?
        attribute MediaStream input;

        // attributes related to speech reco
        attribute DOMString[] languages;
        attribute DOMString[] grammars;
        attribute DOMStringMap customParameters;
        attribute int maxNBest;
        attribute boolean continuous;            // was missing earlier?
        attribute boolean filterOffensiveWords;  // added new, useful in real world context
        attribute boolean saveForRereco;         // usage?
        attribute boolean localEndpointer;       // renamed from the earlier 'setendpointdetection' method
        attribute boolean finalizeBeforeEnd;
        attribute boolean interimResults;
        attribute int interimResultsFreq;
        attribute float confidenceThreshold;
        attribute float sensitivity;
        attribute float speedVersusAccuracy;
        attribute int completeTimeout;
        attribute int incompleteTimeout;
        attribute int maxSpeechTimeout;

        // methods
        void open();
        void start();
        void stop();
        void abort();

        // event handler IDL attributes
        attribute Function onopen;
        attribute Function onstart;
        attribute Function onend;
        attribute Function onresult;
        attribute Function onnomatch;
        attribute Function onerror;
        attribute Function onaudiostart;
        attribute Function onsoundstart;
        attribute Function onspeechstart;
        attribute Function onspeechend;
        attribute Function onsoundend;
        attribute Function onaudioend;
    };

And a couple of examples, adapted from Robert's examples earlier:

Example 1:

    function simplestCase() {
        // Just give me the default recognizer.
        var req = new SpeechInputRequest();
        req.onresult = function(event) {
            // Do things with event.result
        };
        req.start();
    }

Example 2:

    function aComplexWebapp() {
        // Give me a recognizer for Australian or British English,
        // with grammars for dictation and datetime.
        // It should preferably model a child's vocal tract, but doesn't need to.
        var req = new SpeechInputRequest();
        req.languages = ['en-AU', 'en-GB'];
        req.grammars = ['<builtin:dictation>', '<builtin:datetime>'];
        req.customParameters['age'] = 'child';
        req.continuous = true;  // And I'm gonna listen forever...
        req.interimResults = true;
        req.interimResultsFreq = 1000;

        req.onresult = function(event) {
            // Do things with event.result
        };

        req.onstart = function(event) {
            $('#status').text("I'm listening.");
            // Stop listening after a minute.
            window.setTimeout(function() {
                $('#status').text('Thank you, please try again.');
                req.abort();
                req = null;
            }, 60000);
        };

        req.onopen = function(event) {
            req.start();
        };

        req.onerror = function(event) {
            $('#status').text('Sorry, no dice.');
        };

        $('#status').text('Connecting and loading giant grammars...');
        req.open();
    }

Cheers
Satish
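
To illustrate point 1 above (start() performing the parameterless initialization behind the scenes when the page has not done so itself), here is a minimal sketch. It is not part of the proposal: MockSpeechInputRequest, its 'opened' flag, and the synchronous firing of the handlers are all assumptions made purely for illustration, and it assumes that the initialization step corresponds to the open() method in the IDL.

```javascript
// Hypothetical mock, for illustration only. The class name and the
// 'opened' flag are assumptions and do not appear in the draft IDL.
class MockSpeechInputRequest {
  constructor() {
    this.opened = false; // assumed internal state
    this.onopen = null;
    this.onstart = null;
  }

  // Takes no parameters, so start() can call it implicitly.
  open() {
    if (this.opened) return; // already initialized, nothing to do
    this.opened = true;
    if (this.onopen) this.onopen({ type: 'open' });
  }

  start() {
    this.open(); // no-op if the page already called open()
    if (this.onstart) this.onstart({ type: 'start' });
  }
}

// Usage: the page never calls open() explicitly.
const log = [];
const req = new MockSpeechInputRequest();
req.onopen = (event) => log.push(event.type);
req.onstart = (event) => log.push(event.type);
req.start();
console.log(log.join(',')); // prints "open,start"
```

Because every configuration value is an attribute rather than an argument, the implicit call needs no information beyond what is already set on the object.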
Received on Tuesday, 20 September 2011 23:16:04 UTC