
RE: Additional parameters to SpeechRecognition (was "Speech API: first editor's draft posted")

From: Young, Milan <Milan.Young@nuance.com>
Date: Wed, 25 Apr 2012 16:35:46 +0000
To: Hans Wennborg <hwennborg@google.com>
CC: Satish S <satish@google.com>, "public-speech-api@w3.org" <public-speech-api@w3.org>
Message-ID: <B236B24082A4094A85003E8FFB8DDC3C1A4568C3@SOM-EXCH04.nuance.com>
We may be able to model maxSpeechTimeout with an abort().  But that involves listening for the start of speech, setting a timer, faking a result, etc.  If we want to expedite the path to adoption, a parameter seems like the right way to proceed.
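For reference, a rough sketch of what the abort()-based emulation might look like in page script. RecognizerLike is a hypothetical stand-in for the draft's SpeechRecognition object, declared here only so the sketch is self-contained; emulateMaxSpeechTimeout, Scheduler, and the onTimeout callback (where the page would have to fake a result) are illustrative names, not part of any draft.

```typescript
// Hypothetical minimal recognizer shape (assumption: mirrors the draft's
// onspeechstart / onend handlers and abort() method).
interface RecognizerLike {
  onspeechstart: (() => void) | null;
  onend: (() => void) | null;
  abort(): void;
}

// Timer abstraction so the sketch can be exercised without real delays.
type Timer = { cancel(): void };
type Scheduler = (fn: () => void, ms: number) => Timer;

// Default scheduler backed by setTimeout.
const realScheduler: Scheduler = (fn, ms) => {
  const id = setTimeout(fn, ms);
  return { cancel: () => clearTimeout(id) };
};

// Emulate a maximum speech duration: start a timer when speech starts,
// abort the request if the timer fires before the session ends.
function emulateMaxSpeechTimeout(
  rec: RecognizerLike,
  maxSpeechMs: number,
  onTimeout: () => void,
  schedule: Scheduler = realScheduler
): void {
  let timer: Timer | null = null;

  rec.onspeechstart = () => {
    timer = schedule(() => {
      timer = null;
      rec.abort();   // stop the recognition request
      onTimeout();   // the page must now synthesize its own "result"
    }, maxSpeechMs);
  };

  rec.onend = () => {
    // Session finished normally: cancel the pending timeout, if any.
    if (timer !== null) {
      timer.cancel();
      timer = null;
    }
  };
}
```

Even in this small form the page has to track speech start, own a timer, and invent a result on timeout, which is the overhead a single parameter would avoid.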

The case for complete/incomplete timeout parameters is clear.  These control how much trailing silence the recognizer will use in determining the end of speech.  You can't perform either function without running DSPs in JS.  The complete timeout is further complicated by the need for a JS recognizer.  Probably best just to expose the parameters :)
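To make the semantics concrete, here is a minimal sketch of the end-of-speech decision, assuming the conventional meaning of the two timeouts: the complete timeout applies when the speech so far already forms a complete match, the incomplete timeout when it does not. SilenceTimeouts and shouldEndSpeech are hypothetical names, and both inputs (measured trailing silence, and whether the match is complete) are exactly the things a page cannot obtain without signal processing and a recognizer of its own.

```typescript
// Hypothetical parameter bag; names are illustrative only.
interface SilenceTimeouts {
  completeTimeoutMs: number;   // silence required after a complete match
  incompleteTimeoutMs: number; // silence required after an incomplete match
}

// Decide whether the trailing silence observed so far is long enough
// to declare end of speech.
function shouldEndSpeech(
  trailingSilenceMs: number,
  isCompleteMatch: boolean,
  timeouts: SilenceTimeouts
): boolean {
  const requiredMs = isCompleteMatch
    ? timeouts.completeTimeoutMs
    : timeouts.incompleteTimeoutMs;
  return trailingSilenceMs >= requiredMs;
}
```

With a short complete timeout and a longer incomplete timeout, a finished answer ends the turn quickly, while a speaker who pauses mid-sentence gets more time to continue.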

Thanks

-----Original Message-----
From: Hans Wennborg [mailto:hwennborg@google.com] 
Sent: Wednesday, April 25, 2012 6:43 AM
To: Young, Milan
Cc: Satish S; public-speech-api@w3.org
Subject: Re: Additional parameters to SpeechRecognition (was "Speech API: first editor's draft posted")

On Tue, Apr 24, 2012 at 17:22, Young, Milan <Milan.Young@nuance.com> wrote:
> There are two reasons for including confidence that I would like this community to consider:
>  Efficiency - Similar to the argument Satish put forward for limiting the size of the nbest array, pruning the result candidates at the server is more efficient.
>  Clipping - There are many environments where background noise and side speech can trigger junk results.  Without a threshold, a low-confidence hypothesis still triggers a result, and the application then enters a deaf period where it processes the result and discovers the content is junk.  If real speech happens during this phase, its start will be missed.
>
> Every recognizer that was ever invented has a concept of confidence.  Yes, the semantics of that value vary across platforms, but for us to push this to a custom parameter will confuse developers, and ultimately slow adoption.

Ok, I don't feel strongly about this, so I would be fine adding a confidenceThreshold if others agree.
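As an illustration of what such a threshold would do, here is a small sketch of pruning an n-best list by confidence. Alternative and pruneByConfidence are hypothetical names, and the 0 to 1 confidence scale is an assumption; in the proposal this filtering would happen in the recognizer rather than in page script.

```typescript
// Hypothetical shape of one n-best entry.
interface Alternative {
  transcript: string;
  confidence: number; // assumed to be in the range 0..1
}

// Keep only the candidates at or above the threshold.
// An empty result means nothing was worth returning to the page.
function pruneByConfidence(
  nbest: Alternative[],
  confidenceThreshold: number
): Alternative[] {
  return nbest.filter((alt) => alt.confidence >= confidenceThreshold);
}

// Example: side speech yields only low-confidence candidates.
const noisy: Alternative[] = [
  { transcript: "call home", confidence: 0.21 },
  { transcript: "cold foam", confidence: 0.12 },
];
const kept = pruneByConfidence(noisy, 0.5);
console.log(kept.length); // 0
```

When the list comes back empty, the recognizer can keep listening instead of handing the page a junk result, which is the clipping case described above.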

> Regarding the timeout family, an open-ended dialog like "Tell me what is wrong with your computer" should have generous timeouts.  Compare this to "So it's something to do with your new Google double mouse configuration, is that correct?" which should have short timeouts.
>
> Our goal should be a consistent application experience across UAs, and that's only going to happen if we standardize timeouts.  I would also like to mention that the definition of these timeouts is clear and has been industry standard for 10+ years.

What do you think about my idea of just letting the web page handle the timeout itself, calling abort() when it decides a request is taking too long?


Thanks,
Hans
Received on Wednesday, 25 April 2012 16:36:38 UTC
