- From: Deborah Dahl <dahl@conversational-technologies.com>
- Date: Tue, 24 Apr 2012 17:57:36 -0400
- To: "'Young, Milan'" <Milan.Young@nuance.com>, "'Hans Wennborg'" <hwennborg@google.com>, "'Satish S'" <satish@google.com>
- Cc: <public-speech-api@w3.org>
While the semantics of recognizer confidence does vary in different recognizers, a numeric value is more useful than a rank order. For example, if the top two alternatives in the nbest have very similar confidences, a dialog manager might decide to reprompt the user, or display both alternatives and let the user pick one, but not if the top candidate has a much higher confidence than the second candidate.

> -----Original Message-----
> From: Young, Milan [mailto:Milan.Young@nuance.com]
> Sent: Tuesday, April 24, 2012 12:23 PM
> To: Hans Wennborg; Satish S
> Cc: public-speech-api@w3.org
> Subject: RE: Additional parameters to SpeechRecognition (was "Speech API: first editor's draft posted")
>
> There are two reasons for including confidence that I would like this
> community to consider:
> Efficiency - Similar to the argument Satish put forward for limiting the size of
> the nbest array, pruning the result candidates at the server is more efficient.
> Clipping - There are many environments where background noise and side
> speech can trigger junk results. If confidence is low, this will trigger a
> result and then the application enters a deaf period where it processes the
> result and discovers the content is junk. If real speech happens during this
> phase, its start will be missed.
>
> Every recognizer that was ever invented has a concept of confidence. Yes,
> the semantics of that value vary across platforms, but for us to push this to a
> custom parameter will confuse developers, and ultimately slow adoption.
>
> Regarding the timeout family, an open-ended dialog like "Tell me what is
> wrong with your computer", should have generous timeouts. Compare this
> to "So it's something to do with your new Google double mouse
> configuration, is that correct?" which should have short timeouts.
>
> Our goal should be a consistent application experience across UAs, and that's
> only going to happen if we standardize timeouts.
> I would also like to mention that the definition of these timeouts is clear
> and has been industry standard for 10+ years.
>
> Thanks
>
> -----Original Message-----
> From: Hans Wennborg [mailto:hwennborg@google.com]
> Sent: Tuesday, April 24, 2012 8:25 AM
> To: Satish S
> Cc: public-speech-api@w3.org
> Subject: Re: Additional parameters to SpeechRecognition (was "Speech API: first editor's draft posted")
>
> On Tue, Apr 24, 2012 at 14:52, Satish S <satish@google.com> wrote:
> > (Splitting off to a new thread so we can follow discussions easily.
> > Please start new threads for proposed additions/changes)
> >
> >> SpeechRecognition
> >> - In addition to the three parameters you have listed, I see the following as necessary:
> >> integer maxNBest;
> >
> > I can see speech engines defaulting to a specific number of results
> > and the web app can tweak it based on performance characteristics it
> > needs. Without this attribute the engine should be asked to always
> > give the max number of results and let the web app filter, which seems
> > suboptimal.
>
> I agree, I think this would be a good addition.
>
> >> float confidenceThreshold;
> >
> > SpeechRecognitionAlternative.confidence provides the value so the web
> > app can filter based on that if it needs to. With that in mind do we
> > need this attribute?
>
> Agreed. Also, the absolute confidence values are probably not very
> interesting. For example, what does a confidence of 0.5 mean? I see the
> confidence values as useful for providing an ordering of the alternatives, not
> much else.
>
> >> integer completeTimeout;
> >> integer incompleteTimeout;
> >> integer maxSpeechTimeout;
> >
> > Do you have use cases where these should vary between different web
> > apps? I think it would be better to leave it to the UA so all web apps
> > have consistent timeouts and user expectation doesn't get affected.
>
> I don't like the idea of having three different timeouts.
> Couldn't the web page handle timeouts itself, by calling abort() on the
> SpeechRecognition object if it takes too long?
>
> Thanks,
> Hans
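
To make the dialog-manager point at the top of this message concrete, here is a rough TypeScript sketch of a decision that needs the numeric confidence values and could not be made from rank order alone. Everything in it is an assumption for illustration: the Alternative shape only mirrors the transcript/confidence pair discussed in this thread, chooseAction and DialogAction are hypothetical names, and the 0.15 margin is an arbitrary application-chosen value, not something taken from the draft or from any recognizer.

```typescript
// Hypothetical shape mirroring the transcript/confidence pair discussed in the thread.
interface Alternative {
  transcript: string;
  confidence: number; // recognizer-specific scale; assumed "higher is more confident"
}

// Hypothetical set of actions a dialog manager might take.
type DialogAction =
  | { kind: "accept"; choice: Alternative }
  | { kind: "disambiguate"; choices: [Alternative, Alternative] }
  | { kind: "reprompt" };

// minMargin is an illustrative threshold chosen by the application.
function chooseAction(nbest: Alternative[], minMargin = 0.15): DialogAction {
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...nbest].sort((a, b) => b.confidence - a.confidence);
  const top = sorted[0];
  const second = sorted[1];

  // Nothing was recognized: ask the user again.
  if (top === undefined) {
    return { kind: "reprompt" };
  }
  // A single candidate, or a clear winner: accept the top candidate.
  if (second === undefined || top.confidence - second.confidence >= minMargin) {
    return { kind: "accept", choice: top };
  }
  // The top two are too close to call: show both and let the user pick.
  return { kind: "disambiguate", choices: [top, second] };
}
```

With rank order only, the two cases "0.90 vs 0.40" and "0.52 vs 0.50" look identical to the application. With numeric values, the first can be accepted and the second can be turned into a choice for the user. An application could just as well reprompt in the close case; which one is better depends on the dialog.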
Received on Tuesday, 24 April 2012 22:01:10 UTC