Re: Additional parameters to SpeechRecognition (was "Speech API: first editor's draft posted") from Glen Shires on 2012-04-25 (public-speech-api@w3.org from April 2012)

From: Glen Shires <gshires@google.com>
Date: Wed, 25 Apr 2012 08:10:33 -0700
To: Hans Wennborg <hwennborg@google.com>
Cc: "Young, Milan" <Milan.Young@nuance.com>, Satish S <satish@google.com>, "public-speech-api@w3.org" <public-speech-api@w3.org>
Message-ID: <CAEE5bcjFDeigK98rg9j6j_pw1fobycsnQgQwc7DKgrLJs-L3nA@mail.gmail.com>

confidenceThreshold

I think we all agree that speech recognizers have a concept of confidence,
and that it can be mapped monotonically onto a range of 0.0 to 1.0.
However, specific values (for example 0.5) do not correspond to the
same level of confidence for different recognizers.

I believe that if the developer does not set the confidenceThreshold, the
speech recognizer should use a default value that is appropriate for that
recognizer.

A complication with a confidenceThreshold attribute is defining the default
value: if the attribute is read before it has ever been written, what value
does the BROWSER return? This is particularly awkward because the optimal
default value may vary from one RECOGNIZER to another.

Perhaps instead of an attribute, this should be a write-only value,
specifically a setConfidenceThreshold method.
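
To make the write-only idea concrete, here is a rough TypeScript sketch.
It is only an illustration under my own assumptions: the names
SpeechRecognitionSketch, RecognizerBackend, defaultConfidenceThreshold and
accepts are hypothetical and do not appear in the draft. Only
setConfidenceThreshold is the method proposed above.

```typescript
// Hypothetical sketch; apart from setConfidenceThreshold, none of these
// names come from the draft specification.
interface RecognizerBackend {
  // Assumption: each recognizer knows a default threshold suited to itself.
  readonly defaultConfidenceThreshold: number;
}

class SpeechRecognitionSketch {
  private backend: RecognizerBackend;
  // undefined means the page never set a threshold.
  private pageThreshold: number | undefined = undefined;

  constructor(backend: RecognizerBackend) {
    this.backend = backend;
  }

  // Write-only: there is deliberately no getter, so the browser never has
  // to report a default value that differs between recognizers.
  setConfidenceThreshold(value: number): void {
    if (!(value >= 0.0 && value <= 1.0)) {
      throw new RangeError("confidence threshold must be within 0.0 to 1.0");
    }
    this.pageThreshold = value;
  }

  // Stand-in for the recognizer's internal filtering: use the page's value
  // if one was set, otherwise fall back to the recognizer's own default.
  accepts(confidence: number): boolean {
    const threshold =
      this.pageThreshold ?? this.backend.defaultConfidenceThreshold;
    return confidence >= threshold;
  }
}
```

A page that never calls setConfidenceThreshold gets whatever default suits
the recognizer in use, and a page that does call it overrides that default
without ever needing to read it back.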

/Glen Shires

On Wed, Apr 25, 2012 at 6:43 AM, Hans Wennborg <hwennborg@google.com> wrote:

> On Tue, Apr 24, 2012 at 17:22, Young, Milan <Milan.Young@nuance.com>
> wrote:
> > There are two reasons for including confidence that I would like this
> community to consider:
> >  Efficiency - Similar to the argument Satish put forward for limiting
> the size of the nbest array, pruning the result candidates at the server is
> more efficient.
> >  Clipping - There are many environments where background noise and side
> speech can trigger junk results.  If confidence is low, this will
> trigger a result and then the application enters a deaf period where it
> processes the result and discovers the content is junk.  If real speech
> happens during this phase, its start will be missed.
> >
> > Every recognizer that was ever invented has a concept of confidence.
>  Yes, the semantics of that value vary across platforms, but for us to push
> this to a custom parameter will confuse developers, and ultimately slow
> adoption.
>
> Ok, I don't feel strongly about this, so I would be fine adding a
> confidenceThreshold if others agree.
>
> > Regarding the timeout family, an open-ended dialog like "Tell me what is
> wrong with your computer", should have generous timeouts.  Compare this to
> "So it's something to do with your new Google double mouse configuration,
> is that correct?" which should have short timeouts.
> >
> > Our goal should be a consistent application experience across UAs, and
> that's only going to happen if we standardize timeouts.  I would also like
> to mention that the definition of these timeouts is clear and has been
> industry standard for 10+ years.
>
> What do you think about my idea of just letting the web page handle
> the timeout itself, calling abort() when it decides a request is
> taking too long?
>
>
> Thanks,
> Hans
>
>

-- 
Thanks!
Glen Shires

Received on Wednesday, 25 April 2012 15:11:47 UTC