Re: Additional parameters to SpeechRecognition (was "Speech API: first editor's draft posted")

>
> But “confidence” is a much easier to understand concept, and I don’t see
> any harm to the average web developer by including it in the list.
>

FWIW, that shouldn't be the bar for including items in the API. Since web
APIs are, in practice, supported perpetually, we should start with the most
basic set and iterate based on concrete application requirements.

> One example is “hotword” recognition, which might be used to wake up the
> application after long periods of silence, side speech, noise, etc.  The
> hotword grammar is often very simple (eg “wake up”), and thus multiple
> interpretations are extremely uncommon.  Developers would use “confidence”
> to avoid false positives which consume processing resources and induce the
> deaf periods I mentioned before.


I can see the same use case addressed by setting maxNBest=1 so that only
the topmost interpretation is returned and the engine optimises resources
for that.
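
As a rough illustration of the maxNBest=1 idea, here is a minimal sketch in TypeScript. Everything in it is an assumption made for illustration: the Interpretation and RecognizerOptions shapes and the recognize() and isHotword() helpers are hypothetical stand-ins, not anything defined in the draft. Only the maxNBest name comes from this thread.

```typescript
// Hypothetical result shape, not taken from the draft spec.
interface Interpretation {
  transcript: string;
  confidence: number; // assumed to lie between 0.0 and 1.0
}

// Hypothetical options object; only the maxNBest name comes from this thread.
interface RecognizerOptions {
  maxNBest: number;
}

// Stand-in for a speech engine: returns at most maxNBest interpretations,
// ordered from most to least confident.
function recognize(
  candidates: Interpretation[],
  options: RecognizerOptions
): Interpretation[] {
  const sorted = [...candidates].sort((a, b) => b.confidence - a.confidence);
  return sorted.slice(0, options.maxNBest);
}

// Application-side hotword check on the single interpretation returned.
function isHotword(results: Interpretation[], hotword: string): boolean {
  const top = results[0];
  return top !== undefined && top.transcript === hotword;
}

// With maxNBest = 1 the application only ever sees the topmost interpretation.
const heard = recognize(
  [
    { transcript: "make up", confidence: 0.4 },
    { transcript: "wake up", confidence: 0.9 },
  ],
  { maxNBest: 1 }
);
const shouldWake = isHotword(heard, "wake up");
```

The sketch only shows the shape of the application code. Whether an engine can actually save resources when asked for a single interpretation is left to the implementation.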

I am also wondering whether optimising for server-side performance should
even be a consideration when designing the web speech API. Our explicit goal
is a simple API for web developers. Optimisation is something that
implementors of both UAs and speech engines would do based on a lot of
parameters, so the API should not need to care about it.

> I’m not sure what it means in practice to not define a confidenceThreshold
> (option 4). Doesn’t it just mean that recognizer behavior is
> implementation-specific, and isn’t that equivalent to option (2)? Isn’t
> (4) subject to the same problems when changing recognizers as (2)?


Yes, I think (2) and (4) are the same, because the actual custom parameters
aren't going to be defined in the spec.

Received on Friday, 4 May 2012 15:40:36 UTC