- From: Satish S <satish@google.com>
- Date: Fri, 4 May 2012 16:40:06 +0100
- To: "Young, Milan" <Milan.Young@nuance.com>
- Cc: Jerry Carter <jerry@jerrycarter.org>, "public-speech-api@w3.org" <public-speech-api@w3.org>
- Message-ID: <CAHZf7RkbZBojEpBGYpvKs6muzdOKC1FEontsKWS9Q+YA9=wZmQ@mail.gmail.com>
> But “confidence” is a much easier to understand concept, and I don’t see
> any harm to the average web developer by including it in the list.

FWIW, that shouldn't be the bar to include items in the API. Since web APIs are supported perpetually in practice we should start with the most basic set and iterate based on concrete application requirements.

> One example is “hotword” recognition, which might be used to wake up the
> application after long periods of silence, side speech, noise, etc. The
> hotword grammar is often very simple (eg “wake up”), and thus multiple
> interpretations are extremely uncommon. Developers would use “confidence”
> to avoid false positives which consume processing resources and induce the
> deaf periods I mentioned before.

I can see the same use case addressed by setting maxNBest=1 so that only the topmost interpretation is returned and the engine optimises resources for that.

I am also wondering if optimising for server side performance should even be a consideration when designing the web speech API. Developing a simple web developer facing API is our explicit goal and optimisation is something that implementors of both UAs and speech engines would do based on a lot of parameters, hence the API should not really care about it.

> I’m not sure what it means in practice to not define a confidenceThreshold
> (option 4). Doesn’t it just mean that recognizer behavior is
> implementation-specific, and isn’t that equivalent to option (2)? Isn’t
> (4) subject to the same problems when changing recognizers as (2)?

Yes I think (2) and (4) are the same because the actual custom parameters aren't going to be defined in the spec.
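
For illustration only, the hotword case discussed in this message could be sketched as below. This is a rough sketch under stated assumptions, not anything from a spec draft: the `Alternative` shape and the helpers `limitNBest` and `acceptHotword` are hypothetical names, and `maxNBest` and `confidence` merely mirror the terms used in the thread. It assumes confidence values lie in [0, 1].

```typescript
// Hypothetical shape of one recognition alternative (an assumption, not spec text).
interface Alternative {
  transcript: string;
  confidence: number; // assumed to lie in [0, 1]
}

// Mimics what an engine might return for a given maxNBest:
// the maxNBest most confident alternatives, best first.
function limitNBest(alternatives: Alternative[], maxNBest: number): Alternative[] {
  return [...alternatives]
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, maxNBest);
}

// Application-side hotword check: look only at the top alternative
// (as with maxNBest = 1) and require it to clear an app-chosen threshold.
function acceptHotword(
  alternatives: Alternative[],
  hotword: string,
  minConfidence: number
): boolean {
  const [top] = limitNBest(alternatives, 1);
  return (
    top !== undefined &&
    top.transcript === hotword &&
    top.confidence >= minConfidence
  );
}
```

In this sketch the threshold is applied by the application after results arrive, so it does not by itself save any engine-side processing; whether the engine can use such a value to reject input early is exactly the point under debate.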
Received on Friday, 4 May 2012 15:40:36 UTC