RE: Confidence property

From: Young, Milan <Milan.Young@nuance.com>
Date: Wed, 23 May 2012 22:56:15 +0000
To: Satish S <satish@google.com>
CC: "public-speech-api@w3.org" <public-speech-api@w3.org>
Message-ID: <B236B24082A4094A85003E8FFB8DDC3C1A45C988@SOM-EXCH04.nuance.com>
>> The benefit of minimizing deaf periods is therefore again recognizer specific

Most (all?) of the recognition engines that can be embedded within an HTML browser currently operate over a network.  In fact, if you study the use cases, you'd find that the majority of those transactions are over 3G networks, which have notoriously high latency.

It's possible that this may begin to change over the next few years, but it's surely not going to be in the lifetime of our 1.0 spec (at least I hope we can come to agreement before then :)).  Thus the problem can hardly be called engine specific.

Yes, the semantics are unclear, but that would be no different from the quasi-standard that would undoubtedly emerge in the absence of a specification.



From: Satish S [mailto:satish@google.com]
Sent: Wednesday, May 23, 2012 6:28 AM
To: Young, Milan
Cc: public-speech-api@w3.org
Subject: Re: Confidence property

Hi Milan,

> Summarizing previous discussion, we have:
>   Pros:  1) Aids efficient application design, 2) minimizes deaf periods, 3) avoids a proliferation of semi-standard custom parameters.
>   Cons: 1) Semantics of the value are not precisely defined, and 2) Novice users may not understand how confidence differs from maxnbest.
>
> My responses to the cons are: 1) Precedent from the speech industry, and 2) Thousands of VoiceXML developers do understand the difference and will balk at an API that does not accommodate their needs.

This was well debated in the earlier thread, and it is clear that confidence threshold semantics are tied to the recognizer (not portable). The benefit of minimizing deaf periods is therefore again recognizer specific and not portable. This is a well-suited use case for custom parameters, and I'd suggest we start with that.

> Thousands of VoiceXML developers do understand the difference and will balk at an API that does not accommodate their needs.

I hope we aren't trying to replicate VoiceXML in the browser. If it is indeed a must-have feature for web developers, we'll be receiving requests for it from them very soon, so it would be easy to discuss and add it in the future.
Received on Wednesday, 23 May 2012 22:56:47 GMT