- From: Young, Milan <Milan.Young@nuance.com>
- Date: Tue, 4 Sep 2012 23:25:24 +0000
- To: Glen Shires <gshires@google.com>
- CC: Satish S <satish@google.com>, "public-speech-api@w3.org" <public-speech-api@w3.org>
- Message-ID: <B236B24082A4094A85003E8FFB8DDC3C1A49E001@SOM-EXCH04.nuance.com>
I'll support this. My only nit is the wording you used below:

"Note, independent of the setting of this attribute, final results (results with SpeechRecognitionResult.final == true) MUST be returned."

The problem is that you are requiring the recognizer to return a plural of final results, but it may not be able to return even one in the case of a noinput or nomatch. I suggest we reword this to be:

"Note, this parameter setting does not affect final results (ie results with SpeechRecognitionResult.final == true)."

Thanks

From: Glen Shires [mailto:gshires@google.com]
Sent: Tuesday, September 04, 2012 3:51 PM
To: Young, Milan
Cc: Satish S; public-speech-api@w3.org
Subject: Re: stabilityThreshold attribute

Yes, it's challenging to precisely define stability in a recognizer-independent manner. Therefore, I agree, for our initial draft, that we should instead define a boolean flag that turns on/off interim results. So instead of my initial proposal, I propose the following:

Add to IDL for SpeechRecognition (the top level)
    attribute boolean interimResults;

Add to 5.1.1 Speech Recognition Attributes definition:

interimResults
Controls whether interim results (that is, results with SpeechRecognitionResult.final == false), are returned. When set to true, interim results SHOULD be returned. When set to false, interim results MUST NOT be returned. The default value is false. Note, independent of the setting of this attribute, final results (results with SpeechRecognitionResult.final == true) MUST be returned.

Notes:
- I believe the above use of SHOULD and MUST reflects the rest of the spec. That is, I don't believe the spec requires that interim results be returned.
- If you have a better name for this flag, please propose it. The word "interim" is not used in the IDL (only in the definitions), but naming this flag something like "onlyFinalResults" seems more confusing.
/Glen Shires

On Tue, Sep 4, 2012 at 10:06 AM, Young, Milan <Milan.Young@nuance.com> wrote:

I'm happy to support a flag that turns on/off interim results. But a greyscale parameter needs more discussion. In particular, I'd like to better understand how this interacts with confidence (which itself still needs to be defined).

From: Satish S [mailto:satish@google.com]
Sent: Monday, September 03, 2012 7:34 AM
To: Glen Shires
Cc: public-speech-api@w3.org
Subject: Re: stabilityThreshold attribute

Makes sense. I missed the part where the default value was 1.0 i.e. only final results are sent to the web app by default. That makes sense and I agree it is difficult to add this in future in a performant fashion.

Cheers
Satish

On Mon, Sep 3, 2012 at 5:29 AM, Glen Shires <gshires@google.com> wrote:

I suspect that many JavaScript authors will only use final results, and not make use of interim results. This keeps their code simple, also they wouldn't need to set stabilityThreshold because the default is to NOT include interim results. Authors that choose to write the extra code to handle interim results would also set stabilityThreshold to a value (other than the default).

Adding stabilityThreshold after v1 would be messy/inefficient because that would mean that interim results are always sent in v1. Thus, to avoid changing the default behavior, if stabilityThreshold was added after v1, then the default would have to be 0.0 instead of 1.0. Thus, the default would be to always use the most bandwidth and compute, rather than the least.

I suspect that JavaScript authors that only use final results and keep their code simple, may not bother to set stabilityThreshold (especially because they need to version check to see if the feature is available). Thus, this simple case will typically waste bandwidth and compute, even if we add this feature after v1.
Therefore, I believe we should put this in the spec for v1. This proposal is flexible enough that a UA/speech-recognizer could implement fine-grain stability. Conversely, a UA may use a trivial implementation and be conformant, where all interim results are assigned the same stability value (perhaps 0.5 for example), and stabilityThreshold controls only whether no interim results, or all interim results are returned.

/Glen Shires

On Sun, Sep 2, 2012 at 2:27 PM, Satish S <satish@google.com> wrote:

I mentioned something similar for confidenceThreshold as well - this is really an optimisation for the speech service that most web developers would choose to ignore. Can this be added if there is a real need reported by developers after v1?

Cheers
Satish

On Sat, Sep 1, 2012 at 3:51 PM, Glen Shires <gshires@google.com> wrote:

If there's no disagreement, I'll add this to the spec on Tuesday...

On Thu, Aug 30, 2012 at 10:55 AM, Glen Shires <gshires@google.com> wrote:

For JavaScript authors that do not make use of interim results, or only want to show fewer and more stable interim results, the following stabilityThreshold would allow the author to request this, which reduces the bandwidth usage and the events fired (and thus also reduces compute and power).

In addition, the stability attribute returned with interim results would allow authors to process results according to their estimated stability. For example, an author might choose to display final results in black, fairly stable results in dark grey, and not very stable results in light grey.

I propose the following:

Add to IDL for SpeechRecognitionResult
    readonly attribute float stability;

Add to 5.1.6 Speech Recognition Result definitions

stability
The stability represents a numeric estimate between 0.0 and 1.0 of how likely the recognition system is to change this interim result. A higher number indicates the result is less likely to change. This attribute is not defined when the "final" attribute is true.

Add to IDL for SpeechRecognition (the top level)
    attribute float stabilityThreshold;

Add to 5.1.1 Speech Recognition Attributes definitions:

stabilityThreshold
This attribute controls how many interim results are returned. When set to the value of 1.0, no interim results (only final results) will be returned. When set to 0.0, all interim results will be returned. Valid values are in the range of 0.0 to 1.0 inclusive. The default value is 1.0.

/Glen Shires
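
Editorial sketch (not part of the original thread): the two designs discussed above can be compared with a small filtering model. Everything here is illustrative. The function names, the plain-object result shape, and the sample data are hypothetical, and the thread only pins down the behaviour of stabilityThreshold at 0.0 and 1.0, so the "stability >= threshold" rule for values in between is an assumption.

```javascript
// Hypothetical model of which results a recognizer would deliver.
// Results are plain objects: { transcript, final, stability }.
// "stability" is only meaningful when final === false.

// Design 1: stabilityThreshold (float, 0.0 to 1.0 inclusive, default 1.0).
function filterByStabilityThreshold(results, stabilityThreshold = 1.0) {
  // This comparison form also rejects NaN.
  if (!(stabilityThreshold >= 0.0 && stabilityThreshold <= 1.0)) {
    throw new RangeError("stabilityThreshold must be in [0.0, 1.0]");
  }
  return results.filter((r) => {
    if (r.final) return true; // final results are never filtered out
    if (stabilityThreshold === 1.0) return false; // 1.0 means final results only
    // Assumption: deliver interim results at or above the threshold.
    return r.stability >= stabilityThreshold;
  });
}

// Design 2: interimResults (boolean, default false).
// When true, interim results SHOULD be returned, so a real recognizer may
// still return none; this model simply returns all of them.
function filterByInterimResults(results, interimResults = false) {
  return results.filter((r) => r.final || interimResults);
}

// Hypothetical sample data.
const sample = [
  { transcript: "rec", final: false, stability: 0.2 },
  { transcript: "recognize", final: false, stability: 0.7 },
  { transcript: "recognize speech", final: true },
];

console.log(filterByStabilityThreshold(sample, 0.5).map((r) => r.transcript));
console.log(filterByInterimResults(sample, true).map((r) => r.transcript));
```

Under this model, the "trivial implementation" mentioned in the thread (every interim result assigned stability 0.5) would make any threshold at or below 0.5 return all interim results and any threshold above 0.5 return none, which is the on/off behaviour the boolean interimResults flag expresses directly.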
Received on Tuesday, 4 September 2012 23:25:54 UTC