Re: stabilityThreshold attribute from Glen Shires on 2012-09-05 (public-speech-api@w3.org from September 2012)

From: Glen Shires <gshires@google.com>
Date: Tue, 4 Sep 2012 17:03:37 -0700
To: "Young, Milan" <Milan.Young@nuance.com>
Cc: Satish S <satish@google.com>, "public-speech-api@w3.org" <public-speech-api@w3.org>
Message-ID: <CAEE5bcgQcvomVD92oWqH6qhKxV6stbLQVBH9DrH1YL-SLPtkuQ@mail.gmail.com>

Agreed, here's the new proposed wording:

Add to IDL for SpeechRecognition (the top level)
    attribute boolean interimResults;

Add to 5.1.1 Speech Recognition Attributes definition:

interimResults
  Controls whether interim results (that is, results with
SpeechRecognitionResult.final == false) are returned.  When set to true,
interim results SHOULD be returned. When set to false, interim results MUST
NOT be returned. The default value is false. Note, this parameter setting
does not affect final results (that is, results with
SpeechRecognitionResult.final == true).
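
A minimal sketch of how an author might use the proposed flag, assuming
result-like objects of the shape { final, transcript }. The helper names
splitResults and configure are hypothetical, the "final" property name
follows the wording in this thread (implementations may expose it under a
different name), and no real recognizer is created:

```javascript
// Hypothetical helper: splits result-like objects into final and interim
// transcripts. Each object is assumed to look like
// { final: boolean, transcript: string }.
function splitResults(results) {
  const finals = [];
  const interims = [];
  for (const result of results) {
    (result.final ? finals : interims).push(result.transcript);
  }
  return { finals, interims };
}

// Hypothetical setup: `recognition` is any object exposing the proposed
// interimResults attribute (proposed default: false).
function configure(recognition, wantInterim) {
  recognition.interimResults = Boolean(wantInterim);
  return recognition;
}

// Stand-in object instead of a real SpeechRecognition instance.
const recognition = configure({ interimResults: false }, true);

// With interimResults set to true, a handler may see both kinds of result.
const { finals, interims } = splitResults([
  { final: false, transcript: "hello wor" },
  { final: true, transcript: "hello world" },
]);
```

An author who leaves interimResults at its default would only ever need
the "finals" branch.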


On Tue, Sep 4, 2012 at 4:25 PM, Young, Milan <Milan.Young@nuance.com> wrote:

> I’ll support this.  My only nit is the wording you used below: “Note,
> independent of the setting of this attribute, final results (results with
> SpeechRecognitionResult.final == true) MUST be returned.”
>
> The problem is that you are requiring the recognizer to return final
> results, plural, but it may not be able to return even one in the case of
> a noinput or nomatch.  I suggest we reword this to be: “Note, this
> parameter setting does not affect final results (i.e. results with
> SpeechRecognitionResult.final == true).”
>
> Thanks
>
> *From:* Glen Shires [mailto:gshires@google.com]
> *Sent:* Tuesday, September 04, 2012 3:51 PM
> *To:* Young, Milan
> *Cc:* Satish S; public-speech-api@w3.org
> *Subject:* Re: stabilityThreshold attribute
>
> Yes, it's challenging to precisely define stability in a
> recognizer-independent manner. Therefore, I agree that, for our initial
> draft, we should instead define a boolean flag that turns interim results
> on or off.
>
> So instead of my initial proposal, I propose the following:
>
> Add to IDL for SpeechRecognition (the top level)
>     attribute boolean interimResults;
>
> Add to 5.1.1 Speech Recognition Attributes definition:
>
> interimResults
>   Controls whether interim results (that is, results with
> SpeechRecognitionResult.final == false) are returned.  When set to true,
> interim results SHOULD be returned. When set to false, interim results MUST
> NOT be returned. The default value is false. Note, independent of the
> setting of this attribute, final results (results with
> SpeechRecognitionResult.final == true) MUST be returned.
>
> Notes:
>
> - I believe the above use of SHOULD and MUST reflects the rest of the
> spec. That is, I don't believe the spec requires that interim results be
> returned.
>
> - If you have a better name for this flag, please propose it. The word
> "interim" is not used in the IDL (only in the definitions), but naming this
> flag something like "onlyFinalResults" seems more confusing.
>
> /Glen Shires
>
> On Tue, Sep 4, 2012 at 10:06 AM, Young, Milan <Milan.Young@nuance.com>
> wrote:
>
> I’m happy to support a flag that turns on/off interim results.  But a
> greyscale parameter needs more discussion.  In particular, I’d like to
> better understand how this interacts with confidence (which itself still
> needs to be defined).
>
> *From:* Satish S [mailto:satish@google.com]
> *Sent:* Monday, September 03, 2012 7:34 AM
> *To:* Glen Shires
> *Cc:* public-speech-api@w3.org
> *Subject:* Re: stabilityThreshold attribute
>
> Makes sense. I missed the part where the default value was 1.0, i.e. only
> final results are sent to the web app by default. I agree it is difficult
> to add this in future in a performant fashion.
>
> Cheers
> Satish
>
> On Mon, Sep 3, 2012 at 5:29 AM, Glen Shires <gshires@google.com> wrote:
>
> I suspect that many JavaScript authors will only use final results, and
> not make use of interim results. This keeps their code simple, and they
> wouldn't need to set stabilityThreshold because the default is to NOT
> include interim results.
>
> Authors that choose to write the extra code to handle interim results
> would also set stabilityThreshold to a value other than the default.
>
> Adding stabilityThreshold after v1 would be messy and inefficient because
> interim results would then always be sent in v1.  To avoid changing that
> default behavior, a stabilityThreshold added after v1 would have to
> default to 0.0 instead of 1.0, so the default would be to always use the
> most bandwidth and compute, rather than the least.  I suspect that
> JavaScript authors who only use final results and keep their code simple
> may not bother to set stabilityThreshold (especially because they would
> need to version-check to see if the feature is available). Thus, this
> simple case would typically waste bandwidth and compute, even if we add
> this feature after v1.  Therefore, I believe we should put this in the
> spec for v1.
>
> This proposal is flexible enough that a UA/speech-recognizer could
> implement fine-grained stability. Conversely, a UA may use a trivial
> implementation and still be conformant, where all interim results are
> assigned the same stability value (0.5, for example), and
> stabilityThreshold controls only whether no interim results or all
> interim results are returned.
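
A rough sketch of the trivial implementation described above, under stated
assumptions: the function name shouldDeliver and the constant
TRIVIAL_STABILITY are hypothetical, and the proposal does not say whether
the comparison against the threshold is strict, so the ">=" here is only
one possible reading:

```javascript
// Single stability value assigned to every interim result (0.5 is the
// example value from the text; the name is hypothetical).
const TRIVIAL_STABILITY = 0.5;

// Decides whether a result-like object { final: boolean } would be
// delivered for a given stabilityThreshold in the range 0.0 to 1.0.
function shouldDeliver(result, stabilityThreshold) {
  if (result.final) return true; // final results are never filtered
  if (stabilityThreshold >= 1.0) return false; // 1.0: no interim results
  // Assumption: an interim result passes when its stability is at least
  // the requested threshold.
  return TRIVIAL_STABILITY >= stabilityThreshold;
}
```

Under this reading, any threshold up to 0.5 returns all interim results
and any threshold above 0.5 returns none of them.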
>
> /Glen Shires
>
> On Sun, Sep 2, 2012 at 2:27 PM, Satish S <satish@google.com> wrote:
>
> I mentioned something similar for confidenceThreshold as well - this is
> really an optimisation for the speech service that most web developers
> would choose to ignore. Can this be added if there is a real need reported
> by developers after v1?
>
> Cheers
> Satish
>
> On Sat, Sep 1, 2012 at 3:51 PM, Glen Shires <gshires@google.com> wrote:
>
> If there's no disagreement, I'll add this to the spec on Tuesday...
>
> On Thu, Aug 30, 2012 at 10:55 AM, Glen Shires <gshires@google.com> wrote:
>
> For JavaScript authors who do not make use of interim results, or who only
> want to show fewer and more stable interim results, the following
> stabilityThreshold attribute would allow the author to request this, which
> reduces bandwidth usage and the number of events fired (and thus also
> reduces compute and power).  In addition, the stability attribute returned
> with interim results would allow authors to process results according to
> their estimated stability. For example, an author might choose to display
> final results in black, fairly stable results in dark grey, and not very
> stable results in light grey.
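
The display example above could be sketched roughly as follows. The
function name colorFor and the 0.5 cutoff between "fairly stable" and
"not very stable" are assumptions made for illustration; the proposal
does not define any such cutoff:

```javascript
// Maps a result-like object { final: boolean, stability: number } to a
// CSS color name. The 0.5 cutoff is a hypothetical choice.
function colorFor(result) {
  if (result.final) return "black"; // final results
  return result.stability >= 0.5
    ? "darkgrey"   // fairly stable interim results
    : "lightgrey"; // not very stable interim results
}
```

The stability value is only read for interim results, since the proposal
leaves it undefined when "final" is true.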
>
> I propose the following:
>
> Add to IDL for SpeechRecognitionResult
>     readonly attribute float stability;
>
> Add to 5.1.6 Speech Recognition Result definitions
>
> stability
>   The stability represents a numeric estimate between 0.0 and 1.0 of how
> unlikely the recognition system is to change this interim result. A higher
> number indicates the result is less likely to change.  This attribute is
> not defined when the "final" attribute is true.
>
> Add to IDL for SpeechRecognition (the top level)
>     attribute float stabilityThreshold;
>
> Add to 5.1.1 Speech Recognition Attributes definitions:
>
> stabilityThreshold
>   This attribute controls how many interim results are returned. When set
> to the value of 1.0, no interim results (only final results) will be
> returned.  When set to 0.0, all interim results will be returned. Valid
> values are in the range of 0.0 to 1.0 inclusive. The default value is 1.0.
>
> /Glen Shires

Received on Wednesday, 5 September 2012 00:04:48 UTC