Re: stabilityThreshold attribute from Glen Shires on 2012-09-07 (public-speech-api@w3.org from September 2012)

From: Glen Shires <gshires@google.com>
Date: Fri, 7 Sep 2012 01:03:49 -0700
To: "Young, Milan" <Milan.Young@nuance.com>
Cc: Satish S <satish@google.com>, "public-speech-api@w3.org" <public-speech-api@w3.org>
Message-ID: <CAEE5bcgFzv842h4N3Cx_TNPdA8nX_+cNmW=Ues8u9RLMtDNDdw@mail.gmail.com>

I've updated the spec with this change:
https://dvcs.w3.org/hg/speech-api/rev/d25fea0d029c

As always, the current draft spec is at:
http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
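
For anyone skimming the thread, here is a minimal sketch of how the new interimResults flag is meant to behave. It is an illustration, not spec text. Results are modelled as plain objects with a "final" property, following the wording quoted below; the deliverableResults helper and the sample transcripts are made up for this example.

```javascript
// Sketch only: models which results reach the page for a given
// interimResults setting. deliverableResults() is a hypothetical helper,
// not part of the API.
function deliverableResults(results, interimResults) {
  return results.filter(function (result) {
    // The flag does not affect final results.
    if (result.final) {
      return true;
    }
    // Interim results MUST NOT be returned when the flag is false.
    // When it is true they SHOULD be returned (a recognizer may still
    // produce none); this model simply returns all of them.
    return interimResults === true;
  });
}

var sample = [
  { transcript: "hello", final: false },
  { transcript: "hello world", final: true }
];

console.log(deliverableResults(sample, false).length); // 1
console.log(deliverableResults(sample, true).length);  // 2
```

In a page, the author would set the attribute on the SpeechRecognition object itself; leaving it at its default of false gives final results only.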

On Tue, Sep 4, 2012 at 5:03 PM, Glen Shires <gshires@google.com> wrote:

> Agreed, here's the new proposed wording:
>
> Add to IDL for SpeechRecognition (the top level)
>     attribute boolean interimResults;
>
> Add to 5.1.1 Speech Recognition Attributes definition:
>
> interimResults
>   Controls whether interim results (that is, results with
> SpeechRecognitionResult.final == false), are returned.  When set to true,
> interim results SHOULD be returned. When set to false, interim results MUST
> NOT be returned. The default value is false. Note, this parameter setting
> does not affect final results (that is, results with
> SpeechRecognitionResult.final == true).
>
>
> On Tue, Sep 4, 2012 at 4:25 PM, Young, Milan <Milan.Young@nuance.com> wrote:
>
>> I’ll support this.  My only nit is the wording you used below: “Note, independent
>> of the setting of this attribute, final results (results with
>> SpeechRecognitionResult.final == true) MUST be returned.”
>>
>> The problem is that you are requiring the recognizer to return a plural
>> of final results, but it may not be able to return even one in the case of
>> a noinput or nomatch.  I suggest we reword this to be: “Note, this
>> parameter setting does not affect final results (i.e. results with
>> SpeechRecognitionResult.final == true).”
>>
>> Thanks
>>
>> *From:* Glen Shires [mailto:gshires@google.com]
>> *Sent:* Tuesday, September 04, 2012 3:51 PM
>> *To:* Young, Milan
>> *Cc:* Satish S; public-speech-api@w3.org
>> *Subject:* Re: stabilityThreshold attribute
>>
>> Yes, it's challenging to precisely define stability in a
>> recognizer-independent manner. Therefore, I agree, for our initial draft,
>> that we should instead define a boolean flag that turns on/off interim
>> results.
>>
>> So instead of my initial proposal, I propose the following:
>>
>> Add to IDL for SpeechRecognition (the top level)
>>
>>     attribute boolean interimResults;
>>
>> Add to 5.1.1 Speech Recognition Attributes definition:
>>
>> interimResults
>>
>>   Controls whether interim results (that is, results with
>> SpeechRecognitionResult.final == false), are returned.  When set to true,
>> interim results SHOULD be returned. When set to false, interim results MUST
>> NOT be returned. The default value is false. Note, independent of the
>> setting of this attribute, final results (results with
>> SpeechRecognitionResult.final == true) MUST be returned.
>>
>> Notes:
>>
>> - I believe the above use of SHOULD and MUST reflects the rest of the
>> spec. That is, I don't believe the spec requires that interim results be
>> returned.
>>
>> - If you have a better name for this flag, please propose it. The word
>> "interim" is not used in the IDL (only in the definitions), but naming this
>> flag something like "onlyFinalResults" seems more confusing.
>>
>> /Glen Shires
>>
>> On Tue, Sep 4, 2012 at 10:06 AM, Young, Milan <Milan.Young@nuance.com>
>> wrote:
>>
>> I’m happy to support a flag that turns on/off interim results.  But a
>> greyscale parameter needs more discussion.  In particular, I’d like to
>> better understand how this interacts with confidence (which itself still
>> needs to be defined).
>>
>> *From:* Satish S [mailto:satish@google.com]
>> *Sent:* Monday, September 03, 2012 7:34 AM
>> *To:* Glen Shires
>> *Cc:* public-speech-api@w3.org
>> *Subject:* Re: stabilityThreshold attribute
>>
>> Makes sense. I missed the part where the default value was 1.0, i.e. only
>> final results are sent to the web app by default. That makes sense and I
>> agree it is difficult to add this in future in a performant fashion.
>>
>>
>> Cheers
>> Satish
>>
>> On Mon, Sep 3, 2012 at 5:29 AM, Glen Shires <gshires@google.com> wrote:
>>
>> I suspect that many JavaScript authors will only use final results, and
>> not make use of interim results. This keeps their code simple; also, they
>> wouldn't need to set stabilityThreshold because the default is to NOT
>> include interim results.
>>
>> Authors that choose to write the extra code to handle interim results
>> would also set stabilityThreshold to a value (other than the default).
>>
>> Adding stabilityThreshold after v1 would be messy/inefficient because
>> that would mean that interim results are always sent in v1.  Thus, to avoid
>> changing the default behavior, if stabilityThreshold was added after v1,
>> then the default would have to be 0.0 instead of 1.0.  Thus, the default
>> would be to always use the most bandwidth and compute, rather than the
>> least.  I suspect that JavaScript authors that only use final results and
>> keep their code simple may not bother to set stabilityThreshold
>> (especially because they need to version check to see if the feature is
>> available). Thus, this simple case will typically waste bandwidth and
>> compute, even if we add this feature after v1.  Therefore, I believe we
>> should put this in the spec for v1.
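
To make the version-check point concrete, here is a rough sketch of what an author who only wants final results would have to write if stabilityThreshold arrived after v1. The requestFinalResultsOnly helper is made up, and the plain objects are stand-ins for a SpeechRecognition instance on an older and a newer UA.

```javascript
// Sketch only: hypothetical helper showing the feature check an author
// would need if stabilityThreshold were added after v1.
function requestFinalResultsOnly(recognition) {
  if ("stabilityThreshold" in recognition) {
    // Newer UA: explicitly opt out of interim results.
    recognition.stabilityThreshold = 1.0;
    return true;
  }
  // v1 UA: the attribute does not exist, so interim results are always sent.
  return false;
}

// Plain objects standing in for SpeechRecognition instances.
var v1Recognizer = {};
var laterRecognizer = { stabilityThreshold: 0.0 }; // post-v1 default would have to be 0.0

console.log(requestFinalResultsOnly(v1Recognizer));    // false
console.log(requestFinalResultsOnly(laterRecognizer)); // true
console.log(laterRecognizer.stabilityThreshold);       // 1
```

Authors who skip this step get the most expensive behaviour by default, which is the argument for having the attribute in v1.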
>>
>> This proposal is flexible enough that a UA/speech-recognizer could
>> implement fine-grained stability. Conversely, a UA may use a trivial
>> implementation and be conformant, where all interim results are assigned
>> the same stability value (perhaps 0.5 for example), and stabilityThreshold
>> controls only whether no interim results, or all interim results are
>> returned.
>>
>> /Glen Shires
>>
>> On Sun, Sep 2, 2012 at 2:27 PM, Satish S <satish@google.com> wrote:
>>
>> I mentioned something similar for confidenceThreshold as well - this is
>> really an optimisation for the speech service that most web developers
>> would choose to ignore. Can this be added if there is a real need reported
>> by developers after v1?
>>
>>
>> Cheers
>> Satish
>>
>> On Sat, Sep 1, 2012 at 3:51 PM, Glen Shires <gshires@google.com> wrote:
>>
>> If there's no disagreement, I'll add this to the spec on Tuesday...
>>
>> On Thu, Aug 30, 2012 at 10:55 AM, Glen Shires <gshires@google.com> wrote:
>>
>> For JavaScript authors that do not make use of interim results, or only
>> want to show fewer and more stable interim results, the following
>> stabilityThreshold would allow the author to request this, which reduces
>> the bandwidth usage and the events fired (and thus also reduces compute and
>> power).  In addition, the stability attribute returned with interim results
>> would allow authors to process results according to their estimated
>> stability. For example, an author might choose to display final results in
>> black, fairly stable results in dark grey, and not very stable results in
>> light grey.
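
The display idea in that example could look roughly like this. It is only a sketch: the colourForResult helper, the 0.5 cut-off between "fairly stable" and "not very stable", and the colour names are all assumptions chosen for illustration.

```javascript
// Sketch only: pick a display colour from a result's estimated stability.
// The 0.5 cut-off and the colour names are illustrative assumptions.
function colourForResult(result) {
  if (result.final) {
    // Stability is not defined for final results, so check "final" first.
    return "black";
  }
  return result.stability >= 0.5 ? "darkgrey" : "lightgrey";
}

console.log(colourForResult({ transcript: "recognize speech", final: true }));              // black
console.log(colourForResult({ transcript: "recognize", final: false, stability: 0.8 }));    // darkgrey
console.log(colourForResult({ transcript: "wreck a nice", final: false, stability: 0.2 })); // lightgrey
```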
>>
>> I propose the following:
>>
>> Add to IDL for SpeechRecognitionResult
>>
>>     readonly attribute float stability;
>>
>> Add to 5.1.6 Speech Recognition Result definitions
>>
>> stability
>>
>>   The stability represents a numeric estimate between 0.0 and 1.0 of how
>> likely the recognition system is to change this interim result. A higher
>> number indicates the result is less likely to change.  This attribute is
>> not defined when the "final" attribute is true.
>>
>> Add to IDL for SpeechRecognition (the top level)
>>
>>     attribute float stabilityThreshold;
>>
>> Add to 5.1.1 Speech Recognition Attributes definitions:
>>
>> stabilityThreshold
>>
>>   This attribute controls how many interim results are returned. When set
>> to the value of 1.0, no interim results (only final results) will be
>> returned.  When set to 0.0, all interim results will be returned. Valid
>> values are in the range of 0.0 to 1.0 inclusive. The default value is 1.0.
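
For reference, the threshold semantics in this (since superseded) proposal can be modelled as below. The wording only pins down the two end points, so the comparison used in between, returning an interim result when its stability is at least the threshold, is an assumption, as are the passesThreshold and countReturned helpers and the sample data.

```javascript
// Sketch only: models the proposed stabilityThreshold semantics.
// Assumption: an interim result is returned when its stability is at
// least the threshold; the proposal only defines the 0.0 and 1.0 cases.
function passesThreshold(result, stabilityThreshold) {
  if (stabilityThreshold < 0.0 || stabilityThreshold > 1.0) {
    throw new RangeError("stabilityThreshold must be within 0.0 to 1.0");
  }
  if (result.final) {
    return true; // final results are always returned
  }
  if (stabilityThreshold === 1.0) {
    return false; // 1.0 means no interim results at all
  }
  return result.stability >= stabilityThreshold;
}

var results = [
  { transcript: "rec", stability: 0.2, final: false },
  { transcript: "recognize", stability: 0.7, final: false },
  { transcript: "recognize speech", final: true }
];

function countReturned(threshold) {
  return results.filter(function (r) {
    return passesThreshold(r, threshold);
  }).length;
}

console.log(countReturned(1.0)); // 1
console.log(countReturned(0.5)); // 2
console.log(countReturned(0.0)); // 3
```

A trivial UA that assigns every interim result the same stability, say 0.5, would under this rule return either all interim results or none, depending on which side of 0.5 the threshold falls.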
>>
>> /Glen Shires
>>
>
>
Received on Friday, 7 September 2012 08:05:04 UTC