Re: SpeechRecognitionEvent resultIndex / resultHistory

Looks good to me too.

On Fri, Aug 24, 2012 at 4:43 PM, Young, Milan <Milan.Young@nuance.com> wrote:
> I’ll go with that.
>
>
>
> From: Glen Shires [mailto:gshires@google.com]
> Sent: Friday, August 24, 2012 7:58 AM
>
>
> To: Young, Milan
> Cc: public-speech-api@w3.org
> Subject: Re: SpeechRecognitionEvent resultIndex / resultHistory
>
>
>
> Here is the new wording I propose for "results" (formerly named
> "resultHistory"). The only change from my last proposed wording is the
> addition of the last sentence.
>
>
>
>     "The array of all current recognition results for this session.
> Specifically all final results that have been returned, followed by the
> current best hypothesis for all interim results. It consists of zero or more
> final results followed by zero or more interim results. On subsequent
> SpeechRecognitionResultEvent events, interim results may be overwritten by a
> newer interim result or by a final result or may be removed (when at the end
> of the "results" array and the array length decreases). Final results cannot
> be overwritten or removed. All entries for indexes less than resultIndex
> must be identical to the array that was present when the last
> SpeechRecognitionResultEvent was raised.  All array entries (if any) for
> indexes equal to or greater than resultIndex that were present in the array
> when the last SpeechRecognitionResultEvent was raised are removed and
> overwritten with new results.  The length of the "results" array may
> increase or decrease, but cannot be less than resultIndex.  Note that when
> resultIndex == results.length, no new results are returned; this may occur
> when the array length decreases to remove one or more interim results."
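
The rules in this wording can be read as a small set of checks on each
event. The sketch below is one possible reading, not part of the proposal:
the function name checkResultsUpdate and the { transcript, final } shape of
each entry are assumptions made for the example. It returns null when an
update is consistent with the previous array and a short message otherwise.

```javascript
// Hypothetical helper: checks one event's "results" array against the array
// delivered with the previous event. Each entry is modeled (as an assumption)
// as { transcript: string, final: boolean }.
function checkResultsUpdate(previous, results, resultIndex) {
  // The length may grow or shrink, but never below resultIndex.
  if (resultIndex < 0 || results.length < resultIndex) {
    return "length is less than resultIndex";
  }
  // Entries below resultIndex must be identical to the previous array.
  for (let i = 0; i < resultIndex; ++i) {
    const before = previous[i];
    const after = results[i];
    if (!before ||
        before.transcript !== after.transcript ||
        before.final !== after.final) {
      return "entry " + i + " changed below resultIndex";
    }
  }
  // Entries at or above resultIndex are overwritten or removed, so none of
  // the previous entries in that range may have been final.
  for (let i = resultIndex; i < previous.length; ++i) {
    if (previous[i].final) {
      return "final entry " + i + " was overwritten or removed";
    }
  }
  // Zero or more final results followed by zero or more interim results.
  let seenInterim = false;
  for (const r of results) {
    if (!r.final) {
      seenInterim = true;
    } else if (seenInterim) {
      return "final result after an interim result";
    }
  }
  return null;
}
```

Under this reading, the case resultIndex == results.length passes all four
checks as long as only interim entries were dropped from the tail.
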
>
>
>
> /Glen Shires
>
>
>
> On Thu, Aug 23, 2012 at 9:55 PM, Young, Milan <Milan.Young@nuance.com>
> wrote:
>
> Thanks for the clarification, this looks good.  But I’m still a bit wary
> about the case where resultIndex == length.  At a minimum, we should add
> language warning that results[resultIndex] will not always return a valid
> element.  But even then, there’s a good chance that developers will miss
> that subtlety and simply rely on testing to make sure their app works.  The
> problem with that approach is that 99.9% of the time their bad assumption
> will hold true and they will probably miss the error only to find it later
> in production.
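
To make the risk concrete: in JavaScript, results[resultIndex] evaluates to
undefined when resultIndex == results.length, so reading a property from it
throws. A minimal defensive sketch follows; the helper name and the
"transcript" property are assumptions made for the example.

```javascript
// Hypothetical helper; `results` and `resultIndex` stand in for the event
// fields discussed in this thread.
function firstChangedTranscript(results, resultIndex) {
  // When resultIndex equals results.length there is no entry at that index:
  // the event only removed interim results from the tail.
  if (resultIndex >= results.length) {
    return null;
  }
  return results[resultIndex].transcript;
}
```
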
>
>
>
> What do you think about changing the wording of resultIndex to accommodate
> the exception of deleting the interim tail?  Could we perhaps add a new
> marker to signal a finalized state of the entire array?  I’m not married to
> either one of these ideas, just brainstorming.
>
>
>
> Thanks
>
>
>
>
>
> From: Glen Shires [mailto:gshires@google.com]
> Sent: Thursday, August 23, 2012 5:42 PM
>
>
> To: Young, Milan
> Cc: public-speech-api@w3.org
> Subject: Re: SpeechRecognitionEvent resultIndex / resultHistory
>
>
>
> Milan,
>
> Good points!  I changed the word "replaced" to "overwritten" and made a few
> other changes. Note that the case in which resultIndex equals the length of
> the array is useful when the last interim entry needs to be removed.  For
> example, suppose resultHistory represents this state:
>
>
>
> final: "To be"  final: " or not to be"  interim: " the"
>
>
>
> And then the recognizer determines that the interim result was not a
> continuation, but just superfluous noise, so it updates the state to:
>
>
>
> final: "To be"  final: " or not to be"
>
>
>
> To delete this last interim, it would send a SpeechRecognitionResultEvent
> with resultIndex = 2 and resultHistory.length = 2.  While this case may not
> ever occur with some recognizers, it's useful to support this case for any
> recognizers that require it.  Note also that the simple JavaScript loop to
> process results, that I suggested earlier, does not change, as it processes
> this case correctly as well:
>
>
>
> for (var i = resultIndex; i < resultHistory.length; ++i) {
>   // process resultHistory[i];
> }
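
A small simulation of the deletion case, as a sketch only. It assumes the
page keeps its own list of displayed transcripts and that each entry carries
a "transcript" string; applyEvent and those names are hypothetical.

```javascript
// Hypothetical handler: `displayed` is the page's own list of transcripts.
function applyEvent(displayed, resultIndex, resultHistory) {
  // Everything from resultIndex onward is stale, so drop it first.
  displayed.splice(resultIndex);
  // Same loop as above: it runs zero times when resultIndex equals the length.
  for (var i = resultIndex; i < resultHistory.length; ++i) {
    displayed.push(resultHistory[i].transcript);
  }
  return displayed.join("");
}

var shown = [];
// First event: two finals and one interim.
applyEvent(shown, 0, [
  { transcript: "To be", final: true },
  { transcript: " or not to be", final: true },
  { transcript: " the", final: false }
]); // returns "To be or not to be the"
// Second event: resultIndex = 2 and length = 2, so the interim is removed.
applyEvent(shown, 2, [
  { transcript: "To be", final: true },
  { transcript: " or not to be", final: true }
]); // returns "To be or not to be"
```

The truncation step is what removes the interim; the loop itself needs no
special case.
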
>
>
>
>
>
> Here is the slightly updated wording I propose for resultHistory:
>
>
>
>     "The array of all current recognition results for this session.
> Specifically all final results that have been returned, followed by the
> current best hypothesis for all interim results. It consists of zero or more
> final results followed by zero or more interim results. On subsequent
> SpeechRecognitionResultEvent events, interim results may be overwritten by a
> newer interim result or by a final result or may be removed (when at the end
> of the resultHistory array and the array length decreases). Final results
> cannot be overwritten or removed. All entries for indexes less than
> resultIndex must be identical to the array that was present when the last
> SpeechRecognitionResultEvent was raised.  All array entries (if any) for
> indexes equal to or greater than resultIndex that were present in the array
> when the last SpeechRecognitionResultEvent was raised are removed and
> overwritten with new results.  The length of the resultHistory array may
> increase or decrease, but cannot be less than resultIndex."
>
>
>
> /Glen Shires
>
>
>
> On Thu, Aug 23, 2012 at 4:50 PM, Young, Milan <Milan.Young@nuance.com>
> wrote:
>
> This is a step in the right direction, but I still think the wording for
> resultHistory needs work.  A couple concrete objections:
>
>   *  The opening sentence is misleading because resultHistory doesn’t
> capture all results in this session, but rather the current best hypothesis
> of results over the session.
>
>   * You have “the length of the array cannot be less than the resultIndex”,
> but doesn’t it always have to be greater?
>
>
>
> My last objection is fuzzy: I just found that paragraph hard to read.  I
> think the confusion centered on the use of the word “replaced”.  I found it
> odd because the event is delivering a “free standing” array, not a diff.  I
> understand that the underlying implementation may take a different view, but
> we are describing an API here, not a cookbook for implementers.  I’d be
> happy to suggest an alternative, but being that you and Hans are editors I
> figured I’d give you first shot.
>
>
>
> Thanks
>
>
>
>
>
> From: Glen Shires [mailto:gshires@google.com]
> Sent: Thursday, August 23, 2012 2:06 AM
> To: Young, Milan
> Cc: public-speech-api@w3.org
> Subject: Re: SpeechRecognitionEvent resultIndex / resultHistory
>
>
>
> Milan,
>
> Yes, I agree the wording needs to be clarified.  I also agree that "the case
> of correcting a previous interim while deleting the tail of the result list"
> is a reasonably common operation, and that case can be implemented with the
> following definitions.
>
>
>
> I propose the following wording for resultHistory:
>
>
>
>     "The array of all of the recognition results that have so far been
> returned as part of this session. It consists of zero or more final results
> followed by zero or more interim results. On subsequent
> SpeechRecognitionResultEvent events, interim results may be replaced by a
> newer interim result or by a final result. Final results cannot be replaced.
> All entries for indexes less than resultIndex must be identical to the array
> that was present when the last SpeechRecognitionResultEvent was raised.  All
> array entries for indexes equal to or greater than resultIndex replace any
> prior entries that were present in the array (if any) when the last
> SpeechRecognitionResultEvent was raised.  The length of the resultHistory
> array may increase or decrease, but cannot be less than resultIndex."
>
>
>
> I propose the following wording for resultIndex:
>
>
>
>     "The resultIndex must be set to the lowest index in the resultHistory
> array that has changed. When continuous was false, the resultIndex must
> always be 0."
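
On the recognizer side, "the lowest index that has changed" could be
computed by comparing the previous and current arrays. This is only a
sketch: lowestChangedIndex and the { transcript, final } entry shape are
assumptions, and the continuous == false rule is not modeled.

```javascript
// Hypothetical helper: returns the lowest index at which `current` differs
// from `previous`. Entries are modeled as { transcript, final }.
function lowestChangedIndex(previous, current) {
  var shared = Math.min(previous.length, current.length);
  for (var i = 0; i < shared; ++i) {
    if (previous[i].transcript !== current[i].transcript ||
        previous[i].final !== current[i].final) {
      return i;
    }
  }
  // No shared entry differs: the only change (if any) is a tail that grew
  // or shrank, and it starts at the shorter array's length.
  return shared;
}
```

When only the interim tail is removed, this returns the new array length,
which is the resultIndex == length case.
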
>
>
>
> I propose to eliminate the resultdeleted event because it results in
> inconsistent states, and because the above definition of resultHistory /
> resultIndex makes the resultdeleted event superfluous.
>
>
>
> I propose to eliminate SpeechRecognitionResultEvent.result because
> SpeechRecognitionResultEvent may (and often does) return multiple results.
> The JavaScript author can easily process all new results with code such as:
>
>
>
> for (var i = resultIndex; i < resultHistory.length; ++i) {
>   // process resultHistory[i];
> }
>
>
>
>
> /Glen Shires
>
> On Tue, Aug 21, 2012 at 10:31 PM, Young, Milan <Milan.Young@nuance.com>
> wrote:
>
> I agree with the spirit of the change, but I’m unsure about the wording.
>
>
>
> The result deleted event says “The resultIndex of this event will be the
> element that was deleted” and your text says “The resultIndex must be set to
> the lowest index in the resultHistory array that has changed.”  This
> combination would seem to preclude the case of correcting a previous interim
> while deleting the tail of the result list, which I would guess is a
> reasonably common operation.
>
>
>
>
>
> From: Glen Shires [mailto:gshires@google.com]
> Sent: Tuesday, August 21, 2012 8:09 AM
> To: public-speech-api@w3.org
> Subject: SpeechRecognitionEvent resultIndex / resultHistory
>
>
>
> As speech is processed, typically a portion of (but not all of) the interim
> results become final.  As portions become final, the interim hypotheses
> typically also change.  For example, the following sequence might occur.
> (Each line below represents one point in time.)
>
>
>
> interim: "Tube"
>
>
>
> interim: "To be born"
>
>
>
> interim: "To be or not to be"
>
>
>
> final: "To be"  interim: " or not to be there"
>
>
>
> final: "To be"  final: " or not to be"  interim: " that is"
>
>
>
> final: "To be"  final: " or not to be"  interim: " that is"  interim: " the
> question"
>
>
>
> final: "To be"  final: " or not to be"  final: " that is the"  interim: "
> question what"
>
>
>
> final: "To be"  final: " or not to be"  final: " that is the"  final: "
> question."  interim: " Weather today"
>
>
>
> final: "To be"  final: " or not to be"  final: " that is the"  final: "
> question."  interim: " Whether tis nobler"
>
>
>
> final: "To be"  final: " or not to be"  final: " that is the"  final: "
> question."  final: " Whether"  interim: " tis nobler"
>
>
>
> final: "To be"  final: " or not to be"  final: " that is the"  final: "
> question."  final: " Whether"  final: " tis nobler"
>
>
>
>
>
> Our current spec doesn't support such simultaneous changes to both interim
> and final results. Instead, each SpeechRecognitionEvent returns only a
> single "final" or a single "interim" result.  I propose a simple change to
> enable SpeechRecognitionEvent to return multiple "final" and "interim"
> results. I believe this has the following advantages:
>
>
>
> - Provides more accurate results (it avoids inconsistent states in which the
> "final" has been returned but the "interim" has not yet been updated).
>
>
>
> - Provides more efficient processing (it reduces the number of events that
> JavaScript needs to respond to and, more importantly, it avoids the UI
> rendering of those inconsistent states).
>
>
>
> - It simplifies the JavaScript coding (by not having to detect or compensate
> for inconsistent states).
>
>
>
>
>
> Therefore, I propose a slight re-definition of resultIndex:
>
>
>
>     "The resultIndex must be set to the lowest index in the resultHistory
> array that has changed.  Entries at greater indexes in the resultHistory
> array (if any) may also have changed."
>
>
>
> followed by the rest of the existing definition of resultIndex:
>
>
>
>     "The resultIndex may refer to a previous occupied array index from a
> previous SpeechRecognitionResultEvent. When this is the case this new result
> overwrites the earlier result and is a more accurate result; however, when
> this is the case the previous value must not have been a final result. When
> continuous was false, the resultIndex must always be 0."
>
>
>
>
>
> And a slight re-definition of resultHistory:
>
>
>
>     "The array of all of the recognition results that have been returned as
> part of this session. All entries for indexes less than resultIndex must be
> identical to the array that was present when the last
> SpeechRecognitionResultEvent was raised."
>
>
>
>
>
> To illustrate, the fourth line in our example above would return the
> SpeechRecognitionResultEvent with
>
>   resultIndex = 0
>
>   resultHistory[0] = "To be", final = true,   resultHistory[1] = " or not to
> be there", final = false
>
>
>
> and the fifth line in our example above would return the
> SpeechRecognitionResultEvent with
>
>   resultIndex = 1
>
>   resultHistory[0] = "To be", final = true,   resultHistory[1] = " or not to
> be", final = true,   resultHistory[2] = " that is", final = false
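
The two events above can be written out as plain objects to show how a page
might separate committed text from the current hypothesis. This is a sketch
under assumptions: the "transcript" property and the splitTranscript helper
are hypothetical, and only resultIndex, resultHistory and the final flag
come from the example.

```javascript
// Hypothetical helper: concatenates final and interim transcripts separately.
function splitTranscript(resultHistory) {
  var finalText = "";
  var interimText = "";
  for (var i = 0; i < resultHistory.length; ++i) {
    if (resultHistory[i].final) {
      finalText += resultHistory[i].transcript;
    } else {
      interimText += resultHistory[i].transcript;
    }
  }
  return { finalText: finalText, interimText: interimText };
}

// Fourth line of the example.
var fourth = {
  resultIndex: 0,
  resultHistory: [
    { transcript: "To be", final: true },
    { transcript: " or not to be there", final: false }
  ]
};

// Fifth line of the example.
var fifth = {
  resultIndex: 1,
  resultHistory: [
    { transcript: "To be", final: true },
    { transcript: " or not to be", final: true },
    { transcript: " that is", final: false }
  ]
};

splitTranscript(fifth.resultHistory);
// returns { finalText: "To be or not to be", interimText: " that is" }
```
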
>
>
>
>
>
> /Glen Shires
>
>
>
>
>
>
>
>

Received on Friday, 24 August 2012 16:12:00 UTC