SpeechRecognitionEvent resultIndex / resultHistory

As speech is processed, typically some (but not all) of the interim
results become final.  As portions become final, the remaining interim
hypotheses typically change as well.  For example, the following sequence
might occur.  (Each line below represents one point in time.)

interim: "Tube"

interim: "To be born"

interim: "To be or not to be"

final: "To be"  interim: " or not to be there"

final: "To be"  final: " or not to be"  interim: " that is"

final: "To be"  final: " or not to be"  interim: " that is"  interim: " the
question"

final: "To be"  final: " or not to be"  final: " that is the"  interim: "
question what"

final: "To be"  final: " or not to be"  final: " that is the"  final: "
question."  interim: " Weather today"

final: "To be"  final: " or not to be"  final: " that is the"  final: "
question."  interim: " Whether tis nobler"

final: "To be"  final: " or not to be"  final: " that is the"  final: "
question."  final: " Whether"  interim: " tis nobler"

final: "To be"  final: " or not to be"  final: " that is the"  final: "
question."  final: " Whether"  final: " tis nobler"


Our current spec doesn't support such simultaneous changes to both interim
and final results. Instead, each SpeechRecognitionEvent returns only a
single "final" or a single "interim" result.  I propose a simple change to
enable SpeechRecognitionEvent to return multiple "final" and "interim"
results. I believe this has the following advantages:

- Provides more accurate results (it avoids inconsistent states in which
the "final" has been returned but the "interim" has not yet been updated).

- Provides more efficient processing (it reduces the number of events that
JavaScript needs to respond to and, more importantly, it avoids the UI
rendering of those inconsistent states).

- Simplifies the JavaScript code (no need to detect or compensate for
inconsistent states).


Therefore, I propose a slight re-definition of resultIndex:

    "The resultIndex must be set to the lowest index in
the resultHistory array that has changed.  Entries at greater indexes in
the resultHistory array (if any) may also have changed."

followed by the rest of the existing definition of resultIndex:

    "The resultIndex may refer to a previous occupied array index from a
previous SpeechRecognitionResultEvent. When this is the case this new
result overwrites the earlier result and is a more accurate result;
however, when this is the case the previous value must not have been a
final result. When continuous was false, the resultIndex must always be 0."
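
As a rough sketch of what this re-definition would mean for a page script:
the script keeps its own copy of the results and, on each event, replaces
everything from resultIndex upward in one step. The RecResult and RecEvent
shapes and the applyEvent/render helpers below are hypothetical names used
only for illustration; they are not the spec's interfaces.

```typescript
// Hypothetical shapes for illustration only; the spec's interfaces may differ.
interface RecResult {
  transcript: string;
  final: boolean;
}

interface RecEvent {
  resultIndex: number;        // lowest index in resultHistory that changed
  resultHistory: RecResult[]; // all results returned so far in this session
}

// Keep local entries below resultIndex and take every entry from resultIndex
// upward out of the event, since any of those may have changed.
function applyEvent(local: RecResult[], ev: RecEvent): RecResult[] {
  return local
    .slice(0, ev.resultIndex)
    .concat(ev.resultHistory.slice(ev.resultIndex));
}

// Split the merged results into committed text and still-tentative text,
// so the UI can draw both from a single consistent snapshot.
function render(results: RecResult[]): { finalText: string; interimText: string } {
  let finalText = "";
  let interimText = "";
  for (const r of results) {
    if (r.final) {
      finalText += r.transcript;
    } else {
      interimText += r.transcript;
    }
  }
  return { finalText, interimText };
}
```

Because the final and interim entries arrive in the same event, render never
sees a state in which a final has landed but the interim text is stale.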


And a slight re-definition of resultHistory:

    "The array of all of the recognition results that have been returned as
part of this session. All entries for indexes less than resultIndex must be
identical to the array that was present when the last
SpeechRecognitionResultEvent was raised."
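
The two constraints above (entries below resultIndex are unchanged, and the
entry being overwritten at resultIndex was not final) could be checked
mechanically. The following is only a sketch under my own assumptions:
RecResult and checkEvent are hypothetical names, and the check covers just
the rules stated in this message.

```typescript
// Hypothetical shape for illustration only.
interface RecResult {
  transcript: string;
  final: boolean;
}

// Returns a list of rule violations; an empty list means the new event is
// consistent with the previous resultHistory.
function checkEvent(
  prev: RecResult[],
  resultIndex: number,
  next: RecResult[]
): string[] {
  const problems: string[] = [];

  // Entries below resultIndex must be identical to the previous array.
  for (let i = 0; i < resultIndex; i++) {
    const a = prev[i];
    const b = next[i];
    if (!a || !b || a.transcript !== b.transcript || a.final !== b.final) {
      problems.push("entry " + i + " below resultIndex differs from the previous event");
    }
  }

  // If resultIndex overwrites an existing entry, that entry must not have
  // been final.
  const overwritten = prev[resultIndex];
  if (overwritten && overwritten.final) {
    problems.push("entry " + resultIndex + " was final and must not be overwritten");
  }

  return problems;
}
```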


To illustrate, the fourth line in our example above would return
the SpeechRecognitionResultEvent with
  resultIndex = 0
  resultHistory[0] = "To be", final = true
  resultHistory[1] = " or not to be there", final = false

and the fifth line in our example above would return the
SpeechRecognitionResultEvent with
  resultIndex = 1
  resultHistory[0] = "To be", final = true
  resultHistory[1] = " or not to be", final = true
  resultHistory[2] = " that is", final = false
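
The resultIndex values in these two illustrations follow directly from the
"lowest index that has changed" rule. A minimal sketch that computes them by
comparing consecutive snapshots (RecResult and lowestChangedIndex are
hypothetical names, and a recognizer would not necessarily compute the index
this way):

```typescript
// Hypothetical shape for illustration only.
interface RecResult {
  transcript: string;
  final: boolean;
}

// Lowest index at which the two arrays differ (including an index present in
// only one of them), or -1 if the arrays are identical.
function lowestChangedIndex(prev: RecResult[], next: RecResult[]): number {
  const n = Math.max(prev.length, next.length);
  for (let i = 0; i < n; i++) {
    const a = prev[i];
    const b = next[i];
    if (!a || !b || a.transcript !== b.transcript || a.final !== b.final) {
      return i;
    }
  }
  return -1;
}

// Snapshots for the third, fourth and fifth lines of the example above.
const third: RecResult[] = [{ transcript: "To be or not to be", final: false }];
const fourth: RecResult[] = [
  { transcript: "To be", final: true },
  { transcript: " or not to be there", final: false },
];
const fifth: RecResult[] = [
  { transcript: "To be", final: true },
  { transcript: " or not to be", final: true },
  { transcript: " that is", final: false },
];

console.log(lowestChangedIndex(third, fourth)); // 0
console.log(lowestChangedIndex(fourth, fifth)); // 1
```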


/Glen Shires

Received on Tuesday, 21 August 2012 15:09:53 UTC