Re: Web app API: speech recognition events

From today's call:

- Agreement: Add a nomatch event, with interface
SpeechInputResultEvent, and remove the SPEECH_INPUT_ERR_NO_MATCH error
code.

- Not agreed: Maybe add a nospeech event, with interface Event and
remove the SPEECH_INPUT_ERR_NO_SPEECH error code.

- Not agreed: Maybe add an aborted event, with interface Event and
remove the SPEECH_INPUT_ERR_ABORTED error code.
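
To illustrate what the agreed nomatch change could mean for application
code, here is a rough TypeScript sketch. It is not part of the draft: the
FinalEvent type, the nextAction function and the returned action strings
are hypothetical names made up for this example. Only the event names, the
result and error attributes, and the value of SPEECH_INPUT_ERR_NETWORK
come from the IDL quoted below. The contents of the result are still an
open work item, so the sketch treats it as opaque.

```typescript
// Hypothetical model of the ways a request can finish once nomatch is an
// event rather than an error code. "result" and "nomatch" both use
// SpeechInputResultEvent, so both carry a result attribute.
type FinalEvent =
  | { type: "result"; result: unknown }
  | { type: "nomatch"; result: unknown }
  | { type: "error"; error: { code: number; message: string } };

// Code value taken from the draft IDL.
const SPEECH_INPUT_ERR_NETWORK = 5;

// Hypothetical helper: decide what the application does next.
function nextAction(event: FinalEvent): string {
  switch (event.type) {
    case "result":
      return "use-result";
    case "nomatch":
      // No longer routed through the error handler.
      return "ask-user-to-repeat";
    case "error":
      return event.error.code === SPEECH_INPUT_ERR_NETWORK
        ? "retry-later"
        : "report-error";
  }
}
```

With this shape, the error branch only has to deal with genuine failures,
and the "speech was heard but not recognized" case gets its own handler.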

/Bjorn

On Thu, Jun 30, 2011 at 12:57 PM, Bjorn Bringert <bringert@google.com> wrote:
> Here is a first draft of IDL and semantics for the speech recognition
> events in the web app API.
>
> This does not include the events needed for continuous recognition, as
> that is part of Debbie's work item.
>
> == IDL ==
>
> interface SpeechInputRequest {
>   // ... other speech recognition functionality ...
>
>   attribute Function onaudiostart;
>   attribute Function onsoundstart;
>   attribute Function onspeechstart;
>   attribute Function onspeechend;
>   attribute Function onsoundend;
>   attribute Function onaudioend;
>   attribute Function onresult;
>   attribute Function onerror;
> };
> SpeechInputRequest implements EventTarget;
>
> interface SpeechInputResultEvent : Event {
>   readonly attribute SpeechInputResult result;
> };
>
> interface SpeechInputErrorEvent : Event {
>   readonly attribute SpeechInputError error;
> };
>
> interface SpeechInputError {
>  const unsigned short SPEECH_INPUT_ERR_OTHER = 0;
>
>  // The following codes have not been agreed. I include them anyway to
>  // have a list to start with.
>  // This is roughly a union of the error code sets from the Microsoft
>  // and Google proposals.
>
>  // Speech was detected, but it could not be recognized.
>  const unsigned short SPEECH_INPUT_ERR_NO_MATCH = 1;
>  // No speech was detected.
>  const unsigned short SPEECH_INPUT_ERR_NO_SPEECH = 2;
>  // Speech input was aborted by calling cancel(), or by some
>  // UA-specific behavior such as UI that lets the user cancel speech
>  // input.
>  const unsigned short SPEECH_INPUT_ERR_ABORTED = 3;
>  // Audio capture failed.
>  const unsigned short SPEECH_INPUT_ERR_AUDIO_CAPTURE = 4;
>  // Some network communication that was required to complete the
>  // recognition failed.
>  const unsigned short SPEECH_INPUT_ERR_NETWORK = 5;
>  // The user agent is not allowing any speech input to occur for
>  // reasons of security, privacy or user preference.
>  const unsigned short SPEECH_INPUT_ERR_NOT_ALLOWED = 6;
>  // The user agent is not allowing the speech service requested by the
>  // web application, but would allow some other speech service, either
>  // because the user agent doesn't support the selected one or for
>  // reasons of security, privacy or user preference.
>  const unsigned short SPEECH_INPUT_ERR_SERVICE_NOT_ALLOWED = 7;
>  // There was an error in the speech recognition grammar.
>  const unsigned short SPEECH_INPUT_ERR_BAD_GRAMMAR = 8;
>  const unsigned short SPEECH_INPUT_ERR_LANGUAGE_NOT_SUPPORTED = 9;
>
>  // One of the constants above.
>  readonly attribute unsigned short code;
>  // The message attribute must return an error message describing the
>  // details of the error encountered. The message content is
>  // implementation specific. This attribute is primarily intended for
>  // debugging, and developers should not use it directly in their
>  // application user interface.
>  readonly attribute DOMString message;
> };
>
> interface SpeechInputResult {
>   // Debbie's work item
> };
>
>
> == Description ==
>
> The DOM Level 2 Event Model is used for speech recognition events. The
> methods in the EventTarget interface should be used for registering
> event listeners. The SpeechInputRequest interface also contains
> convenience attributes for registering a single event handler for each
> event type.
>
> For all these events, the timeStamp attribute defined in the DOM Level
> 2 Event interface must be set to the best possible estimate of when
> the real-world event which the event object represents occurred.
>
> Unless specified below, the ordering of the different events is
> undefined. For example, some implementations may fire audioend before
> speechstart or speechend if the audio detector is client-side and the
> speech detector is server-side.
>
>
> == List of events ==
>
> For each event, we list the name, the interface of the event object,
> and a description.
>
> audiostart, interface: Event
> Fired when the user agent has started to capture audio.
>
> soundstart, interface: Event
> Some sound, possibly speech, has been detected. This must be fired
> with low latency, e.g. by using a client-side energy detector.
>
> speechstart, interface: Event
> The speech that will be used for speech recognition has started.
>
> speechend, interface: Event
> The speech that will be used for speech recognition has ended.
> speechstart must always have been fired before speechend.
>
> soundend, interface: Event
> Some sound is no longer detected. This must be fired with low latency,
> e.g. by using a client-side energy detector. soundstart must always
> have been fired before soundend.
>
> audioend, interface: Event
> Fired when the user agent has finished capturing audio. audiostart
> must always have been fired before audioend.
>
> result, interface: SpeechInputResultEvent
> Fired when the speech recognizer returns a final result with at least
> one recognition hypothesis. The result field in the event contains the
> speech recognition result. All the following events must have been
> fired before result is fired: audiostart, soundstart, speechstart,
> speechend, soundend, audioend.
>
> error, interface: SpeechInputErrorEvent
> Fired when a speech recognition error occurs. The error attribute is
> set to a SpeechInputError object.
> After an error event is fired, no further events will be fired for the
> given speech input request.
>
>
>
>
> --
> Bjorn Bringert
> Google UK Limited, Registered Office: Belgrave House, 76 Buckingham
> Palace Road, London, SW1W 9TQ
> Registered in England Number: 3977902
>
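
The ordering rules in the list of events above can be written down as a
small checker, which may be useful when testing an implementation. This is
only a sketch under my reading of the draft: findOrderingViolations and
REQUIRED_BEFORE are hypothetical names, and the checker covers only the
constraints stated in the text (each end event after its start event, all
six capture events before result, nothing after error). Everything else is
left unordered, as the description says.

```typescript
// Event names from the draft's list of events.
type SpeechEventName =
  | "audiostart" | "soundstart" | "speechstart"
  | "speechend" | "soundend" | "audioend"
  | "result" | "error";

// For each event, the events that must already have fired.
const REQUIRED_BEFORE: Partial<Record<SpeechEventName, SpeechEventName[]>> = {
  speechend: ["speechstart"],
  soundend: ["soundstart"],
  audioend: ["audiostart"],
  result: [
    "audiostart", "soundstart", "speechstart",
    "speechend", "soundend", "audioend",
  ],
};

// Hypothetical helper: returns one message per violated constraint.
function findOrderingViolations(events: SpeechEventName[]): string[] {
  const seen = new Set<SpeechEventName>();
  const violations: string[] = [];
  let errorFired = false;

  for (const name of events) {
    // After an error event, no further events may be fired.
    if (errorFired) {
      violations.push(`${name} fired after error`);
    }
    for (const prerequisite of REQUIRED_BEFORE[name] ?? []) {
      if (!seen.has(prerequisite)) {
        violations.push(`${name} fired before ${prerequisite}`);
      }
    }
    seen.add(name);
    if (name === "error") {
      errorFired = true;
    }
  }
  return violations;
}
```

Note that a sequence in which audioend comes before speechstart passes the
check, which matches the client-side audio detector / server-side speech
detector example in the description.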


Received on Thursday, 30 June 2011 17:42:01 UTC