about Microphone API from Olli Pettay on 2011-04-07 (public-xg-htmlspeech@w3.org from April 2011)

From: Olli Pettay <Olli.Pettay@helsinki.fi>
Date: Thu, 07 Apr 2011 12:19:21 -0700
To: "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
Message-ID: <4D9E0E39.20706@helsinki.fi>

Hi,

As the last or almost last comment in the conf call, there was something
about a microphone API.

As Dan mentioned, there has been lots of work happening in
RTC and related areas. The HTML spec (WhatWG) now has a
proposal for audio/video conferencing, but there
are also other proposals.
One about audio handling (not about communication) is
https://wiki.mozilla.org/MediaStreamAPI

For handling audio and video it seems that all
the proposals are using some kind of Stream object.
So, if the recognizer API used a Stream as its input,
we wouldn't need to care about a microphone API. This approach would
also let us rely on the other specs to handle many
security and privacy related issues.
(Of course we'd need to choose which Stream API to use, but
  that is a broader problem atm. Browsers will need to implement just
  one API, but what that will look like exactly isn't clear yet.)

The API could be, for example, close to SpeechRequest/SpeechRecognizer,
but instead of using the default microphone, or CaptureAPI,
there could be an attribute for the Stream.

[Constructor(in optional DOMString recognizerURI,
              in optional DOMString recognizerParams)]
interface Recognizer {
   attribute Stream input;
   ....

This would allow using all sorts of audio streams, not only the microphone.
(For example, Streams from other users via VoIP/RTC, or
  audio from a video so that a web app could do automatic subtitling.
  I know, these examples are something for the future.)
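
To make the idea a bit more concrete, here is a rough TypeScript sketch of
a recognizer that takes its audio from a stream attribute instead of
opening the microphone itself. Everything in it is an assumption made for
illustration: AudioStream stands in for whichever Stream API gets chosen,
ChunkStream is a made-up in-memory source, and consume() only counts bytes,
it does no recognition. None of these names come from an existing proposal.

```typescript
// Hypothetical stand-in for whichever Stream API is eventually chosen.
interface AudioStream {
  readonly label: string;     // description of the source, for illustration
  read(): Uint8Array | null;  // next chunk of audio, or null when exhausted
}

// Hypothetical in-memory source. In a browser this role would be played by
// a microphone capture, an RTC peer, or the audio track of a video.
class ChunkStream implements AudioStream {
  readonly label: string;
  private readonly chunks: Uint8Array[];
  private index = 0;

  constructor(label: string, chunks: Uint8Array[]) {
    this.label = label;
    this.chunks = chunks;
  }

  read(): Uint8Array | null {
    const chunk = this.chunks[this.index];
    if (chunk === undefined) {
      return null;
    }
    this.index += 1;
    return chunk;
  }
}

// Mirrors the shape of the IDL fragment: two optional constructor
// arguments and an `input` attribute holding the stream.
class Recognizer {
  readonly recognizerURI: string | undefined;
  readonly recognizerParams: string | undefined;
  input: AudioStream | null = null;

  constructor(recognizerURI?: string, recognizerParams?: string) {
    this.recognizerURI = recognizerURI;
    this.recognizerParams = recognizerParams;
  }

  // Assumption: drains the input and returns the number of bytes that
  // would have been handed to a recognition engine.
  consume(): number {
    if (this.input === null) {
      throw new Error("no input stream set");
    }
    let total = 0;
    for (let c = this.input.read(); c !== null; c = this.input.read()) {
      total += c.length;
    }
    return total;
  }
}

// The same recognizer works with any source that implements AudioStream.
const mic = new ChunkStream("microphone", [new Uint8Array(4), new Uint8Array(6)]);
const videoAudio = new ChunkStream("video soundtrack", [new Uint8Array(3)]);

const recognizer = new Recognizer();
recognizer.input = mic;
const micBytes = recognizer.consume();
recognizer.input = videoAudio;
const videoBytes = recognizer.consume();
```

The point of the sketch is only that the recognizer never needs to know
where the audio came from; permission prompts and other privacy handling
would stay with whatever spec produces the stream.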



-Olli

Received on Thursday, 7 April 2011 19:19:54 UTC