[whatwg] Web API for speech recognition and synthesis

From: Ian Hickson <ian@hixie.ch>
Date: Tue, 15 Dec 2009 20:37:36 +0000 (UTC)
Message-ID: <Pine.LNX.4.62.0912152036220.15825@hixie.dreamhostps.com>

On Tue, 15 Dec 2009, Bjorn Bringert wrote:
> 
> - A general microphone API + streaming API + audio tag
>   - Pro: Useful for non-speech recognition / synthesis applications.
>          E.g. audio chat, sound recording.
>   - Pro: Allows JavaScript libraries for third-party network speech services.
>          E.g. an AJAX API for Google's speech services. Web app developers
>          that don't have their own speech servers could use that.
>   - Pro: Consistent recognition / synthesis user experience across
>          user agents in the same web app.
>   - Con: No support for on-device recognition / synthesis, only
>          network services.
>   - Con: Varying recognition / synthesis user experience across
>          different web apps in a single user agent.
>   - Con: Possibly higher overhead because the audio data needs to
>          pass through JavaScript.
>   - Con: Requires dealing with audio encodings, endpointing, buffer
>          sizes, etc. in the microphone API.

FWIW I've started looking at this kind of thing in general (for audio and 
video -- see <device> in the spec for the first draft ideas), since it'll 
be required for other things as well. However, that shouldn't be taken as 
a sign that the other approach shouldn't also be examined.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Tuesday, 15 December 2009 12:37:36 UTC
