[whatwg] Web API for speech recognition and synthesis

From: Ian McGraw <imcgraw@mit.edu>
Date: Tue, 15 Dec 2009 15:53:55 -0500
Message-ID: <f907fd300912151253p4df7008bxb85b1e45ecb158c@mail.gmail.com>
Great!  As I've said, I'm definitely biased towards this approach.  As Bjorn
hinted, AJAX APIs could be developed with all sorts of interesting features
that will never make it down into the browser, e.g. pronunciation
assessment, speech therapy, all those lie-detector apps for your phone :-).
Still, I think that we're missing the biggest pro:

- Pro:  Speech recognition technology is data-driven.  Improvements in the
underlying technology are far more likely to occur with a network-driven
approach.

To be fair, with that, you have to add a con:

- Con:  Less privacy.

-Ian

On Tue, Dec 15, 2009 at 3:37 PM, Ian Hickson <ian at hixie.ch> wrote:

> On Tue, 15 Dec 2009, Bjorn Bringert wrote:
> >
> > - A general microphone API + streaming API + audio tag
> >   - Pro: Useful for non-speech recognition / synthesis applications.
> >            E.g. audio chat, sound recording.
> >   - Pro: Allows JavaScript libraries for third-party network speech
> >            services.
> >            E.g. an AJAX API for Google's speech services. Web app
> >            developers that don't have their own speech servers could
> >            use that.
> >   - Pro: Consistent recognition / synthesis user experience across
> >             user agents in the same web app.
> >   - Con: No support for on-device recognition / synthesis, only
> >             network services.
> >   - Con: Varying recognition / synthesis user experience across
> >             different web apps in a single user agent.
> >   - Con: Possibly higher overhead because the audio data needs to
> >             pass through JavaScript.
> >   - Con: Requires dealing with audio encodings, endpointing, buffer
> >             sizes, etc. in the microphone API.
>
> FWIW I've started looking at this kind of thing in general (for audio and
> video -- see <device> in the spec for the first draft ideas), since it'll
> be required for other things as well. However, that shouldn't be taken as
> a sign that the other approach shouldn't also be examined.
>
> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
>
Received on Tuesday, 15 December 2009 12:53:55 UTC
