- From: Dominic Mazzoni <dmazzoni@google.com>
- Date: Thu, 19 Jul 2012 10:38:56 -0700
- To: Peter Beverloo <beverloo@google.com>
- Cc: public-speech-api@w3.org
- Message-ID: <CAFz-FYzKXPYNdN=2DDmPs6nBPC1=JbB_O5LShw5WEj8osPw3Zg@mail.gmail.com>
I love the idea of standardizing interfaces and combining technologies.

I only have one potential concern: the license for some text-to-speech
engines specifically forbids saving the audio data and playing it back
later, or allows it only for personal use; for example, streaming the
speech to multiple users is not allowed without purchasing a more
expensive license.

- Dominic

On Wed, Jun 13, 2012 at 7:49 AM, Peter Beverloo <beverloo@google.com> wrote:
> 2) Storing and processing text-to-speech fragments.
>
> Rather than mandating immediate output of the synthesized audio stream,
> consideration should be given to introducing an "outputStream" property on
> a TextToSpeech object which provides a MediaStream object. This would
> allow the synthesized stream to be played through the <audio> element,
> processed through the Web Audio API, or even stored locally for caching,
> in case the user is on a device which is not always connected to the
> internet (and no local synthesizer is available). Furthermore, this would
> allow websites to store the synthesized audio as a wave file and save it
> on the server, allowing it to be re-used by user agents or other clients
> which do not provide an implementation.
>
> The Web platform gains its power from the ability to combine technologies,
> and I think it would be great to see the Speech API playing a role in that.
>
> Best,
> Peter
>
> [1]
> http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html#speechreco-section
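A minimal sketch of how the proposed "outputStream" property might be used,
assuming a hypothetical TextToSpeech object with an outputStream exposing a
MediaStream; none of these names appear in the current draft and are
illustrative only:

  // Hypothetical TextToSpeech object (not in the current draft); its
  // outputStream property is assumed to expose a MediaStream.
  var tts = new TextToSpeech();
  tts.text = "Hello, world";

  // Play the synthesized audio through an <audio> element...
  var audio = document.querySelector("audio");
  audio.srcObject = tts.outputStream;
  audio.play();

  // ...or route it through the Web Audio API for further processing.
  var context = new AudioContext();
  var source = context.createMediaStreamSource(tts.outputStream);
  source.connect(context.destination);

  // Start synthesis (hypothetical method name).
  tts.play();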
Received on Thursday, 19 July 2012 17:39:24 UTC