Re: Suggested change to <audio/> from Shane Smith on 2006-04-27 (www-voice@w3.org from April to June 2006)

From: Shane Smith <safarishane@gmail.com>
Date: Thu, 27 Apr 2006 13:10:07 -0500
To: "Dan Evans" <devans@invores.com>
Cc: www-voice@w3.org
Message-ID: <8fc15e140604271110w7245044cgb60a094843b92370@mail.gmail.com>

Dan,

You are right, but I'm also worried about HTTP caching. This way, each
different user id would have to be fetched from the origin server,
whereas if you are just stacking/arranging prompts on the client
side, you're asking for pre-cached audio.

If 1.wav, 2.wav, and 3.wav are considered fresh by the client browser,
then 123, 213, 312, 111, etc. can all be read without an HTTP fetch.  If
it's CGI, each one 'could' be cached, but you'd have to cache every
permutation.
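
To put rough numbers on the caching point, here is a minimal Python
sketch. It assumes six-digit account numbers and one recording per
decimal digit (0.wav through 9.wav); the function names are made up
for illustration.

```python
# Rough cache-size comparison; assumes one recording per decimal digit.
DIGITS = 10  # hypothetical prompt set: 0.wav .. 9.wav


def per_digit_cache_entries() -> int:
    """Distinct URLs to cache when prompts are stacked on the client."""
    return DIGITS


def composite_cache_entries(length: int) -> int:
    """Distinct URLs possible when the server concatenates the audio:
    one per digit string of the given length."""
    return DIGITS ** length


print(per_digit_cache_entries())     # 10
print(composite_cache_entries(6))    # 1000000
```

So the client-side approach needs at most ten fresh cache entries,
while the CGI approach could need up to a million.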

Thanks,
Shane


On 4/27/06, Dan Evans <devans@invores.com> wrote:
>
> In essence, you want to create a composite-audio object that will be
> treated by the language as a single URI.  Although not a language
> solution, you could do:
>
> <audio src="concatAudio.cgi?digits=123456">
> <say-as interpret-as="digits">123456</say-as>
> </audio>
>
> where the host logic returns "Not Found" if any of the implied files has
> a problem, and otherwise returns the concatenated audio stream.
>
> Dan Evans
>
> Shane Smith wrote:
> > Hey Folks,
> >
> > One of the best features of audio is the ability to play backup tts
> > should the audio source be unavailable.  Currently though, <audio/>
> > requires either the src or expr attributes to be listed, or a badfetch
> > is tossed.  Well, I've come upon a scenario where I wouldn't
> > necessarily want to list either, and use the backup tts feature in a
> > way that wasn't anticipated, but could be very useful.
> >
> > For example, let's say I'm playing an account number to the caller:
> >   <audio expr="AudioDirectory+'1.wav'">1</audio>
> >   <audio expr="AudioDirectory+'2.wav'">2</audio>
> >   <audio expr="AudioDirectory+'3.wav'">3</audio>
> >   <audio expr="AudioDirectory+'4.wav'">4</audio>
> >   <audio expr="AudioDirectory+'5.wav'">5</audio>
> >   <audio expr="AudioDirectory+'6.wav'">6</audio>
> >
> > I'm not going to prerecord every possible number, so I play audio one
> > digit at a time.  But, if for any reason one or more of them is
> > unavailable, I would rather the whole thing be read back as TTS.   The
> > above code sounds horrible as backup tts, and I do not really have
> > ssml control over it.
> >
> > What I would like to see is this:
> >
> > <audio>
> >   <audio expr="AudioDirectory+'1.wav'"/>
> >   <audio expr="AudioDirectory+'2.wav'"/>
> >   <audio expr="AudioDirectory+'3.wav'"/>
> >   <audio expr="AudioDirectory+'4.wav'"/>
> >   <audio expr="AudioDirectory+'5.wav'"/>
> >   <audio expr="AudioDirectory+'6.wav'"/>
> >   <prosody rate="-10%">
> >   <say-as interpret-as="digits">123456</say-as>
> >   </prosody>
> > </audio>
> >
> > This (or something similar) would allow you to chain a bunch of audio
> > prompts together, but if any of them fail, have a single backup tts
> > prompt substituted for all of them.  In my head, if all of the number
> > wav files were unavailable, or even if just 5.wav were unavailable,
> > none of them would play, and the ssml'ed TTS would play instead.
> >
> > Thoughts?
> >
> > Thanks,
> > Shane Smith
> >
> >
> >
>
>
>

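For reference, the all-or-nothing behaviour proposed in the original
message can be sketched in Python. This only illustrates the intended
semantics and is not something a VoiceXML platform is known to
implement; choose_prompts, the file names, the "tts:" placeholder, and
the is_available callback are all hypothetical.

```python
from typing import Callable, List


def choose_prompts(files: List[str],
                   fallback_tts: str,
                   is_available: Callable[[str], bool]) -> List[str]:
    """Return every audio file if all can be fetched, else only the fallback."""
    if all(is_available(f) for f in files):
        return files
    # One missing file is enough to replace the whole group with TTS.
    return [fallback_tts]


files = [f"{d}.wav" for d in "123456"]
missing = {"5.wav"}  # pretend only 5.wav cannot be fetched
print(choose_prompts(files, "tts:123456", lambda f: f not in missing))
# ['tts:123456']
```

With nothing missing, the same call returns the six wav files in order.
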
Received on Thursday, 27 April 2006 18:10:17 UTC