- From: Geoff Freed <geoff_freed@wgbh.org>
- Date: Fri, 10 Feb 2012 17:02:50 +0000
- To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- CC: "public-texttracks@w3.org" <public-texttracks@w3.org>
Hi, Silvia:

In our tests we didn't have problems with latency with the pre-recorded descriptions, but please let me know if you think you're hearing descriptions that are late. Since all this is being done server-side, there's always the possibility that slow connections or congestion will cause a delay.

Note that in most of the demos, the descriptions shouldn't step on the program audio. There are a few exceptions: in the Ming clip we have to dodge the chef's non-stop monolog (we specifically included this clip to show how the use of ducking can aid in situations where you must sacrifice some program audio); in Sintel, we play descriptions over the fight scene.

I agree that client-side processing is definitely worth some research and experimentation.

Geoff/NCAM

On 2/9/12 4:35 PM, "Silvia Pfeiffer" <silviapfeiffer1@gmail.com> wrote:

>Hi Geoff,
>
>That is indeed very interesting. I'd be curious how you're going with
>the pre-recorded pieces and the download speed - is it fast enough? My
>suspicion is that doing the synthesis on the client will lead to a
>much more responsive system, but it'd be good to get that confirmed
>with actual experiments.
>
>Regards,
>Silvia.
>
>On Thu, Feb 9, 2012 at 11:46 PM, Geoff Freed <geoff_freed@wgbh.org> wrote:
>>
>> Hello, everybody:
>>
>> IBM-Research Tokyo recently partnered with the Carl and Ruth Shapiro Family
>> National Center for Accessible Media (NCAM) at WGBH to research ways to
>> deliver online audio descriptions using text-to-speech (TTS) methods. IBM
>> and NCAM explored two approaches which exploit new HTML5 media elements,
>> Javascript and TTML:
>>
>> -- Writing and time-stamping a description script, then delivering the
>> descriptions as hidden text in real time in such a way that a user's screen
>> reader will read them aloud. The descriptions remain otherwise invisible and
>> inaudible to non-screen-reader users.
>> -- Writing and time-stamping descriptions, then recording them using TTS
>> technology. At the time of playback, each description is individually
>> retrieved and played aloud at intervals corresponding to the time-stamped
>> script.
>>
>> Visit http://ncamftp.wgbh.org/ibm/dvs/ to learn more about the project, view
>> the demonstration models and download the code to see how it works.
>>
>> Thanks.
>> Geoff Freed
>> WGBH/NCAM
>> (with apologies for cross-posts)
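
As a rough illustration of the second approach in the quoted announcement (pre-recorded TTS clips retrieved and played at the time-stamped intervals), the scheduling step might look something like the sketch below. This is not the IBM/NCAM code: `DescriptionCue`, `due_cues`, `SCRIPT`, the clip names and the timings are all hypothetical, and Python is used only to keep the logic self-contained. In a browser, an equivalent check would presumably run each time the media element reports a new playback position.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class DescriptionCue:
    start: float  # seconds into the program, taken from the time-stamped script
    clip: str     # hypothetical file name of the pre-recorded TTS clip


def due_cues(cues, previous_time, current_time):
    """Return the cues whose start lies in (previous_time, current_time].

    `cues` must be sorted by start time. Passing the previously reported
    playback position means each cue fires once, even when position
    updates arrive at irregular intervals.
    """
    if current_time < previous_time:
        # The user seeked backwards; nothing newly became due.
        return []
    starts = [cue.start for cue in cues]
    first = bisect_right(starts, previous_time)
    last = bisect_right(starts, current_time)
    return cues[first:last]


# Hypothetical script with three descriptions (assumed values).
SCRIPT = [
    DescriptionCue(2.0, "desc-01.mp3"),
    DescriptionCue(7.5, "desc-02.mp3"),
    DescriptionCue(12.0, "desc-03.mp3"),
]

# Playback moved from 1.8 s to 2.05 s, so the first clip should be fetched.
for cue in due_cues(SCRIPT, 1.8, 2.05):
    print(f"retrieve and play {cue.clip}")
```

Two caveats with this sketch: a cue at exactly 0 seconds only fires if the initial `previous_time` is negative, and a large forward seek returns every skipped cue, so a real player would probably keep only the most recent one. Because each clip is fetched when it becomes due, retrieval time adds directly to any delay, which is the latency concern raised in this thread.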
Received on Friday, 10 February 2012 17:03:29 UTC