Re: speech speed up/down (Was: Building UC1 (Video Chat) from the WebRTC use case)

From: Thierry MICHEL <tmichel@w3.org>
Date: Tue, 17 Jan 2012 10:22:52 +0100
Message-ID: <4F153DEC.3030200@w3.org>
To: Olivier Thereaux <olivier.thereaux@bbc.co.uk>
CC: public-audio@w3.org
Olivier,


Here is a first draft for UC9 on audio deceleration/acceleration.
I have drafted four relevant scenarios; we may pick a few of them.

Thoughts?


Thierry



UC-9-a : Audio Deceleration.

A user is listening to the webcast of an audio interview available from
the web page of a radio streaming service.
The interview is broadcast in Spanish, which is not the user's native
language.
The user would therefore like to listen to the webcast at a slower
speed (time stretching), to better understand the dialogue in a
language in which he is not fluent.
The user would like to listen to the broadcast without any pitch
distortion of the voices.
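
To illustrate what "time stretching without pitch distortion" means in
practice, here is a minimal sketch of a naive overlap-add (OLA) time
stretch. It is not a proposal for the API: the function name, the frame
size, and the choice of OLA itself are assumptions made for this example.
Plain OLA produces audible artifacts on real speech, and practical
implementations usually rely on refinements such as WSOLA or a phase
vocoder.

```python
import math

def ola_time_stretch(samples, rate, frame=1024):
    """Naive overlap-add time stretch (illustrative sketch only).

    rate < 1 slows playback down, rate > 1 speeds it up. Frames are read
    from the input every `analysis_hop` samples and written to the output
    every `synth_hop` samples, so the duration changes while each frame
    keeps its original waveform (and therefore roughly its pitch).
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if frame < 2 or frame % 2:
        raise ValueError("frame must be an even number >= 2")
    synth_hop = frame // 2
    analysis_hop = max(1, round(synth_hop * rate))
    # Periodic Hann window: copies overlapped at 50% sum to exactly 1.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame) for n in range(frame)]
    n_frames = max(0, (len(samples) - frame) // analysis_hop + 1)
    if n_frames == 0:
        return []
    out = [0.0] * ((n_frames - 1) * synth_hop + frame)
    for i in range(n_frames):
        src = i * analysis_hop   # where the frame is read in the input
        dst = i * synth_hop      # where the frame is written in the output
        for n in range(frame):
            out[dst + n] += samples[src + n] * window[n]
    return out
```

With rate = 0.5 the output is roughly twice as long as the input, which
corresponds to the half-speed listening described in this scenario.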

The web page presents a graphic visualization of the speed of the audio
conversation.
The web page also provides an interface, supplied by the web page
developer, that allows the user to change the speed and may allow him to
tweak other settings, such as tone and timbre, to his taste.

These would be valuable accessibility features for listeners who want
more time to understand webcasts as well as audio books.

------------------------------

UC-9-b : Audio Acceleration.

A user is subscribed to a podcast and has downloaded an audio book to
his device.
The audio files are stored locally on the user's computer or other
device, ready for offline use, giving simple and convenient access to
episodic content through a web browser.

The user is sitting in an airplane for a 2-hour flight. The user opens
his audio book in his HTML browser and sees that the episode he has
selected lasts 3 hours.
The user would like to speed up the audio book without pitch distortion
(i.e., without voices sounding like “chipmunks” when accelerated). He
would like to set the listening time to 2 hours in order to finish the
audio book before landing.

The web page presents a graphic visualization of the speed and of the
total duration of the audio, on a timeline, at the corresponding speed.
The web page also provides an audio speed changer interface, supplied
by the web page developer, that allows the user to change the tempo of
the speech and speed up audio files without changing the pitch. This
lets the user drastically speed up speech without a "chipmunk" effect.

Another interface allows the user to set the duration of the audio
relative to its initial duration at normal speed, thereby changing its
speed with pitch lock.
The user may also tweak other settings, such as tone and timbre, to his
taste.
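
The "set the duration" interface boils down to simple arithmetic. A
minimal sketch follows; the function names are hypothetical and only
serve this example.

```python
def playback_rate_for_duration(original_seconds, target_seconds):
    """Playback rate needed so that the audio lasts target_seconds."""
    if original_seconds <= 0 or target_seconds <= 0:
        raise ValueError("durations must be positive")
    return original_seconds / target_seconds

def duration_at_rate(original_seconds, rate):
    """Duration to show on the timeline for a given playback rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return original_seconds / rate

# The scenario above: a 3-hour episode to be heard during a 2-hour flight.
rate = playback_rate_for_duration(3 * 3600, 2 * 3600)   # 1.5x speed
```

The same ratio would apply whenever audio has to fit a fixed duration.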

These would be valuable features for listeners who want to save time by
speeding up audio books as well as podcasts.

-----------

UC-9-c : Audio Deceleration/Acceleration.

A disc jockey (DJ) selects and plays recorded music for a discotheque
audience. The DJ uses a radio streaming service to play the music live.
The DJ selects songs from a playlist available on the web page of the
streaming service and wants to beatmix and crossfade songs for smooth
dance transitions.
He brings the beat of the next song into phase with the one currently
playing and fades across.
For example, if the song the audience is hearing is at 125 beats per
minute (bpm) and the next song he wants to play is at 128 bpm, the DJ
slows the second song down to 125 bpm using pitch control and cues it up
to the beat. When he is ready to bring the second song into play, he
throws the recording so that the beats stay aligned and listens to it in
his headphones. The DJ makes sure both songs are in sync. Then he uses a
crossfader to let the new song blend into the old one, and eventually
moves it all the way across so that only the new song is playing. This
gives the illusion that the song never ended.
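
The numbers in this example can be worked through with a small sketch.
The function names are hypothetical, and the equal-power crossfade curve
is an assumption (the scenario does not prescribe any particular curve).
The pitch offset applies when the tempo is changed by plain resampling,
as with the pitch control of a turntable.

```python
import math

def beatmatch_rate(source_bpm, target_bpm):
    """Playback rate that brings source_bpm to target_bpm."""
    if source_bpm <= 0 or target_bpm <= 0:
        raise ValueError("bpm values must be positive")
    return target_bpm / source_bpm

def pitch_offset_semitones(rate):
    """Pitch shift caused by changing speed without pitch lock."""
    return 12 * math.log2(rate)

def equal_power_crossfade(position):
    """Gains (old_song, new_song) for a fader position in [0, 1].

    Assumed equal-power curve: the squared gains always sum to 1.
    """
    p = min(1.0, max(0.0, position))
    return math.cos(p * math.pi / 2), math.sin(p * math.pi / 2)

# The scenario above: slow a 128 bpm song down to 125 bpm.
rate = beatmatch_rate(128, 125)            # 0.9765625
offset = pitch_offset_semitones(rate)      # a bit less than half a semitone down
```

A drop of under half a semitone is small, which is why a plain speed
change is often acceptable for beat matching.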


The web page presents a graphic visualization of the songs selected and
played. It displays their beats per minute (bpm).
The web page also provides an audio interface with pitch control to
change the tempo of a song, which is very useful for beat matching.

For further audio effects, the interface may also integrate a pitch
wheel that changes the pitch of a sound without changing its length.

These would be valuable features for DJs who want to beat-sync music
online, as they are used to doing with decks and turntables.

-----------



UC-9-d : Audio Deceleration/Acceleration.

Another use case would be a user editing a video with a film editor
tool embedded in a web page. The user would like to synchronize a video
with an audio track of a different duration. The user would slow down or
speed up the audio track to match the duration of the video sequence.

The web page presents a graphic visualization of the audio track to be
mixed with the video sequence. Video clips are arranged on a timeline,
in parallel with the audio tracks. It shows the duration of each audio
and video component.

The web page also provides an audio interface with speed control to
change the tempo of the track, which is very useful for synchronization.


On 16/01/2012 18:08, Olivier Thereaux wrote:
> Hello Thierry,
>
> On 16/01/2012 14:37, Thierry MICHEL wrote:
>> On 16/01/2012 13:18, Olivier Thereaux wrote:
>>> but there seem to be a
>>> case for such a feature in other contexts (playback of spoken material,
>>> mostly, be it recorded or synthesised).
>>>
>>> Would you be interested in drafting a relevant use case scenario?
>>
>> I can try it, but could I have more material on that issue?
>> I could not find the proper thread on this particular topic nor on the
>> wiki
>> http://www.w3.org/2011/audio/wiki/Use-Cases
>
> No, there is nothing about it yet.
> How about getting started on a UC9 - language learning software on the
> wiki?
>
Received on Tuesday, 17 January 2012 09:23:19 GMT
