Re: speech speed up/down (Was: Building UC1 (Video Chat) from the WebRTC use case)

From: Olivier Thereaux <olivier.thereaux@bbc.co.uk>
Date: Tue, 17 Jan 2012 10:27:06 +0000
Message-ID: <4F154CFA.5020401@bbc.co.uk>
To: tmichel@w3.org
CC: public-audio@w3.org

Hello Thierry,

Really good stuff, and all real-life use cases.

If we want to keep only a few (to keep the number of scenarios
reasonably low), I suspect we could merge the first two (start with 9b,
then have the user switch to a programme in a language other than his
mother tongue, which he would prefer to slow down) and drop the last
one (I think all the requirements of 9d are covered by a, b and c).

-- 
Olivier



On 17/01/2012 09:22, Thierry MICHEL wrote:
> Olivier,
>
>
> Here is a first draft for UC9 about Audio Deceleration/Acceleration.
> I have drafted 4 relevant scenarios; we may pick a few of those.
>
> Thoughts?
>
>
> Thierry
>
>
>
> UC-9-a : Audio Deceleration.
>
> A user is listening to the web-cast of an audio interview available from
> the web-page of a radio broadcasting streaming service.
> The interview is broadcast in Spanish, which is not the native language
> of the user.
> The user would therefore like to listen to the audio web-cast at a
> slower speed (time stretching), to better understand the dialogue in a
> language in which he is not fluent.
> The user would like to listen to the audio broadcast without any pitch
> distortion of the voices.
>
> The web-page presents a graphic visualization of the speed of the audio
> conversation.
> The web-page also provides an interface, supplied by the web-page
> developer, allowing the user to change the speed. It may also allow him
> to tweak other settings, like the tone and timbre, to his taste.
>
> These would be valuable accessibility features for listeners who want
> more time to better understand web-casts as well as audio books.
>
> ------------------------------
>
> UC-9-b : Audio Acceleration.
>
> A user is subscribed to a podcast, and has downloaded an audio book on
> his device.
> The audio files are stored locally on the user's computer or other
> device, ready for offline use, giving simple and convenient access to
> episodic content through a web browser.
>
> The user is sitting in an airplane for a 2-hour flight. The user opens
> his audio book in his HTML browser and sees that the episode he has
> selected lasts 3 hours.
> The user would like to be able to accelerate the speed of the audio
> book without pitch distortion (i.e., voices not sounding like
> “chipmunks” when accelerated). He would like to set the listening time
> to 2 hours in order to finish the audio book before landing.
>
> The web-page presents a graphic visualization of the speed and, on a
> timeline, of the total duration of the audio at the corresponding speed.
> The web-page also provides an audio speed changer interface, supplied
> by the web-page developer, allowing the user to change the tempo of the
> speech and speed up audio files without changing the pitch. This lets
> the user drastically speed up speech without a "chipmunk" effect.
>
> Another interface allows the user to set the duration of the audio
> relative to its original duration at normal speed, therefore changing
> its speed with pitch lock.
> The user may also tweak other settings like the tone and timbre to his
> taste.
>
> These would be valuable features for listeners who want to save time
> by accelerating audio books as well as podcasts.
>
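
The arithmetic behind UC-9-b (a 3-hour episode played in 2 hours) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names rate_for_target_duration and duration_at_rate are hypothetical and not part of any proposed API, and the sketch assumes the rate is a plain multiplier handed to a pitch-preserving time stretcher.

```python
def rate_for_target_duration(original_s: float, target_s: float) -> float:
    """Playback-rate multiplier that makes `original_s` seconds of audio
    last `target_s` seconds (hypothetical helper, for illustration only)."""
    if original_s <= 0 or target_s <= 0:
        raise ValueError("durations must be positive")
    return original_s / target_s


def duration_at_rate(original_s: float, rate: float) -> float:
    """Listening time, in seconds, of `original_s` seconds of audio at `rate`.
    This is the value a timeline could display next to the speed control."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return original_s / rate


HOUR = 3600.0
rate = rate_for_target_duration(3 * HOUR, 2 * HOUR)
print(rate)                                     # 1.5
print(duration_at_rate(3 * HOUR, rate) / HOUR)  # 2.0
```

The same ratio works in the other direction: a target longer than the original gives a rate below 1, which is the deceleration case of UC-9-a.
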
> ------------------------------
>
> UC-9-c : Audio Deceleration/Acceleration.
>
> A disc jockey (DJ) selects and plays recorded music for a discotheque
> audience. The DJ uses a radio broadcasting streaming service to play the
> music live.
> The DJ is selecting songs from a playlist available on the web-page of
> the streaming service and wants to beatmix and cross fade songs for
> smooth dance transitions.
> He brings the beat of the next song into phase with the current one
> playing and fades across.
> For example, if the song the audience is hearing is at 125 beats per
> minute (bpm), and the next song he wants to play is at 128 bpm, the DJ
> will slow the second song down to 125 bpm using pitch control, and cue
> it up to the beat. When he is ready to bring the second song into play,
> he starts the recording so that the beats stay aligned, and listens to
> it in his headphones to make sure both songs are in sync. Then he uses
> a cross fader to let the new song blend into the old one, and eventually
> goes completely across so that only the new song is playing. This gives
> the illusion that the song never ended.
>
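
The beat-matching numbers in this scenario can be worked through in a short sketch. The function names below are hypothetical, not an API proposal. The semitone figure assumes a turntable-style pitch control where tempo and pitch change together, and the equal-power curve is just one common choice for a cross fader; the draft does not specify one.

```python
import math


def tempo_ratio(source_bpm: float, target_bpm: float) -> float:
    """Rate multiplier that brings a song from source_bpm to target_bpm."""
    if source_bpm <= 0 or target_bpm <= 0:
        raise ValueError("bpm values must be positive")
    return target_bpm / source_bpm


def pitch_shift_semitones(rate: float) -> float:
    """Pitch change, in semitones, when the rate is applied WITHOUT pitch
    lock (assumption: plain resampling, as on a turntable)."""
    return 12.0 * math.log2(rate)


def equal_power_gains(position: float) -> tuple[float, float]:
    """(old_song_gain, new_song_gain) for a fader position in [0, 1].
    Equal-power curve, chosen here for illustration only."""
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0 and 1")
    angle = position * math.pi / 2.0
    return math.cos(angle), math.sin(angle)


rate = tempo_ratio(128, 125)
print(rate)                                   # 0.9765625
print(round(pitch_shift_semitones(rate), 2))  # -0.41
print(equal_power_gains(0.0))                 # (1.0, 0.0)
```

A slowdown of about 2% costs less than half a semitone without pitch lock, which is why DJs usually tolerate it; the larger changes in UC-9-a and UC-9-b are where pitch preservation matters.
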
>
> The web-page presents a graphic visualization of the songs selected
> and played. It displays their beats per minute (bpm).
> The web-page also provides an audio interface with pitch control to
> change the tempo of a song, which is very useful for beat matching.
>
> For further audio effects, the interface may also integrate a
> pitch-wheel to change the pitch of a sound without changing its length.
>
> These would be valuable features for DJs who want to beat-sync music
> online, as they are used to doing with decks and turntables.
>
> ------------------------------
>
> UC-9-d : Audio Deceleration/Acceleration.
>
> Another use case would be a user editing a video with a film editor
> tool embedded in a web-page. The user would like to synchronize a video
> with an audio track of a different duration. The user would slow down
> or accelerate the audio track to match the duration of the video
> sequence.
>
> The web-page presents a graphic visualization of the audio track to be
> mixed with the video sequence. Video clips are arranged on a timeline,
> in parallel with the audio tracks. It shows the duration of each audio
> and video component.
>
> The web-page also provides an audio interface with speed control to
> change the tempo of the track, which is very useful for synchronization.
>
> On 16/01/2012 18:08, Olivier Thereaux wrote:
>> Hello Thierry,
>>
>> On 16/01/2012 14:37, Thierry MICHEL wrote:
>>> On 16/01/2012 13:18, Olivier Thereaux wrote:
>>>> but there seems to be a
>>>> case for such a feature in other contexts (playback of spoken material,
>>>> mostly, be it recorded or synthesised).
>>>>
>>>> Would you be interested in drafting a relevant use case scenario?
>>>
>>> I can try it, but could I have more material on that issue?
>>> I could not find the proper thread on this particular topic, nor
>>> anything on the wiki:
>>> http://www.w3.org/2011/audio/wiki/Use-Cases
>>
>> No, there is nothing about it yet.
>> How about getting started on a UC9 - language learning software on the
>> wiki?
>>

-- 
Olivier Thereaux
BBC Internet Research & Future Services



Received on Tuesday, 17 January 2012 10:27:31 GMT