
Re: [css3-speech] cue volume

From: Daniel Weck <daniel.weck@gmail.com>
Date: Mon, 1 Aug 2011 23:18:08 +0200
Message-Id: <A3A58CC0-6786-49F4-B05F-63BC8F5E1395@gmail.com>
To: www style <www-style@w3.org>, fantasai <fantasai.lists@inkedblade.net>

I have attempted to formulate a canonical definition of the intrinsic volume level of sound clips (pre-recorded / pre-generated), so that authors can produce content with a good degree of confidence that audio cues and TTS voices render at predictable, comparable loudness levels (e.g. when no dB attenuation is specified). Reminder: the decibel adjustments are relative to the keyword values, which represent the user's "preferred" loudness settings (i.e. they are not known at authoring time). Let me know whether this is satisfactory, at least for the transition to Last Call Working Draft :)
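
For illustration, a minimal sketch of how the relative dB adjustments combine with the keyword-based preferred levels (property names per the CSS Speech module; the clip URL is hypothetical):

```css
/* The user's preferred loudness is selected via a keyword; how loud
   'medium' actually is gets determined by the user / user agent,
   not by the author. */
body { voice-volume: medium; }

/* The audio cue plays 3 dB quieter than the current keyword level,
   so it stays balanced against the TTS voice regardless of the
   user's actual 'medium' setting. (bell.wav is a hypothetical clip.) */
h1 { cue-before: url(bell.wav) -3dB; }
```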


On 7 Jul 2011, at 19:33, fantasai wrote:

> On 07/07/2011 02:31 AM, Daniel Weck wrote:
>> Well, just to put things into perspective, let's say you have 2 pre-recorded audio clips, one for cue-before, one for
>> cue-after. The first one was recorded "normally" (whatever the convention is), whereas the second one is really loud on
>> average (for example, compressed waveform, narrow dynamic range). Unless the audio implementation is "clever" (e.g. automatic
>> normalization/equalization/filtering ... note that I am not an audio engineer), the user can't reduce the large variations of
>> perceived volume level. So authors obviously have a responsibility to prevent ear drum damage and to limit listening
>> inconvenience.
> Right, my point is, how can they do that if they don't know how loud 'medium' is?
> How can the author balance the loudness of the audio cue to the loudness of the
> voice, when it's unknown how loud the voice is?
Received on Monday, 1 August 2011 21:18:48 UTC
