Re: [css3-speech] cue volume

Hi,
I have attempted to formulate a canonical definition of the intrinsic volume level of sound clips (pre-recorded / pre-generated), so that authors can be reasonably confident that audio cues and TTS voices render at predictable relative volume levels (e.g. comparable loudness when no dB attenuation is specified). Reminder: the decibel adjustments are relative to the keyword values, which are the user's "preferred" loudness settings (i.e. not known at authoring time). Let me know if this is satisfactory, at least for a transition to Last Call Working Draft :)

http://dev.w3.org/csswg/css3-speech/#cue-props
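
To make the "relative to keyword values" point concrete, here is a rough Python sketch (an illustration only, not spec text). It assumes the usual amplitude convention dB = 20 * log10(a1 / a0). The keyword-to-amplitude table is entirely hypothetical: in practice those values are user preferences that the author cannot know, which is exactly why the author can only control the offset.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a decibel offset into a linear amplitude multiplier.

    Assumes dB = 20 * log10(a1 / a0), so 0 dB leaves the amplitude
    unchanged and roughly -6 dB halves it.
    """
    return 10.0 ** (db / 20.0)


# HYPOTHETICAL user-preferred amplitudes (0.0 to 1.0) for each keyword.
# These numbers are made up for illustration; they are set by the user
# and are not known at authoring time.
USER_KEYWORD_AMPLITUDE = {
    "x-soft": 0.2,
    "soft": 0.35,
    "medium": 0.5,
    "loud": 0.7,
    "x-loud": 0.9,
}


def rendered_amplitude(keyword: str, db_offset: float = 0.0) -> float:
    """Amplitude after applying an author-specified dB offset to the
    user's preferred level for the given keyword."""
    return USER_KEYWORD_AMPLITUDE[keyword] * db_to_amplitude_ratio(db_offset)


if __name__ == "__main__":
    # A cue attenuated by 6 dB relative to the user's 'medium' level.
    print(rendered_amplitude("medium", -6.0))
```

Whatever amplitude the user assigns to 'medium', a -6 dB offset scales it by the same factor, so the author controls the ratio but never the absolute level.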

On 7 Jul 2011, at 19:33, fantasai wrote:

> On 07/07/2011 02:31 AM, Daniel Weck wrote:
>> Well, just to put things into perspective, let's say you have 2 pre-recorded audio clips, one for cue-before, one for
>> cue-after. The first one was recorded "normally" (whatever the convention is), whereas the second one is really loud on
>> average (for example, compressed waveform, narrow dynamic range). Unless the audio implementation is "clever" (e.g. automatic
>> normalization/equalization/filtering ... note that I am not an audio engineer), the user can't reduce the large variations of
>> perceived volume level. So authors obviously have a responsibility to prevent ear drum damage and to limit listening
>> inconvenience.
> 
> Right, my point is, how can they do that if they don't know how loud 'medium' is?
> How can the author balance the loudness of the audio cue to the loudness of the
> voice, when it's unknown how loud the voice is?

Received on Monday, 1 August 2011 21:18:48 UTC