- From: Daniel Weck <daniel.weck@gmail.com>
- Date: Mon, 1 Aug 2011 23:18:08 +0200
- To: www style <www-style@w3.org>, fantasai <fantasai.lists@inkedblade.net>
Hi,

I have attempted to formulate a canonical expression of the concept of "intrinsic volume level" for sound clips (pre-recorded / pre-generated), so that authors can produce content with a good degree of confidence that TTS voices will render with predictable volume levels (e.g. comparable loudness when no dB attenuation is specified).

Reminder: the decibel adjustments are relative to the keyword values, which are the user's "preferred" loudness settings (i.e. not known at authoring time).

Let me know if this is satisfactory, at least for the transition to Last Call Working Draft :)

http://dev.w3.org/csswg/css3-speech/#cue-props

On 7 Jul 2011, at 19:33, fantasai wrote:
> On 07/07/2011 02:31 AM, Daniel Weck wrote:
>> Well, just to put things into perspective, let's say you have two pre-recorded audio clips, one for cue-before and one for cue-after. The first one was recorded "normally" (whatever the convention is), whereas the second one is really loud on average (for example, a compressed waveform with a narrow dynamic range). Unless the audio implementation is "clever" (e.g. automatic normalization/equalization/filtering ... note that I am not an audio engineer), the user can't reduce the large variations in perceived volume level. So authors obviously have a responsibility to prevent eardrum damage and to limit listening inconvenience.
>
> Right, my point is, how can they do that if they don't know how loud 'medium' is?
> How can the author balance the loudness of the audio cue to the loudness of the
> voice, when it's unknown how loud the voice is?
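To make the discussion concrete, here is a minimal sketch of the scenario described above, using the cue properties from the draft; the file names and the -6dB figure are hypothetical, chosen only to illustrate an author compensating for a loudly-mastered clip relative to the user's preferred keyword level:

```css
/* Hypothetical example: both cues render relative to the user's
   preferred loudness for the 'medium' keyword. The second clip was
   mastered loud (compressed, narrow dynamic range), so the author
   attenuates it to balance perceived volume against the first. */
p.note {
  voice-volume: medium;                          /* user-calibrated keyword level */
  cue-before: url("ding-normal.wav");            /* no adjustment: plays at 0dB   */
  cue-after: url("ding-compressed.wav") -6.0dB;  /* hypothetical -6dB attenuation */
}
```

The point at issue in the thread is that the author can only choose a sensible attenuation like the -6dB above if the clip's intrinsic volume level and the loudness of the keyword values are defined predictably.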
Received on Monday, 1 August 2011 21:18:48 UTC