Audio Contrast Testability issues. from David MacDonald on 2004-11-18 (w3c-wai-gl@w3.org from October to December 2004)

From: David MacDonald <befree@magma.ca>
Date: Thu, 18 Nov 2004 13:55:08 -0500
To: <w3c-wai-gl@w3.org>
Message-Id: <200411181855.iAIIt9wN017111@mail3.magma.ca>
On last week's call I provided my findings and suggestions on Guideline 1.4,
particularly as it applied to audio contrast. The question of testability
came up. I mentioned that it would be difficult to test it. Gregg suggested
a technique of sampling the track as follows:

 

1)       Measure the volume of the track (in dB) at a point where no one
is speaking (background only)

2)       Measure the volume when someone is speaking over the background

3)       Compare the two measurements to ensure there is at least a 20 dB
difference between the two samples
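
A minimal sketch of the three steps above, assuming each sample is already
available as a list of floating-point values between -1.0 and 1.0, and that
"volume" is taken to mean the RMS level of the sample. The function names
and the RMS choice are assumptions made for illustration; they are not part
of the suggested technique.

```python
import math


def rms_db(samples):
    """RMS level of a block of samples, in dB relative to full scale (1.0)."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)


def audio_contrast_db(background_only, speech_over_background):
    """Steps 1 and 2: measure both samples; return their difference in dB."""
    return rms_db(speech_over_background) - rms_db(background_only)


def passes_contrast(background_only, speech_over_background, required_db=20.0):
    """Step 3: check the difference against the required margin."""
    return audio_contrast_db(background_only, speech_over_background) >= required_db
```

Note that the second sample contains speech plus background, so this
compares (speech + background) against background alone, exactly as the
steps are written.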

 

I have no problem with us presenting this as a technique for measuring audio
contrast. There are however some serious considerations. I think we would
have to set the following conditions:

 

1)       The audio background would need to be at a similar volume in both
samples. 

2)       There cannot be any compression/expansion applied to the track.
(Most broadcast media is currently compressed, which would skew the results
of the measurements.)

 

The practical effect of compression is that the background is actually
louder when there is no talking than when there is talking. When there is
talking the entire recording (background and talking) is squashed down. So
under these conditions the background would rarely be the same volume for
the sample taken when there is no speaking and the sample taken when there
is speaking. However, the recording may be perfectly accessible because when
the person is talking the background is squashed and when the person is
silent the background is expanded. Perhaps I could work out an algorithm
which compensates for compression and would not skew results.
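
A rough sketch of how compression skews the measurement, working purely with
levels in dB. It assumes a simple static compressor with a known threshold
and ratio (the -15 dB and 2:1 defaults are illustrative), and it assumes the
overall level while someone is talking is set by the speech alone. None of
this would be known for a finished recording, so this illustrates the
problem rather than a working compensation algorithm.

```python
def gain_reduction_db(level_db, threshold_db=-15.0, ratio=2.0):
    """Attenuation (in dB) a static compressor applies at this input level."""
    over = level_db - threshold_db
    if over <= 0:
        return 0.0  # below the threshold: signal passes unchanged
    return over * (1.0 - 1.0 / ratio)


def measured_and_actual_contrast(background_db, speech_db,
                                 threshold_db=-15.0, ratio=2.0):
    """Return (contrast from the two-sample test, contrast while speaking)."""
    # Sample 1: no one speaking, compressor sees only the background.
    bg_alone = background_db - gain_reduction_db(
        background_db, threshold_db, ratio)
    # Sample 2: speech drives the compressor; the same attenuation is
    # applied to the speech and to the background underneath it.
    reduction = gain_reduction_db(speech_db, threshold_db, ratio)
    speech_out = speech_db - reduction
    bg_under_speech = background_db - reduction
    measured = speech_out - bg_alone
    actual = speech_out - bg_under_speech
    return measured, actual
```

With a background at -30 dB and speech at -5 dB, the two-sample test reports
20 dB while the contrast during speech is 25 dB. The 5 dB gap is the gain
reduction applied while someone is talking. Any constant boost applied after
compression cancels out of both differences.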

 

Additional info: 

Here's a simplified explanation of the way compression works. (I'll leave
out peak limiting for now, which is a special case of compression.) Every
radio station, TV station (and musician) wants their signal to be as loud
as possible over the airwaves. They want to use up all the headroom. The way
they make that happen is to apply compression to the final mastering of
their recordings and then boost the overall compressed signal. Compression
sets a threshold, say -15 dB. Every part of the recording above the
threshold is squashed at a predetermined ratio, say 2:1. In this
example, every part of the signal over -15 dB rises only half as far above
the threshold as it normally would. This squashes the loud parts of the
recording. Then the overall signal is boosted so the entire track fills up
the headroom (meaning the signal is now louder overall).
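
The threshold, ratio and boost described above can be sketched as a simple
level-in, level-out function. This is a simplified static model working in
dB; real compressors also have attack and release times, which are ignored
here. The function name and the makeup_db parameter are made up for
illustration.

```python
def compressed_level_db(level_db, threshold_db=-15.0, ratio=2.0,
                        makeup_db=0.0):
    """Output level (dB) of a static compressor followed by a fixed boost."""
    if level_db > threshold_db:
        # Only the part above the threshold is squashed by the ratio.
        level_db = threshold_db + (level_db - threshold_db) / ratio
    # The overall boost is applied to everything, loud or quiet.
    return level_db + makeup_db
```

For example, with the -15 dB threshold and 2:1 ratio, a peak at 0 dB comes
out at -7.5 dB, so a 7.5 dB boost brings the peaks back up to 0 dB. A quiet
passage at -30 dB is untouched by the compression but gets the same boost
and ends up at -22.5 dB. The gap between loud and quiet has shrunk from
30 dB to 22.5 dB.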

 

 

Excerpt from Spinal Tap the movie (thx Wendy)

 

Nigel: This is a top to a-you know, what we use on stage, but it's very,
very special because if you can see...
Marty: Yeah...
Nigel: The numbers all go to eleven. Look...right across the board.
Marty: Ahh...oh, I see....
Nigel: Eleven...eleven...eleven....
Marty: ..and most of these amps go up to ten....
Nigel: Exactly.
Marty: Does that mean it's...louder? Is it any louder?
Nigel: Well, it's one louder, isn't it? It's not ten. You see, most, most
blokes, you know, will be playing at ten. You're on ten here...all the way
up...all the way up....
Marty: Yeah....
Nigel: ...all the way up. You're on ten on your guitar.. where can you go
from there? Where?
Marty: I don't know....
Nigel: Nowhere. Exactly. What we do is if we need that extra push over the
cliff, you know what we do?
Marty: Put it up to eleven.
Nigel: Eleven. Exactly. One louder.
Marty: Why don't you just make ten louder and make ten be the top number and
make that a little louder?
[pause]
Nigel: These go to eleven.
Received on Thursday, 18 November 2004 18:55:18 UTC