
RE: Audio Contrast Testability issues.

From: David MacDonald <befree@magma.ca>
Date: Thu, 18 Nov 2004 16:28:33 -0500
Message-Id: <200411182128.iAILSYY7008407@mail3.magma.ca>
To: "'Gregg Vanderheiden'" <gv@trace.wisc.edu>, <w3c-wai-gl@w3.org>
Hi Gregg


The space between words would also be compressed, because the decay time on
a compressor is usually about 50-200 ms. But this is not entirely bad: if
the sample between words is compressed at the same rate as the words
themselves, then you have an "apples to apples" comparison, which would make
a valid test. So you would want the sample to occur immediately after a
word, say 5-10 ms, before the compression decay tail falls too far.


Compression kicks in pretty quickly in the industry, usually within 5-20 ms,
and has a hold of about 20 ms and a fairly slow decay of about 50-200 ms.
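
To make the timing argument concrete, here is a rough sketch of how much
gain reduction would still be in effect at a given time after a word ends.
It assumes the compressor holds its gain reduction for the hold time and
then releases exponentially, with the release time as the time constant.
That release shape, the function name, and the parameter names are my own
assumptions for illustration; real compressors differ.

```python
import math

def remaining_gain_reduction_db(initial_db, t_ms, hold_ms=20.0, release_ms=100.0):
    """Gain reduction (dB) still applied t_ms after the speech stops.

    Assumption: full reduction during the hold, then exponential release
    with time constant release_ms. This is a simplified model only.
    """
    if t_ms < 0:
        raise ValueError("t_ms must be non-negative")
    if t_ms <= hold_ms:
        # Still inside the hold window: nothing has been released yet.
        return initial_db
    return initial_db * math.exp(-(t_ms - hold_ms) / release_ms)

# Sampling 10 ms after a word falls inside a 20 ms hold, so the background
# would see the same gain reduction as the speech did.
at_10_ms = remaining_gain_reduction_db(6.0, 10.0)
# Sampling 120 ms later, much of the reduction has already decayed.
at_120_ms = remaining_gain_reduction_db(6.0, 120.0)
```

Under these assumptions a sample taken 5-10 ms after a word is still fully
compressed, which is what makes the comparison fair.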








From: Gregg Vanderheiden [mailto:gv@trace.wisc.edu] 
Sent: Thursday, November 18, 2004 4:01 PM
To: 'David MacDonald'; w3c-wai-gl@w3.org
Subject: RE: Audio Contrast Testability issues.


Thanks David,

Very interesting.


How quickly does this occur?  Usually it takes time before this happens.  If
you sample between words, would you see this?




 -- ------------------------------ 
Gregg C Vanderheiden Ph.D. 
Professor - Ind. Engr. & BioMed Engr.
Director - Trace R & D Center 
University of Wisconsin-Madison 


From: w3c-wai-gl-request@w3.org [mailto:w3c-wai-gl-request@w3.org] On Behalf
Of David MacDonald
Sent: Thursday, November 18, 2004 12:55 PM
To: w3c-wai-gl@w3.org
Subject: Audio Contrast Testability issues.


On last week's call I provided my findings and suggestions on Guideline 1.4,
particularly as it applied to audio contrast. The question of testability
came up. I mentioned that it would be difficult to test it. Gregg suggested
a technique of sampling the track as follows:


1)       Measure the volume of the track (in dB) at a point where no one
is speaking (background only)

2)       Measure the volume when someone is speaking over the background

3)       Compare the two measurements to ensure there is at least a 20 dB
difference between the two samples
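
The three steps above can be sketched roughly as follows. This is only an
illustration: it assumes each sample is a list of floating-point values
between -1.0 and 1.0, and it uses RMS level as the "volume", which the
steps themselves do not specify. The function names are hypothetical.

```python
import math

def rms_db(samples):
    """RMS level of the samples in dB relative to full scale (1.0)."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)

def meets_contrast(background, speech_over_background, min_diff_db=20.0):
    """True if speech plus background is at least min_diff_db above background."""
    difference = rms_db(speech_over_background) - rms_db(background)
    return difference >= min_diff_db

# Step 1: background only. Step 2: speech over background. Step 3: compare.
background = [0.01, -0.01] * 100
speech = [0.5, -0.5] * 100
passes = meets_contrast(background, speech)
```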


I have no problem with us presenting this as a technique for measuring audio
contrast. There are however some serious considerations. I think we would
have to set the following conditions:


1)       The audio background would need to be at a similar volume in both

2)       There cannot be any compression/expansion applied to the track.
(Currently, media uses compression which would skew the results of the


The practical effect of compression is that the background is actually
louder when there is no talking than when there is talking. When there is
talking the entire recording (background and talking) is squashed down. So
under these conditions the background would rarely be the same volume for
the sample taken when there is no speaking and the sample taken when there
is speaking. However, the recording may be perfectly accessible because when
the person is talking the background is squashed and when the person is
silent the background is expanded. Perhaps I could work out an algorithm
which compensates for compression and would not skew results.


Additional info: 

Here's a simplified explanation of the way compression works. (I'll leave
out peak limiting for now which is a special case of compression.) Every
radio station, TV station, (and musician) wants their signal to be as loud
as possible over the airwaves. They want to use up all the headroom. The way
they make that happen is to apply compression to the final mastering of
their recordings and then boost the overall compressed signal. Compression
marks a threshold, say -15 dB. Every part of the recording above the
threshold is squashed at a predetermined ratio, say 2:1. In this example
every part of the signal over -15 dB rises only half as far above the
threshold (in dB) as it otherwise would. This squashes the loud parts of
the recording. Then the overall signal is boosted so the entire track fills
up the headroom (meaning the signal is now louder).
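
A minimal sketch of the threshold-and-ratio behaviour described above,
working directly on levels in dB. It ignores attack, hold and decay, and the
function name and the makeup_db parameter (the "boost" step) are my own
labels for illustration.

```python
def compress_db(level_db, threshold_db=-15.0, ratio=2.0, makeup_db=0.0):
    """Apply a static compression curve to a level given in dB.

    Levels above the threshold keep only 1/ratio of their excess over the
    threshold; levels at or below it pass through unchanged. The makeup
    gain is then added to everything.
    """
    if level_db > threshold_db:
        level_db = threshold_db + (level_db - threshold_db) / ratio
    return level_db + makeup_db

loud = compress_db(-5.0)     # 10 dB over the threshold becomes 5 dB over
quiet = compress_db(-30.0)   # below the threshold, so unchanged

# With 7.5 dB of boost a 0 dB peak lands back at 0 dB,
# while a -30 dB background is lifted to -22.5 dB.
peak_after_boost = compress_db(0.0, makeup_db=7.5)
background_after_boost = compress_db(-30.0, makeup_db=7.5)
```

In this example the gap between the peak and the background shrinks from
30 dB to 22.5 dB, which is why compression matters for the 20 dB test.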



Excerpt from Spinal Tap the movie (thx Wendy)


Nigel: This is a top to a-you know, what we use on stage, but it's very,
very special because if you can see...
Marty: Yeah...
Nigel: The numbers all go to eleven. Look...right across the board.
Marty: Ahh...oh, I see....
Nigel: Eleven...eleven...eleven....
Marty: ..and most of these amps go up to ten....
Nigel: Exactly.
Marty: Does that mean it's...louder? Is it any louder?
Nigel: Well, it's one louder, isn't it? It's not ten. You see, most, most
blokes, you know, will be playing at ten. You're on ten here...all the way
up...all the way up....
Marty: Yeah....
Nigel: ...all the way up. You're on ten on your guitar.. where can you go
from there? Where?
Marty: I don't know....
Nigel: Nowhere. Exactly. What we do is if we need that extra push over the
cliff, you know what we do?
Marty: Put it up to eleven.
Nigel: Eleven. Exactly. One louder.
Marty: Why don't you just make ten louder and make ten be the top number and
make that a little louder?
Nigel: These go to eleven.



Received on Thursday, 18 November 2004 21:29:44 UTC
