[whatwg] Volume and Mute feedback on <video>

On Sat, Aug 21, 2010 at 10:38 AM, Silvia Pfeiffer <silviapfeiffer1 at gmail.com
> wrote:

> On Sat, Aug 21, 2010 at 5:57 AM, Ian Hickson <ian at hixie.ch> wrote:
>
>>
>>
>> On Thu, 10 Jun 2010, Silvia Pfeiffer wrote:
>> >
>> > That requires editing the resource. Think about it from a process
>> > point-of-view: you're a Web developer and have been given a set of media
>> > resources to put on a Website. As you put it all together, you notice
>> that
>> > the volume of the different files is different and thus playing them
>> back
>> > next to each other will create a very confusing user experience. Do you
>> > really want to shoot the files back to the production to adjust the
>> volume
>> > settings so they are all similar? If you're under time pressure, you'd
>> > probably much prefer just setting a volume attribute on each so they all
>> > play back with the same level.
>>
>> What if you notice that each file uses different fonts for titles, or each
>> video is colour-corrected differently, or uses a different lens, or has a
>> different aspect ratio, or four are filmed during the day and one during
>> the night and the latter one really stands out in a bad way? I don't think
>> we should assume that just because we can do post-processing in the
>> client, it's the right thing to do. :-)
>>
>
>
> I really wouldn't classify volume change as part of "video editing". My TV
> remote has a volume up and down button that allows me to increase the volume
> beyond what the video was originally encoded in. Do we really want to refuse
> such a simple functionality to both users and web developers?
>
>

After a brief discussion on irc, I want to address the volume, volume
setting, and volume normalization issue properly.

@volume is currently an attribute that takes values from 0 to 1, where 1
means to play the volume at which the media resource was created and define
that as 0dB. Thus, @volume isn't actually expressing what users generally
understand under volume, namely to be able to play back the resource at its
original level (the level that it was before it got recorded) and be able to
manipulate that level up or down. Instead, our @volume expresses relative
attenuation and we are only able to manipulate the gain down and not up
above what it is stored at.

If we lived in an optimal world, all audio resources would be normalized to
the same reference range and that range would be given as a perceived
loudness level (see http://en.wikipedia.org/wiki/Loudness). Then we would be
able to use the exact same setting for all our audio resources and always
get them at a level that we can rely on. We would actually display @volume
as a value between 0 and 1 where 0 is absence of sound and 1 is the loudest
that a human ear can bear without bursting (or even a bit louder than that)
and we would be able to represent each audio recording with its exact
perceived loudness on that scale, which is identical to what it was recorded
at. I believe this would be the optimal solution for a user wrt volume.

Even if we don't use loudness as a measure, a better situation would already
be where we have audio resources follow a normalised sound pressure level
range. It would be simple to map a encoded value of x to a fixed sound
pressure in Pascal. Instead, the audio world always deals in relative
values, namely in dB. And unfortunately most of the time what is 0dB for a
digital file 1 is not the same perceived loudness as what is 0dB for file 2
- maybe because the microphone was bad, maybe because the mixer was badly
set up, maybe because the recording settings on the computer were screwed,
maybe because transcoding settings were screwed - there are a gazillion
reasons. The fact is: this is reality and we have to deal with it.

On TV and Radio, we have a world that has somewhat managed to deal with this
situation. When we continue to listen to a single radio station, we expect
the music pieces to all be played back at approximately the same perceived
loudness, and the same on TV. (Yes, they manipulate it sometimes to make,
e.g. advertising louder, but that is conscious manipulation of users and not
an inherent problem). This is a big challenge for the TV and radio stations,
but they generally manage to stay fairly consistent within themselves. This
is because they cannot expect the user to continuously have to change their
volume settings on their radio or TV station just to be able to keep the
sound within a comfortable range.

On the Web no such consistency is available. And with the current way in
which audio and video work it's not even possible to create such a
consistency within a single Web page.

There are actually two issues at hand here.

1. Amplification

Firstly, it's the problem that audio and video files are not encoded with
the same reference sound pressure, resulting in files that are extremely
loud at the @volume=1 setting, while others are almost imperceptible even at
@volume=1. We can deal with the first situation: we can turn the knob down
on such a file. We can, however, not deal with the second situation. We have
no way right now to deal the know up and amplify the sound pressure beyond
what its maximum setting is. I believe the reason for this is that
amplification can cause artifacts and that's acceptable.

We can of course get out of this by introducing an additional attribute that
lets us amplify the sound pressure level of the resource (something like a
preamp). But that's not really that accessible to the user.

Or, if it was possible, we could even introduce a @normalize attribute that
would normalize the @volume range to a loudness range within human
perception. The normalization, however, has to deal with lost information,
namely that the maximum sound pressure of the original sound isn't available
any more, and thus has to make some assumptions. Trying to do this on a
progressively downloading resource will lead to constantly changing volume
ranges, so it's not really practical.

What is most practical is actually to allow the @volume to have higher
settings than 1 and to set the slider 1 for the loaded resource. Anything
higher than 1 is amplification beyond the resource's original gain, anything
lower just what it is today. Obviously, the question is, what value do you
stop at. iTunes takes it amplification up to +12dB. Maybe that can be mapped
to "2" and then the increase be done logarithmically. Some value has to be
picked - unless we can introduce a slider that dynamically increases its
upper level as users keep hitting it.


2. Web author adjustment

It's this second issue that I was originally pointing out, even though I got
side-tracked with the much bigger problem of loudness.

As a Web page author you are basically in the same position as a radio
channel or a TV station: you want to publish all the video or audio files at
the same loudness so that a user can make the volume settings on their
computer once and not have to make any more changes for listening to more of
your content. This is particularly important if, e.g., you have a playlist
of videos and they play one after the other (as a program), or you have all
the videos displayed on the page for people to click on and, say, watch in a
lightbox.

Most of the time, you are just the publisher of content and not the author
of the content, so you will likely not be able to go back into a studio with
the file and make adjustments. Think, for example, of a online radio station
that gets user created content sent in to publish, but also Grandma Peters
who wants to put all the videos of her grandchildren onto a Web page.

Assuming they will listen into the piece before publishing, they will
determine what volume adjustment would fit with the standard of their other
media resources. It would be nice to just be able to remember this setting
as the initial setting for when people load this resource. It would be
simple to satisfy this need by just exposing the @volume attribute as a
content attribute.

Another example is a Web page that has music playing back as constant
background, but allows you to click on talks (e.g. a list of presentations)
and the presentation will play in parallel to the music. You'd want the
music always to play back more quietly, so setting an initial @volume on the
<audio> element would totally make sense. It's very much parallel to what
@opacity means to visual content.


Uff, this got longer than expected...

Cheers,
Silvia.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.whatwg.org/pipermail/whatwg-whatwg.org/attachments/20100821/c15f1238/attachment-0001.htm>

Received on Friday, 20 August 2010 23:28:07 UTC