Re: About AudioPannerNode

From: Chris Rogers <crogers@google.com>
Date: Mon, 25 Jun 2012 12:20:03 -0700
Message-ID: <CA+EzO0=8GJNObv=Cx4PrJDE3XSsAQUE0otuAiDzvwBCVPurdUA@mail.gmail.com>
To: Marcus Geelnard <mage@opera.com>
Cc: public-audio@w3.org

On Thu, Jun 21, 2012 at 12:06 AM, Marcus Geelnard <mage@opera.com> wrote:

> On 2012-06-19 19:39:53, Chris Rogers <crogers@google.com> wrote:
>
>  On Tue, Jun 19, 2012 at 2:50 AM, Marcus Geelnard <mage@opera.com> wrote:
>>
>
>
>
>  Lastly, how should complex spatialization models (thinking about HRTF
>>> here) be handled (should they even be supported)? I fear that a fair
>>> amount of specification and testing must be done to support this, not to
>>> mention that HRTF in general relies on data files from real-world
>>> measurements (should these be shared among implementations or not?)
>>>
>>
>>
>> I'm happy to share the measured HRTF files that we use in WebKit.  I'm not
>> sure if they should be normative or not...
>>
>
> I think we need to decide how strict the spec should be here. I generally
> prefer a model where the spec describes an optimal algorithm, and if
> required it can allow for deviations from that optimal solution to a
> certain, well-defined degree (in whatever terms are suitable for the
> algorithm at hand).
>
> I also like to think in terms of testing. For instance, it would be
> impossible to write a useful test that verifies a loose statement such as
> "must create the impression of positioning the input signal at the given 3D
> position, relative to the listener". On the other hand, it would be much
> easier to test against a normative HRTF data set, since the resulting
> signal should be accurate to (almost) floating point precision.
>
> Here's where I can't really decide which way is better. I'm not sure if
> there are any strong use-cases for allowing implementations to use
> different data sets.
>

I'm open to either way.  I'm not sure if this would come up in practice,
but one case for allowing different data sets is if an implementation had
access to OS-level services providing this spatialization.
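
For what it's worth, testing against a normative data set could be fairly
mechanical: render a source through the panner and compare each output
sample to a reference rendering within a small tolerance. A rough sketch in
plain Python (the helper name and the tolerance are illustrative choices,
not anything from the spec):

```python
import math

def assert_renders_match(rendered, reference, tol=1e-6):
    """Sample-by-sample comparison of a rendered signal against a
    normative reference, within an absolute tolerance `tol`.
    (Helper name and tolerance are illustrative, not normative.)"""
    assert len(rendered) == len(reference), "length mismatch"
    for i, (got, want) in enumerate(zip(rendered, reference)):
        if abs(got - want) > tol:
            raise AssertionError(f"sample {i}: got {got!r}, want {want!r}")

# Trivial self-check: identical 440 Hz tones at 44.1 kHz pass.
reference = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(64)]
assert_renders_match(list(reference), reference)
```

With a shared data set the reference signal is fixed up to rounding, so a
tolerance on the order of floating-point error is enough; with
implementation-specific data sets no such bit-level test is possible.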


>
> Related questions:
>
> How big is the data set?
>

It's not too big - about 250 KB, and with some work it could be made even
smaller.


>
> To what degree can you modify it (e.g. crop impulse responses or reduce
> the angular resolution) without negatively affecting the 3D effect too
> much? (has this been experimented with?)
>

I've played around a little bit.  Currently we're using a length of 256
sample-frames at 44.1 kHz (already cropped from 512).  I think they can be
cropped down to 128 and still retain much of the character, but at slightly
lower quality.  We've been getting good enough performance with the 256
length that cropping further hasn't been necessary for us.
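
If anyone wants to experiment with cropping, the main thing to watch for is
the discontinuity at the truncation point; a short fade-out avoids clicks.
A sketch in plain Python (the raised-cosine fade and the 16-frame fade
length are illustrative choices on my part, not what WebKit does):

```python
import math

def crop_impulse_response(ir, new_length, fade=16):
    """Crop an impulse response to `new_length` frames and apply a short
    raised-cosine fade-out over the last `fade` frames, so the truncation
    doesn't introduce an audible discontinuity.
    (The fade shape and length are illustrative, not normative.)"""
    cropped = list(ir[:new_length])
    for i in range(fade):
        # Ramp smoothly from ~1 down to exactly 0 across the fade region.
        w = 0.5 * (1.0 + math.cos(math.pi * (i + 1) / fade))
        cropped[new_length - fade + i] *= w
    return cropped

# e.g. crop a 256-frame response down to 128 frames
ir256 = [math.exp(-n / 32.0) for n in range(256)]
ir128 = crop_impulse_response(ir256, 128)
```

Reducing the angular resolution is a separate experiment - dropping
measurement positions and relying on interpolation between the remaining
ones - and I haven't quantified how far that can go.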


>
> Have you considered/compared against other available HRTF measurements,
> and if so, why were they not chosen?


Yes, I've also played around with the MIT KEMAR measurements (recorded from
a dummy head), but found them to be of lower quality:
http://sound.media.mit.edu/resources/KEMAR.html

And also with the CIPIC database, which is OK:
http://interface.cipic.ucdavis.edu/sound/hrtf.html

But I prefer the IRCAM/AKG ones:
http://recherche.ircam.fr/equipes/salles/listen/

Chris

>
> /Marcus
>
> --
> Marcus Geelnard
> Core Graphics Developer
> Opera Software ASA
>
Received on Monday, 25 June 2012 19:20:42 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Monday, 25 June 2012 19:20:48 GMT