Re: Simplifying specing/testing/implementation work

Citerar Jussi Kalliokoski <jussi.kalliokoski@gmail.com>:

> Obviously SIMD code is faster than addition in JS now, for example. And
> yes, IIR filter is a type of a convolution, but I don't think it's possible
> to write an efficient IIR filter algorithm using a convolution engine --
> after all, a convolution engine should be designed to deal with a FIRs.

You can split the IIR filter into two parts: an FIR part and a  
recursive part. If you skip the recursive part, you get plain  
convolution.

> Not to mention that common IIR filters have 4 (LP, HP, BP, N) kernels,

I was thinking more along the lines of using the more generic form of  
the filter function (see the Matlab filter() function [1]), which  
supports everything from first order IIR filters to FIR-filters with  
1000s of taps (or more, obviously).

> And as far as IIR filter performance goes, I think SIMD
> instructions offer very little usefulness in IIR algorithms, since they're
> so linear.

Actually, I made a simple benchmark on my AMD x2, 64-bit Linux, using  
an IIR filter function identical to the filter function in the  
BiquadFilterNode (excluding buffer edge handling).

JavaScript code:

function filter (y, x, b, a) {
   var b0 = b[0], b1 = b[1], b2 = b[2];
   var a1 = a[0], a2 = a[2];
   var len = x.length;
   for (var k = 2; k < len; ++k)
     y[k] = b0 * x[k] + b1 * x[k-1] + b2 * x[k-2] - a1 * y[k-1] - a2 * y[k-2];
}

C++ code:

void filter(float* y, const float* x, unsigned len, const float* b,  
const float* a)
{
   float b0 = b[0], b1 = b[1], b2 = b[2];
   float a1 = a[0], a2 = a[2];

   // FIR part
   for (unsigned k = 2; k < len; ++k)
     y[k] = b0 * x[k] + b1 * x[k-1] + b2 * x[k-2];

   // Recursive part
   for (unsigned k = 2; k < len; ++k)
     y[k] -= a1 * y[k-1] + a2 * y[k-2];
}

Note: In the C++ version I split the FIR and recursive parts into two  
separate loops because somehow g++ managed to optimize that better  
than doing a single loop (looking at the assembler output, it actually  
SIMD:ified the FIR part for me - impressive!). Also, it illustrates  
that you can do the FIR part first, with optimizations (e.g. FFT-based  
convolution or SIMD), and then append the recursive part as a second  
pass.

Benchmark results (using 128 samples long buffers):

    C++: 529.4 Ms/s
Chrome:  22.0 Ms/s
     FF:  13.0 Ms/s
  Opera:   9.0 Ms/s

So, we actually get a >20x speedup when going from the fastest JS  
engine to native C++ code (g++ -O3 -ffast-math). I think that's  
motivation enough to implement it as a native function ;)

Regards,

   Marcus


[1] http://www.mathworks.se/help/techdoc/ref/filter.html

--
Marcus Geelnard
Core Graphics Developer
Opera Software ASA

Received on Thursday, 19 July 2012 21:12:23 UTC