Acoustic visualization and post-process : request for feedback and use cases

timothee.chabat · June 3, 2022, 1:30pm

OK, so currently there are indeed several problems:

the documentation is not explanatory enough and is ill-formed, which is why only the getters are documented.
the AverageFft parameter was never set, so the BlockSize and BlockNumber parameters were ignored.

I’ll soon fix this so it can work as expected (see explanation below).

Let me explain the options a bit, so you can tell me if some names should be changed or if parameters should be added to or removed from the API:

OptimizeForRealInput: basically, it switches between the equivalent of scipy.fft.fft and scipy.fft.rfft. This affects the output size (the output for real input is about half the size).
WindowingFunction: the windowing kernel to use. A kernel the size of the input signal is generated and applied before doing the FFT.
Average FFT per block: when off, the filter computes a single FFT on the whole signal. If the signal is big, this can take some time. To trade a bit of precision for a much shorter computation time, one can average the FFT by turning this option on. Averaging is controlled by two other parameters, NumberOfBlocks and BlockSize. The algorithm does the following:
1. Create a windowing kernel of size BlockSize (or half of it if OptimizeForRealInput is on)
2. Extract the first BlockSize samples of the input signal
3. Apply the windowing kernel and then the FFT to it. The resulting FFT is of size BlockSize (or half of it if OptimizeForRealInput is on)
4. Store this FFT in a buffer
5. Repeat the process from step 2 NumberOfBlocks times, shifting the starting point each time the samples are extracted. The idea is that the block “slides” from the start of the signal to the end. See below for an explanation of how the shift occurs. Green is the input signal and red marks the extracted blocks.
6. Once all blocks are extracted, average them all and return this result.
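
To make the six steps above concrete, here is a rough NumPy sketch. It is not the filter's actual code. The function name, the Hanning window, and especially the evenly spaced block starts are my assumptions for illustration. It also applies a BlockSize-length window to the samples before the FFT and averages the complex spectra as they are, which may differ from what the filter does.

```python
import numpy as np


def averaged_fft(signal, block_size, number_of_blocks, optimize_for_real_input=True):
    """Average the windowed FFT of sliding blocks (illustrative sketch only)."""
    signal = np.asarray(signal, dtype=float)
    if number_of_blocks < 1:
        raise ValueError("number_of_blocks must be at least 1")
    if block_size > signal.size:
        raise ValueError("block_size must not exceed the signal length")

    # Step 1: windowing kernel. Hanning is an arbitrary choice for this sketch.
    window = np.hanning(block_size)
    fft = np.fft.rfft if optimize_for_real_input else np.fft.fft

    # Assumption: block starts are evenly spaced between the first and the
    # last position where a full block still fits in the signal.
    last_start = signal.size - block_size
    starts = np.linspace(0, last_start, number_of_blocks).astype(int)

    # Steps 2 to 5: extract, window and transform each block, then buffer it.
    buffer = [fft(signal[s:s + block_size] * window) for s in starts]

    # Step 6: average all buffered spectra.
    return np.mean(buffer, axis=0)


# 50 Hz sine sampled at 1024 Hz, 4096 samples.
signal = np.sin(2 * np.pi * 50 * np.arange(4096) / 1024)
print(averaged_fft(signal, 512, 8).shape)                                 # (257,)
print(averaged_fft(signal, 512, 8, optimize_for_real_input=False).shape)  # (512,)
```

The two printed shapes show the output size effect of OptimizeForRealInput: BlockSize values for the full FFT, BlockSize // 2 + 1 for the real-input one.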

Normalize: this parameter has two effects. First, it converts the complex output of the FFT into a real output holding the norm of each complex value, and divides it by the energy of the windowing kernel. Second, it removes the mean of each block before doing the FFT. For example, a pure sinusoidal signal has a mean of 0, so this step has no effect on it.
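
As a rough illustration of the two effects of Normalize on a single block, here is a small NumPy sketch. The function name is hypothetical, and I am assuming that the "energy" of the kernel means the sum of its squared coefficients. The filter may define it differently.

```python
import numpy as np


def normalized_block_spectrum(block, window):
    """Normalized magnitude spectrum of one block (illustrative sketch only)."""
    block = np.asarray(block, dtype=float)
    window = np.asarray(window, dtype=float)

    # Second effect: remove the mean of the block before the FFT.
    centered = block - block.mean()

    spectrum = np.fft.rfft(centered * window)

    # First effect: complex values become norms, divided by the window energy.
    # Assumption: energy is the sum of squared window coefficients.
    energy = np.sum(window ** 2)
    return np.abs(spectrum) / energy


# Adding a constant offset to a block does not change the result,
# because the mean is removed before the FFT.
window = np.hanning(256)
sine = np.sin(2 * np.pi * 8 * np.arange(256) / 256)
a = normalized_block_spectrum(sine, window)
b = normalized_block_spectrum(sine + 5.0, window)
print(np.allclose(a, b))  # True
```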

The overall algorithm has been crafted with people from the acoustics world. It may differ from Welch's method, but it should still be usable (once I correct the bugs).