Hello Yoshimi-devs,

during the last days I made a rather nasty observation -- unfortunately without reaching any conclusive results.

It started when I noticed a small numeric difference between the waveform as generated before and after the "padthread" refactoring. While in itself this difference is certainly too small to be noticeable (-70dB), I still wanted to find out the reason, because the "padthread" branch brings many rather deep refactorings regarding memory management, and moreover I attempted to make the usage of overtone index numbers coherent, and thus I could have introduced a subtle bug somewhere.

So I drilled into the simplest test showing those differences: BasicADD.test. I applied my changeset for dumping intermediary computation results, and it was immediately obvious that the differences seem to emerge from our old friend, the AnalogFilter. However, I did /not/ change anything of relevance there, so I added detailed dumping of the filter's internal pipelines. This showed that a smallish difference is injected into the filter's *input* line, intermittently and irregularly (but totally reproducibly). After about 5 buffers of computation, those differences have accumulated sufficiently within the filter's feedback line to become noticeable on the global output, and over time the differences build up more and more.

Now this observation raised some concerns, since the input of the filter must be drawn from the wavetable, right? However, a complete dump of the generated spectra and wavetables showed absolutely no difference -- which in itself is comforting and resolves some of my apprehension. But what causes those damn differences then?

The only relevant part in between is the *interpolation* applied when reading the wavetable. This is a piece of code where the computation spends a considerable fraction of the overall synth generation time, and it is highly optimised:
    int poshi = oscposhi[nvoice][k];
    int poslo = oscposlo[nvoice][k] * (1<<24);
    int freqhi = oscfreqhi[nvoice][k];
    int freqlo = oscfreqlo[nvoice][k] * (1<<24);
    float *smps = NoteVoicePar[nvoice].OscilSmp;
    float *tw = tmpwave_unison[k];
    for (int i = 0; i < synth->sent_buffersize; ++i)
    {
        tw[i] = (smps[poshi] * ((1<<24) - poslo) + smps[poshi + 1] * poslo)
                / (1.0f*(1<<24));
        poslo += freqlo;
        poshi += freqhi + (poslo>>24);
        poslo &= 0xffffff;
        poshi &= synth->oscilsize - 1;
    }
    oscposhi[nvoice][k] = poshi;
    oscposlo[nvoice][k] = poslo/(1.0f*(1<<24));
As you can see, the fractional distance between samples in the wavetable is quantised to an integer with 24 bits, which is reasonable, since the floating point mantissa is known to have a maximum resolution between 23 and 24 bits:

* the smallest float number above 1.0 is 1.0 + 2^-23
* the largest float number below 1.0 is 1.0 - 2^-24

Btw, a step of 2^-23 relative to 1.0 corresponds to an attenuation of about -138dB (FS). So this is the finest step which can be represented on a waveform rendered in float at maximum amplitude (the resolution for smaller values is finer, since they are represented using a smaller exponent; thus the step at maximum amplitude is the weak spot of float samples).

And indeed, it turned out that the code quoted above produces intermittent numeric glitches when compiled with optimisation. And those glitches are *much larger than a flip of the last bit*: I saw flips of the 5th and the 6th last bit of the mantissa. Here I used the "forThisCPU" setting, which translates into
-O3 -march=native -mtune=native
NOTE: this setting does not use -ffast-math, and thus the reason must be the well known "impedance mismatch" between the normal processor floating point engine and the SSE extensions. At least that is my conclusion.

Now, what triggered those differences? The code isn't directly changed by "padthread", but the meaning of the access operator was changed:

* in the old code, tw[i] and smps[poshi] did an indexed access via float*
* in the new code, these are overloaded inline operators

And, seemingly, the introduction of that changed memory access prevented the more aggressive optimisation by the compiler. The generated assembly in fact looks quite different (while understanding the details beyond some landmarks surpasses my knowledge of assembly and CPU internals). However, I have verified the numbers at several incidents of the difference, both with a standalone C++ program using floating point numbers and with the calculator desktop application "speedcrunch". In all cases, the new code produced the more accurate numbers. However, when I add dumping of intermediary results, both the old and the new code produce identical (and more accurate) results. So this thing qualifies as a "Heisenbug".

So what can we do? Nothing, it seems! We are at the mercy of the compilers/optimisers plus the innards of the CPU, which just happen to flip some minor bits in the mantissa if they feel like it. Whew. In the end we should call ourselves happy when all we have to worry about are some minor bits in the mantissa within sound synthesis table interpolation.

-- Hermann

_______________________________________________
Yoshimi-devel mailing list
Yoshimi-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/yoshimi-devel