I have been thinking about why CMPTK using an MDCT dictionary with Kaiser-Bessel windows produced no increases in energy, unlike all the other dictionaries I have been trying. Maybe it is a confluence of window shape and real signals being approximated by complex atoms, as well as various approximations being made inside MPTK.
So I tried something. (Warning: the following is a rather rambling record of observations, probably what happens inside Dr. House’s head.)
Let’s try a dictionary of atoms created by modulating a Gaussian window of scale 128 samples, and hopped by 1 sample. I find increases in energy, the largest of which is about 0.6% of the energy. Here is a picture of the residual energy decays for example 1 (attack):
Now let’s try a dictionary of atoms created by modulating a Hann(ing) window of scale 128 samples, and hopped by 1 sample. I find no increases.
Here is a picture of the residual energy decays for example 1 (attack):
The Gaussian dictionary seems to do a little better, even though CMPTK was at times increasing the residual energy.
I try the same thing with a cosine window: no errors (that is, no increases in residual energy). With a rectangular window: no errors either. But both of these perform worse than Hann(ing), except for the sinusoidal signal (example 3).
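For reference, plain matching pursuit with unit-norm atoms and exactly computed inner products can never increase the residual energy, since each step removes exactly the squared inner product. The sketch below, in plain Python, is one way to check that for the four window shapes above. It is not MPTK or CMPTK code: the function names, the Gaussian width (`sigma = 0.15 * length`), the toy "attack" signal, and the small sizes are all assumptions chosen so that it runs quickly.

```python
import math

def make_window(kind, length):
    """Window samples as a list (one common convention per shape)."""
    mid = (length - 1) / 2
    if kind == "rect":
        return [1.0] * length
    if kind == "hann":
        return [0.5 - 0.5 * math.cos(2 * math.pi * (n + 0.5) / length)
                for n in range(length)]
    if kind == "cosine":
        return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]
    if kind == "gauss":
        sigma = 0.15 * length  # assumed width, not MPTK's parameterization
        return [math.exp(-0.5 * ((n - mid) / sigma) ** 2) for n in range(length)]
    raise ValueError("unknown window: " + kind)

def build_dictionary(kind, length, hop, fft_size, signal_len):
    """Unit-norm real atoms: window times cos/sin at each FFT bin and hop."""
    w = make_window(kind, length)
    atoms = []
    for start in range(0, signal_len - length + 1, hop):
        for k in range(fft_size // 2 + 1):
            for phase in (0.0, -math.pi / 2):  # cosine and sine versions
                g = [w[n] * math.cos(2 * math.pi * k * n / fft_size + phase)
                     for n in range(length)]
                norm = math.sqrt(sum(x * x for x in g))
                if norm > 1e-9:  # skip the vanishing sine at DC and Nyquist
                    atoms.append((start, [x / norm for x in g]))
    return atoms

def matching_pursuit(signal, atoms, n_iter):
    """Plain MP; returns the residual energy before and after each step."""
    r = list(signal)
    energies = [sum(x * x for x in r)]
    for _ in range(n_iter):
        best_ip, best = 0.0, None
        for start, g in atoms:
            ip = sum(r[start + n] * g[n] for n in range(len(g)))
            if abs(ip) > abs(best_ip):
                best_ip, best = ip, (start, g)
        if best is None:  # residual is orthogonal to every atom
            break
        start, g = best
        for n in range(len(g)):
            r[start + n] -= best_ip * g[n]
        energies.append(sum(x * x for x in r))
    return energies

def energy_increases(energies, tol=1e-12):
    """Iterations at which the residual energy went up (beyond rounding)."""
    return [i for i in range(1, len(energies))
            if energies[i] > energies[i - 1] + tol * energies[0]]

# Toy "attack": silence followed by a decaying sinusoid (64 samples).
signal = [0.0] * 24 + [math.exp(-0.1 * n) * math.sin(0.9 * n) for n in range(40)]

for kind in ("gauss", "hann", "cosine", "rect"):
    atoms = build_dictionary(kind, length=16, hop=4, fft_size=32, signal_len=64)
    energies = matching_pursuit(signal, atoms, n_iter=10)
    print(kind, "increases at iterations:", energy_increases(energies))
```

With an exact search like this one, the list of increases should be empty for every window. So an increase reported by a real implementation points at an approximation somewhere in the atom selection or update, not at the window shape by itself.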
Now, back to the modulated Gaussian windows.
Changing the scale to 256 samples, making the hop 8, but doubling the variance of the window, I get no errors!
Setting its FFT size to 512 samples (zero padding), I get NO errors.
Changing its scale to 64 samples, keeping the hop and zero padding, I get lots of errors.
Removing the zero padding, NO errors.
Changing its scale to 130 (not a power of 2), putting zero padding back, lots of errors.
Removing the FFT size option, NO errors.
Changing the variance back to the original — lots of errors.
So it seems that the window shape and/or the zero padding have a significant influence on these errors, either separately or together. The errors do not seem to be caused by the phase optimization of the real atoms using complex ones.
To explore this further, let’s look at a multiscale dictionary. This one is composed of modulated Gaussian windows of scale/hop: 128/8, 256/8, and 512/16. There are NO errors.
If I decrease the size of the smallest scale to 64, I get errors.
If I double the variance of the windows, I get NO errors.
If I make the windows all Hann(ing), I get NO errors.
If I give the 64-scale window zero padding to 256 samples, I get lots of big errors.
Putting the zero padding on the largest window, out to 1024 samples, gives errors.
However, putting the zero padding on the second largest window, out to 512 samples, gives NO errors. (The signal has length 1024, so I wonder if that is a problem.)
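One way to test the phase-optimization question outside MPTK is to compare two ways of computing how much energy the best real atom removes, given a complex atom g. The exact way projects the residual onto the span of Re(g) and Im(g). A common shortcut takes 2|<r, g>|^2, which is only correct when g is orthogonal to its own conjugate. The sketch below computes both. It is a set of assumptions, not MPTK's code: the function names, the Gaussian width, and the choice of bins are made up for illustration, and whether MPTK uses any such shortcut in the cases above is not something this sketch can tell.

```python
import cmath
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gauss_window(length, rel_sigma=0.15):
    """Gaussian window; rel_sigma is an assumed width, not MPTK's parameter."""
    mid = (length - 1) / 2
    return [math.exp(-0.5 * ((n - mid) / (rel_sigma * length)) ** 2)
            for n in range(length)]

def complex_atom(window, k, fft_size):
    """Unit-norm complex atom: window times exp(2*pi*i*k*n/fft_size)."""
    g = [w * cmath.exp(2j * math.pi * k * n / fft_size)
         for n, w in enumerate(window)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in g))
    return [x / norm for x in g]

def conj_correlation(g):
    """|<g, conj(g)>|: 0 if Re(g) and Im(g) are orthogonal with equal energy,
    close to 1 if they are nearly collinear."""
    return abs(sum(x * x for x in g))

def exact_energy(r, g):
    """Energy of the projection of real r onto span{Re(g), Im(g)}."""
    c = [x.real for x in g]
    s = [x.imag for x in g]
    cc, ss, cs = dot(c, c), dot(s, s), dot(c, s)
    rc, rs = dot(r, c), dot(r, s)
    det = cc * ss - cs * cs
    if det < 1e-12:  # degenerate span (e.g. DC): project onto one vector
        return rc * rc / cc if cc > ss else rs * rs / ss
    return (ss * rc * rc - 2 * cs * rc * rs + cc * rs * rs) / det

def approx_energy(r, g):
    """Shortcut 2|<r, g>|^2, valid only when conj_correlation(g) is 0."""
    ip = sum(x * y.conjugate() for x, y in zip(r, g))
    return 2 * abs(ip) ** 2

window = gauss_window(64)
# Zero padding 64 -> 256 adds very low bins such as k=1; k=64 is mid-band.
for k in (1, 64):
    g = complex_atom(window, k, 256)
    c = [x.real for x in g]
    r = [x / math.sqrt(dot(c, c)) for x in c]  # unit-energy "residual"
    print("bin", k, "conj corr", conj_correlation(g),
          "exact", exact_energy(r, g), "approx", approx_energy(r, g))
```

Here the residual has unit energy and lies in the span, so the exact value is 1 in both cases. At the mid-band bin the shortcut agrees, but at bin 1 it claims more energy than the residual contains. Sweeping this over the scales and FFT sizes tried above would show which combinations put atoms in that regime.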
Let’s add more to this multiscale dictionary.
Putting in a Dirac basis gives no errors.
Putting in a window of size/hop 64/8 produces errors.
Changing that scale to 12 produces NO errors.
Changing that to 45, 34, or 24 creates errors.
It seems like when all the atom scales are large enough, errors are less likely.
But sometimes not.
Perhaps the problem is also that my test signals are 1024 samples.
I am trying with the Glockenspiel example, and no errors are being produced…
So, it looks as if CMPTK is implemented correctly; it just has “features.”
Something else is going on that is causing the refinement iteration to skip over the best atom. And that something appears more and more like it has nothing to do with what I did or didn’t do.
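To pin down where a refinement step skips the best atom, one option is to wrap each re-selection in an energy check. The sketch below assumes a generic cyclic refinement step (put one atom back into the residual, search again, subtract the winner). With an exact search this can never raise the residual energy, because the old atom is still a candidate. So the guard only triggers when the search is inexact. Everything here is hypothetical and not taken from CMPTK: the names `refine`, `best_atom`, and `skip_best`, and the toy dictionary of Diracs plus DCT-type cosines.

```python
import math

def inner(x, g):
    return sum(a * b for a, b in zip(x, g))

def energy(x):
    return inner(x, x)

def best_atom(x, atoms):
    """Exhaustive search: index of the unit-norm atom most correlated with x."""
    return max(range(len(atoms)), key=lambda j: abs(inner(x, atoms[j])))

def skip_best(x, atoms):
    """Deliberately faulty search for the demo: always returns atom 0."""
    return 0

def refine(residual, model, i, atoms, search=best_atom):
    """Re-select entry i of model = [(atom_index, coefficient), ...].

    Returns (residual, model, accepted). The old atom is kept when the
    new choice would raise the residual energy.
    """
    j_old, c_old = model[i]
    restored = [r + c_old * g for r, g in zip(residual, atoms[j_old])]
    j_new = search(restored, atoms)
    c_new = inner(restored, atoms[j_new])
    candidate = [r - c_new * g for r, g in zip(restored, atoms[j_new])]
    if energy(candidate) > energy(residual):
        return residual, model, False  # guard: reject an energy increase
    return candidate, model[:i] + [(j_new, c_new)] + model[i + 1:], True

# Toy dictionary: Diracs plus unit-norm DCT-type cosines, length 16.
N = 16
atoms = [[1.0 if n == m else 0.0 for n in range(N)] for m in range(N)]
for k in range(N):
    g = [math.cos(math.pi * k * (n + 0.5) / N) for n in range(N)]
    norm = math.sqrt(inner(g, g))
    atoms.append([x / norm for x in g])

signal = [math.sin(0.7 * n) + (2.0 if n == 5 else 0.0) for n in range(N)]

# Three plain MP iterations give an initial model to refine.
residual, model = list(signal), []
for _ in range(3):
    j = best_atom(residual, atoms)
    c = inner(residual, atoms[j])
    residual = [r - c * g for r, g in zip(residual, atoms[j])]
    model.append((j, c))

# One refinement cycle with the exact search, then one with the faulty one.
for search in (best_atom, skip_best):
    for i in range(len(model)):
        before = energy(residual)
        residual, model, accepted = refine(residual, model, i, atoms, search)
        print(search.__name__, "atom", i, "accepted:", accepted,
              "energy change:", energy(residual) - before)
```

Logging the rejected re-selections, rather than silently keeping the old atom, would show which scales and positions the faulty search lands on.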