Some Experiments with Glockenspiel

Today I have been experimenting with CMPTK and a real audio signal.
With this larger signal, the energy errors that have plagued me this last week seem to be much rarer.

Below we see the residual energy decay of this example with MP and CMPTK using a dictionary of Gabor atoms (Gaussian window) of only two scales: 128/32 and 4096/64/8192 (scale/hop/FFT size, where the FFT size is given only if it differs from the scale).
I run 200 iterations.
CMP-\(l\) is implemented such that all representations at each order undergo at least one cycle. When \(l = 5\), more refinement cycles can be performed until the ratio of residual energies before and after a cycle is less than 1.002, or less than about 0.009 dB.
In the same graph I also plot the “cycle energy decrease,” which is the ratio of the residual energies before and after the entire refinement at each iteration.
We find a few large spikes of improvement.
At the end of 200 iterations, the models produced by CMP have an error 2.2 dB better than that produced by MP.


glock2_energydecay_double.png
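
To make the refinement and its stopping rule concrete, here is a minimal Python sketch of MP with cyclic refinement. It is not the CMPTK code: the dictionary is a random unit-norm matrix rather than Gabor atoms, and the names `cmp_decompose`, `best_atom`, `min_cycles`, `max_cycles` and `ratio_thresh` are hypothetical. It assumes that CMP-\(l\) means at least one and at most \(l\) cycles per iteration, stopping early once the before/after energy ratio drops below 1.002.

```python
import numpy as np

def ratio_to_db(ratio):
    """Convert an energy ratio to decibels."""
    return 10.0 * np.log10(ratio)

def best_atom(D, r):
    """Index and coefficient of the unit-norm column of D most correlated with r."""
    c = D.T @ r
    k = int(np.argmax(np.abs(c)))
    return k, float(c[k])

def cmp_decompose(x, D, n_iter, min_cycles=1, max_cycles=1, ratio_thresh=1.002):
    """Toy MP with cyclic refinement; max_cycles=0 reduces to plain MP.

    Assumed stopping rule: after at least min_cycles cycles, stop refining
    when (energy before cycle) / (energy after cycle) < ratio_thresh.
    """
    r = np.array(x, dtype=float)          # residual, starts as a copy of the signal
    idx, coef, energies = [], [], []
    for _ in range(n_iter):
        k, a = best_atom(D, r)            # ordinary MP selection
        r -= a * D[:, k]
        idx.append(k)
        coef.append(a)
        for cycle in range(max_cycles):
            before = float(r @ r)
            for j in range(len(idx)):
                r += coef[j] * D[:, idx[j]]        # put atom j back into the residual
                idx[j], coef[j] = best_atom(D, r)  # reselect the best atom in its place
                r -= coef[j] * D[:, idx[j]]
            after = float(r @ r)
            done_minimum = cycle + 1 >= min_cycles
            if done_minimum and (after <= 0.0 or before / after < ratio_thresh):
                break
        energies.append(float(r @ r))     # residual energy after this iteration
    return idx, coef, r, energies

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = rng.standard_normal((64, 256))
    D /= np.linalg.norm(D, axis=0)        # unit-norm atoms
    x = rng.standard_normal(64)
    _, _, _, e_mp = cmp_decompose(x, D, 20, max_cycles=0)
    _, _, _, e_cmp = cmp_decompose(x, D, 20, max_cycles=5)
    print("threshold in dB:", ratio_to_db(1.002))
    print("MP vs CMP residual energy (dB):", ratio_to_db(e_mp[-1] / e_cmp[-1]))
```

Each reselection can only keep or lower the residual energy, because putting the old atom back with its old coefficient is always one of the candidates. That is why the per-cycle ratio is never below 1 in exact arithmetic.
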
Below, I show the time-domain residual signals resulting from both MP and CMP-1.
For the most part, the CMP-1 error signal is below that of MP; but strangely the first attack causes more problems for CMP-1 than the other attacks.

glock2_errorsdouble.png
Let’s have a listen to the sounds.
Here is the residual due to MP;
and here is the residual due to CMP-1.
I can’t really tell much difference, except the CMP-1 residual is a bit quieter.
I don’t hear any significant differences in the pre-echoes of the attacks,
for which I was candidly hoping.
But if we take a closer look at how the attacks are being modeled by the
shorter atoms, we see some promising results.
Below I show each resynthesis aligned to the original signal using only the atoms of scale 128 samples (which is about 6 ms at the sampling rate of 22.05 kHz).
For the MP decomposition, 109 atoms out of 200 fit this description.
For both CMP decompositions, 105 atoms fit this description.
Except for the third, the attacks modeled by CMP look more cleanly synthesized than those modeled by MP, especially the second attack, which appears delayed by MP.

glock2_attacks.png
Now, what if we do not use the condition that the refinement process can end if there is no significant reduction in residual energy?
Using the same dictionary as above,
the figure below shows residual energy decay of this example with MP and CMPTK
with one or two refinement cycles (5 will take too long, but I might run it overnight).
This means that MP will have 200 atom selections and 200 subtractions,
and CMP-1 will have 20,100 atom selections and 40,200 additions, and CMP-2 will have 40,200 atom selections and 80,400 additions.
Compared with the figure above, I don’t see any benefit to forcing a certain number of refinement cycles — which is a good thing for reducing the computational complexity.
A look at the residual signals confirms this.
(I will not run \(l=5\) overnight.)

glock2_energydecay_doublef.png
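
As a sanity check on the operation counts quoted above, here is a small Python sketch. It rests on an assumption that reproduces those figures but is not spelled out in the text: at iteration \(n\) each forced cycle revisits all \(n\) atoms in the model, and each revisit costs one atom selection and two additions (one to put the old atom back, one to take the new atom out). The function name `cmp_operation_counts` is hypothetical.

```python
def cmp_operation_counts(n_iter, cycles):
    """Atom selections and additions for CMP with a fixed number of cycles.

    Assumes each cycle at iteration n revisits n atoms, and each revisit
    costs one selection plus two additions (add old atom back, subtract new).
    """
    selections = cycles * n_iter * (n_iter + 1) // 2   # cycles * (1 + 2 + ... + n_iter)
    additions = 2 * selections
    return selections, additions

for cycles in (1, 2, 5):
    selections, additions = cmp_operation_counts(200, cycles)
    print(f"CMP-{cycles}: {selections:,} selections, {additions:,} additions")
```

Under the same assumption, forcing five cycles at every iteration would cost five times the CMP-1 count, which is why that run is so much slower.
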
Now, we move on to a richer dictionary. Below we see the residual energy decay of this example with MP and CMPTK for a dictionary of Gabor atoms (Gaussian window) of eight scales: 32/8, 128/32, 256/64, 512/128, 1024/128, 2048/256, 4096/512/8192, 8192/1024/16384 (scale/hop/FFTsize if different from scale).
For these examples, I have kept the rule that at least one refinement cycle is required, but additional ones will be performed as long as the ratio of residual energies before and after a cycle is greater than 1.002, or greater than about 0.009 dB.
Compared to that of the two-scale dictionary above,
we see a better decay of the residual energy, but there appears to be less maximum improvement with the cycles.
Here the values extend up to nearly 0.15 dB; but for the two-scale dictionary the max improvement is twice that.
Just by eyeballing the two though, I think the mean improvements are about the same.

glock21_energydecay_multi.png
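
For reference, the scale/hop/FFT-size shorthand used for the dictionaries can be unpacked with a few lines of Python. This is only an illustrative sketch: the helper names `parse_atom_spec` and `duration_ms` are hypothetical, and the 22.05 kHz sampling rate is the one quoted earlier in this post.

```python
SAMPLE_RATE_HZ = 22050  # sampling rate stated in the post

def parse_atom_spec(spec):
    """Parse 'scale/hop' or 'scale/hop/fft' into (scale, hop, fft_size) in samples."""
    parts = [int(p) for p in spec.split("/")]
    if len(parts) == 2:
        scale, hop = parts
        fft_size = scale          # FFT size defaults to the scale when omitted
    elif len(parts) == 3:
        scale, hop, fft_size = parts
    else:
        raise ValueError(f"expected 'scale/hop' or 'scale/hop/fft', got {spec!r}")
    return scale, hop, fft_size

def duration_ms(n_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration of n_samples in milliseconds."""
    return 1000.0 * n_samples / sample_rate_hz

EIGHT_SCALES = ["32/8", "128/32", "256/64", "512/128",
                "1024/128", "2048/256", "4096/512/8192", "8192/1024/16384"]

for spec in EIGHT_SCALES:
    scale, hop, fft_size = parse_atom_spec(spec)
    print(f"{spec:>16}: atom {duration_ms(scale):7.2f} ms, "
          f"hop {duration_ms(hop):6.2f} ms, FFT size {fft_size}")
```
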
Below, I show the time-domain residual signals resulting from both MP and CMP-1 with this multiscale dictionary.
For the most part, the CMP-1 error signal is below that of MP. Both methods appear to have the same problem with the first attack.

glock2_errors_multi.png
Let’s have a listen to the sounds.
Here is the residual due to MP;
and the residual due to CMP-1.
Again, I can’t really hear the 1 dB difference.

Anyhow, the take-home messages from all my weekend experiments appear to be these:

  1. Globally, CMP does not appear to improve a signal model enough over that of MP to warrant its significantly higher computational complexity.
  2. We need a much better strategy at localized levels, one that secures the greatest gains while avoiding as much of this additional overhead as possible.
  3. The architecture of CMP remains an attractive alternative to that of MP and OMP,
    where once an atom is selected, it remains a part of the model forever.
  4. Its cyclic application of a simple procedure also permits the application
    of more complex criteria for atom replacement, such as perceptual weightings,
    and “dark energy” (my favorite!).
  5. Thus, we must augment CMP with localized considerations;
    and I believe I can see at least a dozen variations.
  6. With which one should I begin?