I have solved the mystery that has pushed me for the past week into excruciatingly fun debugging sessions. Yes, I know I mentioned on June 9 that CMP was extremely easy to implement in MPTK. Then came second thoughts as to the behavior of the implementation. And there followed more observations, and rambling observations, and then the videos appeared. And then the music video appeared. Well, now here’s another: ex1_MP_atoms_solved.mov

In decomposing a real signal using MP and a dictionary of complex atoms, at each step we seek for the best atom in the sense that it will remove the most energy from the intermediate residual signal without taking into account any orthogonal projection after selection (otherwise MP becomes OOMP, or OLS, or greedy forward selection, etc.).

The fact that we have a real signal is no problem for our complex dictionary.

We can make any real atom by a convenient use of the complex conjugate.

Hence, we can construct a real atom \(\vd\) from a unit norm complex one \(\vg\) and its conjugate \(\vg^*\)

$$

\vd = \frac{a}{2} \left [ e^{i\phi} \vg + e^{-i\phi} \vg^* \right ].

$$

Given a real signal \(\vx\), we can adjust the amplitude \(a\) and phase \(\phi\) to construct the best real atom in the sense that

$$

\min_{a,\phi} || \vx – \frac{a}{2} \left [ e^{i\phi} \vg + e^{-i\phi} \vg^* \right ] ||_2.

$$

If we define \(\MG = [ \vg | \vg^*]\), and \(\vz^T = a/2 [e^{i\phi}, e^{-i\phi}]\),

then this becomes

$$

\min_{\vz} || \vx – \MG\vz ||_2

$$

which is a least squares problem and has the solution

$$

\vz_o = (\MG^H\MG)^{-1}\MG^H\vx

$$

as long as \(\vg\) is a complex atom, i.e., \(|\vg^H\vg^*| < 1+\epsilon\),

where \(\epsilon\) allows for the fact that machines do not have infinite precision.

Note too that \(\MG\vz_o\) is the orthogonal projection of \(\vx\) onto the span of \(\vg\) and its conjugate.

The columns of the matrix \(\MG(\MG^H\MG)^{-1}\) actually form the “dual basis”, or “biorthogonal basis”, to the columns in \(\MG\).

Since the Gramian is just a 2×2 matrix, we can invert is easily and obtain these dual vectors:

$$

\vh = \frac{\vg – \langle \vg, \vg^* \rangle \vg^* }{1 – | \langle \vg, \vg^* \rangle|^2}

$$

the other being the conjugate.

Note that though \(||\vg||_2 = 1\), the dual may not have the same norm.

What is cool about dual bases is that we can project on either, but we must reconstruct on the other.

So, considering that we have a complex unit norm atom from the dictionary,

and that we want to find the best real atom from it given a signal,

we project \(\vx\) onto the unit norm atom and its conjugate and build back with the dual:

$$

\frac{1}{\gamma} \left ( \langle \vx, \vg \rangle \left [ \vg – \langle \vg, \vg^* \rangle \vg^*\right ]

+ \langle \vx, \vg^* \rangle \left [ \vg^* – \langle \vg^*, \vg \rangle \vg\right ] \right )

= \frac{a}{2} \left [ e^{i\phi} \vg + e^{-i\phi} \vg^* \right ]

$$

where \(\gamma := 1 – | \langle \vg, \vg^* \rangle|^2\).

The problem now is to find \(a\) and \(\phi\).

If we make the following definitions:

$$

\begin{align}

x & := \langle \vx, \vg \rangle \\

r & := \langle \vg, \vg^* \rangle

\end{align}

$$

then the above becomes

$$

x\vg + x^*r^*\vg – c r\vg^* + x^*\vg^* = \frac{a\gamma}{2} \left (\cos \phi + i \sin \phi\right )\vg + \frac{a\gamma}{2} \left (\cos \phi – i \sin \phi\right )\vg^*.

$$

Now, one may think, as I first did, that we can group and separate the terms on each side multiplying \(\vg\), and then solve for \(a\) and \(\phi\).

But unless \(\vg^*\) is orthogonal to \(\vg\), that is not a good thing to do.

Instead, multiply each side by \(\vg^H\):

$$

x + x^*r^* – c r^2 + x^*r = \frac{a\gamma}{2} \left (\cos \phi + i \sin \phi\right ) + \frac{a\gamma}{2} \left (\cos \phi – i \sin \phi\right )r.

$$

since \(\vg^H\vg = 1\), and \(\vg^H\vg^* = r^*\).

Now we can group things based on real and imaginary because those components are definitely orthogonal.

We thus obtain the two equations

$$

\begin{align}

\text{Real}\{x + x^*r^* – c r^2 + x^*r\} & = \frac{a\gamma}{2} \left ( 1 + r \right ) \cos \phi \\

\text{Imag}\{x + x^*r^* – c r^2 + x^*r\} & = \frac{a\gamma}{2} \left (1 – r \right ) \sin \phi.

\end{align}

$$

Now we can solve for \(a\) and \(\phi\) quite simply.

What I noticed in the code of MPTK was that \(\gamma\) was missing from the computation of the amplitudes.

When I inserted back in, everything worked as it should!

But, as I predicted, its absence was absolutely and incredibly subtle.

For most atoms with no negligible imaginary part, \(1 – | \langle \vg, \vg^* \rangle|^2 \approx 1\).

This is especially true when decomposing audio signals because only

atoms with modulation frequencies close to zero and Nyquist will have a \(\gamma\) that is not approximated by 1.

For my length-64 Gaussian-windowed atoms with a modulation frequency 1/64 or 31/64

\(\gamma = 0.8038\).

When we go to a modulation frequency 2/64, or 30/64, this becomes \(\gamma = 0.9985\).

For length-512 atoms, atoms with a modulation frequency 1/512 or 255/512 have

\(\gamma = 0.7951\).

When we go to a modulation frequency 2/512, or 254/512, this becomes \(\gamma = 0.9982\).

And for length-16384 atoms, atoms with a modulation frequency 1/16384 or 8191/16384 have

\(\gamma = 0.7939\).

When we go to a modulation frequency 2/16384, or 8190/16384, this becomes \(\gamma = 0.9982\).

This explains why, no matter how large I scaled my atoms, I was only seeing this effect on the second and the penultimate frequency indexes.

However, the effect appears to be large.

Here are some new plots.

Below is the decomposition of the attack signal using a single scale dictionary.

Compare these with those.

Tomorrow I will run the glock examples again.

What a great way to end my time here!

Just to be more specific, what needs to change is in gabor_block_plugin.cpp, the line in the create_atom method is:

amp = 2.0 * sqrt( real*real + imag*imag );

but it should be

amp = sqrt( real*real + imag*imag ) * cstCorrel[freqIdx];