*Warning: the following is written in real time during an attempt to prove my instinct correct, but in the end proves wrong. I am humbled.*

I mentioned the other day a curious observation of OMP correcting itself in recovering sparse signals from compressive measurements.

In my experiments, I gauged the success of OMP in recovering 13-sparse signals from 50 measurements (i.e., correlations with random vectors).

My code overlooked stopping OMP at the known sparsity of the vector, and instead let OMP run either 100 iterations, or when the signal residual energy ratio is 100 dB. When I later changed the stopping criterion to 13 iterations I noticed that the recovery rate was significantly reduced.

A commenter here does not find that hard to believe, since OMP involves an orthogonal projection after each iteration.

However, I have the idea that the first 13 atoms selected by OMP must be the correct ones (w.r.t. supports, not amplitudes),

and this is echoed in the proof of the “Exact Recovery Theorem” of J. Tropp, “Greed is good: Algorithmic results for sparse approximation,” IEEE Trans. Info. Theory, vol. 50, pp. 2231-2242, Oct. 2004.

Let’s consider some measurement \(\vx\) in a Hilbert space, and say that we have found through some decomposition process two unit norm atoms, first \((\vx + \vn)/||\vx + \vn||\) where \(||\vn|| > 0\) and \(\langle \vx, \vn \rangle = 0\), and second \(\vx/||\vx||\).

The reprojection step of OMP will find the weights \(\vs\) of the least-squares projection of our signal onto the span of these two atoms, i.e.,

$$\vs = \left ( \left [

\begin{array}{c}

\frac{\vx^H}{||\vx||} \\

\frac{(\vx + \vn)^H}{||\vx + \vn||}

\end{array} \right ] \left [ \frac{\vx}{||\vx||} \; \frac{(\vx + \vn)}{||\vx + \vn||} \right ] \right )^{-1}

\left [

\begin{array}{c}

\frac{\vx^H}{||\vx||} \\

\frac{(\vx + \vn)^H}{||\vx + \vn||}

\end{array} \right ] \vx.$$

Thus, we see

$$\begin{align}

\vs & = \left [

\begin{array}{cc}

1 & \frac{||\vx||}{||\vx + \vn||} \\

\frac{||\vx||}{||\vx + \vn||} & 1

\end{array} \right ]^{-1}

\left [

\begin{array}{c}

1 \\

\frac{||\vx||}{||\vx + \vn||}

\end{array} \right ]

= \frac{1}{1 – \frac{||\vx||^2}{||\vx + \vn||^2}}

\left [

\begin{array}{cc}

1 & -\frac{||\vx||}{||\vx + \vn||} \\

-\frac{||\vx||}{||\vx + \vn||} & 1

\end{array} \right ] \left [

\begin{array}{c}

1 \\

\frac{||\vx||}{||\vx + \vn||}

\end{array} \right ] \\

& =

\frac{1}{1 – \frac{||\vx||^2}{||\vx + \vn||^2}}

\left [ \begin{array}{c}

1 – \frac{||\vx||^2}{||\vx + \vn||^2} \\

0\end{array} \right ] = \left [ \begin{array}{c}

1 \\

0\end{array} \right ]

\end{align}

$$

well holy crap!

It works!

I was thinking that this reprojection would spread the energy over the two vectors … like what happens with the least squares solution in a frame.

If we have a fat matrix of unit norm atoms \(\MH = [\vd_1| \vd_2 | \ldots | \vd_{n-1} | \vx/||\vx|| ]\)

then *a* solution to \(\vx = \MH\vs\) would be

$$\vs = \MH^H(\MH\MH^H)^{-1}\vx.$$

And in this case, the \(n\)th row of \(\vs\) would not be 1 and the rest zeros.

But above, we are not working with a frame, or even a basis for the entire space.

Of course the orthogonal projection must lie only along \(\vx/||\vx||\).

Thus, it appears entirely possible that OMP can “correct” itself until our skinny set of atoms is no longer linearly independent — which will never happen in OMP because of the orthogonal projection.

And this is perhaps also the explanation for why PrOMP performs better than OMP — it considers more than one atom as an initial starting point. As long as OMP has the 13 correct atoms in its 50+ selected atoms, we will have perfect recovery. (Offer void where prohibited, especially in cases that involve noise.)