A simple example where BP fails and OMP succeeds

On his fantastic blog, Dirk discusses a simple scenario where BP fails to recover the sparsest solution. His approach makes an overcomplete unit norm dictionary \(\MPhi\) as a matrix containing the standard basis for \(\mathbf{R}^N\) and two vectors that are nearly orthogonal to the measurements \(\vb := \mathbf{1}\), but the span of which contains it, e.g.,

\displaystyle A = \begin{bmatrix} c(1+\epsilon/2) & c(-1+\epsilon/2) & 1 & 0 & \dots & \dots &0\\ c(-1+\epsilon/2) & c(1+\epsilon/2) & 0 & 1 & \ddots & & 0\\ c\epsilon/2 & c\epsilon/2 & \vdots & \ddots& \ddots & \ddots& \vdots\\ \vdots & \vdots & \vdots & & \ddots & \ddots& 0\\ c\epsilon/2 & c\epsilon/2 & 0 & \dots& \dots& 0 & 1 \end{bmatrix}.

where
$$
c = \left [ 2 + \frac{N \epsilon^2}{4} \right ]^{-1/2} = \frac{2}{\sqrt{8+N\epsilon^2}}
$$
to make the first two columns unit norm.


Now the problem is to find the sparsest \(\vx\) such that \(\MPhi\vx = \mathbf{1}\). If \(\epsilon c < 1\), BP passes over the 2-sparse vector \(\vx^*\) because its magnitude element sum (\(\ell_1\)-norm) exceeds that of the representation in the standard basis:
$$
\|\vx^*\|_0 = 2 8/(N – 4)\).
However, where BP fails, OMP can still succeed (and even MP with a debiasing step at the end):

% from Dirk, construct a dictionary
n = 100;
epsilon = sqrt(8/(n^2-n))+0.1;
c = 1/sqrt(2+n*epsilon^2/4);
A = zeros(n,n+2);
A(1:2,1:2) = ([1 -1;-1,1]+epsilon/2)*c;
A(3:n,1:2) = epsilon/2*c;
A(1:n,3:n+2) = eye(n);
b = ones(n,1);
% from me, run OMP two steps
gamma = [];
residual = b;
counter = 1;
for ii=1:2
[~,ind] = max(abs(A'*residual));
gamma = [gamma ind(1)];
residual = b - A(:,gamma(1:counter))* ...
(pinv(A(:,gamma(1:counter)))*b);
counter = counter + 1;
end
norm(residual)
counter = counter - 1;
solOMP = zeros(size(A,2),1);
solOMP(gamma(1:counter)) = pinv(A(:,gamma(1:counter)))*b

After this, we see norm(residual) = 7.6358e-14. So, OMP, with its natural sparsity tuning ability, remains a useful tool in solving inverse problems. However, not so fast!

The epsilon in Dirk’s code makes the first two columns not particularly orthogonal to \(\vb\):

A(:,1:3)'*b
ans =
4.1343
4.1343
1.0000

With such large projections, it is no wonder OMP or MP will select those first two atoms. So, where is the breaking point for OMP?
For the first atom selection, that is going to be when
\(\epsilon^2 < 8/[N(N-1)]\).
Therefore, from this we see that the region in which OMP succeeds (in the first iteration) while BP fails for the whole solution is
$$
\frac{8}{N(N-1)} < \epsilon^2 < \frac{8}{N – 4}.
$$
Both approaches fail for smaller \(\epsilon\),
and BP succeeds for larger \(\epsilon\).
From where is that silly 8 coming?

Now, what about the second atom for OMP?
In order for OMP to choose the correct one
$$
|\left\langle \vb – \langle \vb, \vphi_1\rangle \vphi_1,\vphi_2\right\rangle| > \max_n |\left\langle \vb – \langle \vb, \vphi_1\rangle \vphi_1, \ve_n \right\rangle|
$$
where \(\vphi_2\) is the second column of \(\MPhi\), and \(\ve_n\) is the \(n\)th standard basis element.
The argument of the right hand side reduces to
$$
\left\langle \vb – \langle \vb, \vphi_1\rangle \vphi_1, \ve_n \right\rangle =
1 – \frac{Nc \epsilon}{2} \langle \vphi_1, \ve_n \rangle
$$
The argument of the left hand side becomes
$$
\begin{multline}
\langle \vb,\vphi_2\rangle – \langle \vb, \vphi_1\rangle \langle\vphi_1,\vphi_2 \rangle =
\langle \vb,\vphi_1\rangle [1 – \langle\vphi_1,\vphi_2 \rangle ] \\
= \frac{Nc \epsilon}{2} \left [1 – \frac{Nc^2\epsilon^2}{4} + 2c^2 \right ]
\end{multline}
$$
since \(\langle \vb,\vphi_2\rangle = \langle \vb,\vphi_1\rangle = Nc \epsilon/2\).
After some algebra, we see that the condition for OMP to succeed in the second atom selection is
$$
\left | \frac{Nc \epsilon}{2} \left [ 1 + \frac{8 – N\epsilon^2}{8+N\epsilon^2} \right ]\right | > \max_n \left |1 – \frac{Nc \epsilon}{2} \langle \vphi_1, \ve_n \rangle \right |
$$
or
$$
\left | \frac{Nc \epsilon}{2} \left [ 1 + \frac{8 – N\epsilon^2}{8+N\epsilon^2} \right ]\right | > \max_n \left | \frac{Nc \epsilon}{2} \left [\frac{2}{Nc \epsilon} – \langle \vphi_1, \ve_n \rangle \right ] \right |
$$
or
$$
\left | \frac{16}{8+N\epsilon^2} \right | > \max_n \left | \frac{2}{Nc \epsilon} – \langle \vphi_1, \ve_n \rangle\right |.
$$
The maximum on the right hand side is clearly for \(n = 2\), and so
$$
\frac{16}{8+N\epsilon^2} > \frac{2}{Nc \epsilon} + c(1-\epsilon/2)
$$
since the right hand side is positive when \(\epsilon 2(8+N\epsilon^2)+ 2N\epsilon^2(2-\epsilon)
$$
which becomes even uglier when substitute for \(c^2\) by squaring both sides
$$
\frac{16^2N^2 \epsilon^2 4}{8+N\epsilon^2} > \left [16 + 2N\epsilon^2(1+2N) – 2N^2\epsilon^3\right ]^2
$$
or
$$
\begin{multline}
16^2N^2 \epsilon^2 4> [ 16^2 + 4N^2\epsilon^4(1+2N)^2
+ 4N^4\epsilon^6 + 64N\epsilon^2(1+2N) – \\
64N^2\epsilon^3-8N^3\epsilon^5(1+2N) ] (8+N\epsilon^2)
\end{multline}
$$
which clearly becomes
$$
\begin{multline}
0 > 16^2 2 + 8 N^2(1+2N)(3+2N)\epsilon^4 – 2N^3(9+2N)\epsilon^5 + \\
N^3(4N^2+12N+1)\epsilon^6 + 8N33\epsilon^2 – 2N^2\epsilon^3 + \\
N^5\epsilon^8 – 2N^4\epsilon^7
\end{multline}
$$
and now we have eight solutions… There must be a better way.

OK, how about this.
Since we know for OMP to choose a correct atom in its first iteration
$$
\epsilon^2 > \frac{8}{N(N-1)}.
$$
Is this then a sufficient condition so that the second atom it selects is the other one?
The limit above means for our normalization constant
$$
c^2 < \frac{1}{2} – \frac{1}{2N} 0
$$
which is true if \(N > 1\).
So, it seems that the condition above is sufficient to say OMP selects a correct second atom.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s