Say I have a discrete signal of length \(K\), which we assume to be a perfect square.

And assume its sparsity is \(K\) in a Dirac basis.

This means the signal is not sparse at all in the time domain.

Assume as well that the sparsity of its Fourier transform is also \(K\),

which means that the signal is not sparse at all in the frequency domain.

**Question 1:**

Do such signals exist?

Of course they do! As one example, sample a white iid random process. One such sequence is below.

For a sequence of \(K\) iid uniform random variables \(x_i \in [-1, 1]\), the probability that all of its elements have magnitudes greater than some \(\epsilon > 0\) is

$$P\{| x_1 | \ge \epsilon, | x_2 | \ge \epsilon, \ldots, | x_K | \ge \epsilon\} = \left ( 1 - \epsilon \right )^K$$

which is pretty much 1 for extremely small \(\epsilon\) until \(K\) goes berserk. And since the power spectrum of a white random sequence is flat, the probability that such a sequence can be described in the frequency domain with fewer than \(K\) functions is vanishingly small. So, white noise is equally horribly decomposable in both time and frequency.
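Both claims are easy to check numerically. Below is a minimal sketch (assuming only numpy; the signal length, \(\epsilon\), and trial count are arbitrary choices for illustration): a Monte Carlo estimate of the probability above against \((1-\epsilon)^K\), and a count of nonzero coefficients of one white-noise draw in both the Dirac (time) and Fourier bases.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 64            # signal length, a perfect square
eps = 0.05
trials = 200_000

# 1) For K iid uniforms on [-1, 1], P{ all |x_i| >= eps } should be (1 - eps)^K.
x = rng.uniform(-1.0, 1.0, size=(trials, K))
empirical = np.mean(np.all(np.abs(x) >= eps, axis=1))
theory = (1.0 - eps) ** K
print(empirical, theory)   # the two should agree to a couple of decimal places

# 2) A single white-noise draw is dense in both bases: every time sample and
#    (almost surely) every DFT coefficient is nonzero.
signal = rng.uniform(-1.0, 1.0, size=K)
time_sparsity = np.count_nonzero(np.abs(signal) > 1e-12)
freq_sparsity = np.count_nonzero(np.abs(np.fft.fft(signal)) > 1e-12)
print(time_sparsity, freq_sparsity)   # both should equal K
```

With these numbers, \((1-\epsilon)^K = 0.95^{64} \approx 0.04\): already noticeably below 1, which is the "until \(K\) goes berserk" effect in action.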

**Question 2:**

Can I expect to describe such a signal with fewer than \(K\) functions drawn from the union of the Dirac and Fourier bases? In other words, does this combined dictionary help for such signals?

While I can’t find an answer to this question (in, e.g., D. L. Donoho and X. Huo, “Uncertainty principles and ideal atomic decomposition,” IEEE Trans. Inform. Theory, vol. 47, pp. 2845-2862, Nov. 2001; or in R. Gribonval, R. M. F. i Ventura, and P. Vandergheynst, “A simple test to check the optimality of a sparse signal approximation,” Signal Process., vol. 86, pp. 496-510, Mar. 2006), my instincts say that we shouldn’t expect to find a representation much sparser than \(K\). The union of these two bases creates a dictionary with the smallest possible coherence, \(1/\sqrt{K}\), and the largest possible spark, \(2\sqrt{K}\), which is attained only by \(\sqrt{K}\)-periodic pulse trains. This means we have few opportunities to collapse several elements of one basis into a few elements of the other. The best that can happen is that we reduce \(K\) complex sinusoids into a single impulse; but that is vanishingly unlikely for our white noise process.
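To see where that spark figure comes from, here is a small sketch (again assuming numpy, with \(K = 64\) as an arbitrary perfect square) of the \(\sqrt{K}\)-periodic pulse train: its DFT is again a pulse train with the same period, so the signal has two representations of \(\sqrt{K}\) atoms each, and the corresponding \(2\sqrt{K}\) dictionary columns are linearly dependent.

```python
import numpy as np

K = 64
P = int(np.sqrt(K))        # sqrt(K) = 8, the period of the pulse train

# sqrt(K)-periodic pulse train ("picket fence"): an impulse every P samples.
comb = np.zeros(K)
comb[::P] = 1.0

# Its DFT is again a pulse train with period P (each nonzero bin equals P).
spectrum = np.fft.fft(comb)
time_atoms = np.count_nonzero(np.abs(comb) > 1e-12)
freq_atoms = np.count_nonzero(np.abs(spectrum) > 1e-12)
print(time_atoms, freq_atoms)  # both sqrt(K): together, 2*sqrt(K) dependent columns
```

Since the same vector is a sum of \(\sqrt{K}\) Dirac atoms and also a sum of \(\sqrt{K}\) Fourier atoms, the difference of those two expansions is a nontrivial linear dependence among \(2\sqrt{K}\) columns of the dictionary, which is exactly the spark.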

Now, as we pump up the coherence by including more bases, or frames, the spark of the dictionary will decrease, and we will have more opportunity to benefit from the linear dependence. I believe that is when we can expect the minimum sparsity of our solution to drop below \(K\). But how much below?