# What’s in a $$||\text{Name}||_0$$?

Consider these two problem formulations:

For some $$\vx \in \mathcal{R}^K$$, some fat, full-rank $$\MD \in \mathcal{R}^{K\times N}$$, some positive error measure $$J$$, some $$\delta \ge 0$$, and some integer $$0 \le C \le N$$, solve either:

1. $$\min_\vs || \vs ||_0 \; \text{such that} \; J(\vx, \MD\vs) \le \delta$$
2. $$\min_\vs J(\vx, \MD\vs) \; \text{such that} \; || \vs ||_0 \le C$$

These two problems are the same in that they both look for a model, or representation, or decomposition, of a vector from a set of other vectors, $$\vx = \MD\vs$$; but these two problems are different in that the first focuses on reducing the model order within a maximum error, and the second focuses on reducing error with a maximum model order. With the zero-norm, we don’t care about the particular values in the solution, just that they are zero or not zero.

A more relaxed version of these problems is given by

For some $$\vx \in \mathcal{R}^K$$, some fat, full-rank $$\MD \in \mathcal{R}^{K\times N}$$, some $$\delta \ge 0$$, and some integer $$0 \le C \le N$$, do:

1. $$\text{MPTK, please run until} \; || \vx - \MD\vs ||_2 \le \delta$$
2. $$\text{MPTK, please run for C iterations, thank you.}$$
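The two stopping rules above map directly onto the two problem formulations. As a minimal sketch (not MPTK itself, just a generic NumPy illustration of the greedy matching pursuit it implements; all names here are illustrative), both criteria fit in one loop:

```python
# A minimal sketch of matching pursuit, the greedy algorithm behind MPTK,
# over a generic NumPy dictionary. Names and interface are illustrative.
import numpy as np

def matching_pursuit(x, D, delta=None, C=None, max_iter=10_000):
    """Greedily approximate x by columns (atoms) of D.

    Stops when the residual 2-norm drops to delta (problem 1),
    or after C iterations (problem 2), whichever is given.
    Returns coefficients s and residual r, with x = D @ s + r.
    """
    if delta is None and C is None:
        raise ValueError("give delta (problem 1) or C (problem 2)")
    K, N = D.shape
    # Normalize columns so inner products measure correlation directly.
    Dn = D / np.linalg.norm(D, axis=0)
    s = np.zeros(N)
    r = x.astype(float).copy()  # residual starts as the signal itself
    for it in range(max_iter):
        if delta is not None and np.linalg.norm(r) <= delta:
            break
        if C is not None and it >= C:
            break
        # Pick the atom most correlated with the residual.
        corr = Dn.T @ r
        n = np.argmax(np.abs(corr))
        # Record the coefficient w.r.t. the unnormalized atom,
        # and move that component from the residual to the model.
        s[n] += corr[n] / np.linalg.norm(D[:, n])
        r -= corr[n] * Dn[:, n]
    return s, r
```

Running it with `C=5` caps the model order (problem 2); running it with `delta` caps the error (problem 1). Either way, $$||\vs||_0$$ counts only which atoms were selected, not their values.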

In my brief career as a sparse approximationist, or overcomplete representationalist, or dictionary-based methodologist, I have used no fewer than six names to describe these two complementary problems, many of which I have seen in the literature:

• overcomplete representation
• This one attempts to emphasize that the signal representation is built from a redundant set of vectors. I abandoned it because a sparse representation is not redundant like a frame decomposition, i.e., “sparsity” and “redundancy” are contradictory.

• sparse atomic estimation
• This one attempts to emphasize that we are really trying to estimate a small number of time-frequency parameters in a signal. I abandoned it because it conflates the signal and its model too much. Finding atoms is not the same as estimating directions of arrival.

• sparse atomic decomposition
• This one attempts to emphasize that what we are really doing is decomposing a signal into a small number of time-frequency atoms. However, to me, decomposition implies recomposition, as in an inverse Fourier transform. And for audio signals I do not expect the solution to any of the above problems to give zero error. And its acronym is SAD.

• sparse atomic approximation
• This one is more on the money because what we are really doing is approximating a signal by a small number of time-frequency atoms. It is a bit specific, however: what if the dictionary contains vectors that are not time-localized?

• sparse approximation
• This one emphasizes what is important: model order. And it says what we are doing: approximation. Plus, it is an academia standard.

• dictionary-based methods
• Finally, there is this odd duck, which I have used mainly in presentations of my work to the computer music modeling community, cf. “Dictionary-based Methods for Audio Data,” tutorial presented at the Sound and Music Computing Conference, Porto, Portugal, July 2009. In audio signal processing, I am not as interested in sparsity as I am in finding an efficient and meaningful model of a signal with which I can pick apart its contents for analysis, extraction, description, etc. And the best way to do this is to find a good dictionary that can efficiently describe that for which I am looking. The flexibility is in the specification of the dictionary, whether created by modulating, translating, and scaling prototype functions, or by learning features from a training set. Whatever method is used (greedy pursuit, convex optimization, thresholding), it uses a dictionary, which may or may not be overcomplete, and which may or may not be atomic.

Finally, my blogging software recommended this picture with a content analysis of the above text:

Image by spike55151 via Flickr

Maybe it was my “odd duck” comment.