Hello and welcome to Paper of the Day (Po’D): Demixing Pursuit Edition.

Today’s interesting paper comes from a conference in 2006

devoted to blind source separation (BSS):

S. Lesage, S. Krstulovic, and R. Gribonval,

“Underdetermined source separation:

Comparison of two approaches based on sparse decompositions,”

in Proc. Int. Conf. Independent Component Analysis Blind Source Separation,

(Charleston, South Carolina), pp. 633-640, Mar. 2006.

The authors compare two approaches to BSS using sparse approximation,

both of which know the mixing matrix.

This is in contrast to Gribonval’s stereo matching pursuit (reviewed Monday)

where the mixing matrix (and number of sources)

is estimated from the decomposition parameters,

and each source is reconstructed from the relevant atoms.

The authors propose two methods that assume

near perfect knowledge of the mixing matrix (and thus the number of sources).

In the first case, “multichannel matching pursuit,”

the authors decompose all mixtures in parallel with MP and a monochannel dictionary.

After decomposition each atom is assigned to a source

based on maximizing the magnitude projection of its weights (across mixtures)

onto each row of the mixing matrix (where \(\MX = \MA\MS\)

are the mixtures, and \(\MA\) is the mixing matrix).

They also propose a variant of this by relaxing the assumption of

“one atom one source,” so that an atom can belong to several sources.

In the second case, “demixing pursuit,” a monochannel dictionary is turned into

a dictionary of “directional” multichannel atoms created with the mixing matrix in mind.

In contrast to the first approach, the mixing matrix is incorporated into the dictionary.

MP is then run on the matrix of mixtures,

and the sources are reconstructed as before.

The authors perform several tests on mixtures of three signals:

“piano,” “drums,” and “cello” (I still have no idea if the instruments are playing together,

as in a wedding band, or if they are from separate pieces, as in different cars in the Parisian metro).

They consider two different dictionaries: a complete short-term Fourier transform dictionary (monoresolution),

and an overcomplete multiresolution time-frequency dictionary.

The authors also perform “smoothing” in the sparse approximation process,

which essentially performs several decompositions with the signal slightly

shifted relative to the dictionary, and then an averaging of the decomposition parameters

(which sounds a lot like the method proposed in

P. J. Durka, D. Ircha, and K. J. Blinowska, “Stochastic time-frequency dictionaries for matching pursuit,”

IEEE Trans. Signal Process., vol. 49, pp. 507-510, Mar. 2001).

For three distortion measures (source-artifact, source-interference, and source-distortion),

the authors find that the demixing pursuit performs more competitively

with the larger dictionary as the approximation model order increases.

The authors also explored the effect of imprecise knowledge of the mixing matrix

by adjusting all directions by a small angle \([-\pi/16,\pi/16]\).

The source-interference distortion measure shows the greatest variability in this case.