A Smattering of Papers from EUSIPCO 2012, pt. 4

And now for the last installment.

“Clustering Before Training Large Datasets – Case Study: K-SVD” by C. Rusu

Save computation by preprocessing the training data to reduce its size, and then apply K-SVD to learn a dictionary.
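I have not worked out Rusu's exact procedure, but the basic idea (shrink the training set by clustering before dictionary learning) can be sketched as follows. The function, the k-means reduction, and all parameters here are my own illustration, not necessarily the paper's method:

```python
import numpy as np

def reduce_by_clustering(Y, n_clusters, n_iters=20, seed=0):
    """Replace a training set Y (dim x N) with k-means centroids
    (dim x n_clusters), via plain Lloyd iterations.

    Illustrative sketch only; the paper's preprocessing may differ.
    """
    rng = np.random.default_rng(seed)
    dim, N = Y.shape
    # initialize centroids from random training columns
    C = Y[:, rng.choice(N, n_clusters, replace=False)]
    for _ in range(n_iters):
        # squared distances from every centroid to every column: (k, N)
        dists = ((Y[:, None, :] - C[:, :, None]) ** 2).sum(axis=0)
        labels = dists.argmin(axis=0)
        for k in range(n_clusters):
            mask = labels == k
            if mask.any():
                C[:, k] = Y[:, mask].mean(axis=1)
    return C

Y = np.random.default_rng(1).normal(size=(16, 500))
Yc = reduce_by_clustering(Y, n_clusters=50)
```

The centroids `Yc` then serve as a much smaller stand-in training set for K-SVD.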

“Binarization of Consensus Partition Matrix for Ensemble Clustering” by B. Abu-Jamous, R. Fa, A. Nandi and D. Roberts

Take a dataset, cluster it in multiple ways, and then combine these results using a consensus procedure. I think that could be very useful for what I am about to do. Relevant references include: A. Weingessel, E. Dimitriadou, and K. Hornik, “An ensemble method for clustering,” in DSC 2003 Working Papers, 2003; and H. G. Ayad and M. S. Kamel, “On voting-based consensus of cluster ensembles,” Pattern Recognition, vol. 43, pp. 1943-1953, 2010.
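The usual co-association route to consensus clustering, which I assume is close to what gets binarized in the paper (the 0.5 threshold below is my guess, not the paper's rule):

```python
import numpy as np

def coassociation_consensus(labelings, threshold=0.5):
    """Build a co-association (consensus) matrix from several clusterings
    and binarize it by thresholding.

    labelings: list of length-N integer label arrays, one per clustering.
    Returns an N x N 0/1 matrix: 1 where a pair of points landed in the
    same cluster in more than `threshold` of the runs.
    """
    labelings = [np.asarray(lab) for lab in labelings]
    N = len(labelings[0])
    M = np.zeros((N, N))
    for lab in labelings:
        # add 1 for every pair co-clustered in this run
        M += (lab[:, None] == lab[None, :]).astype(float)
    M /= len(labelings)
    return (M > threshold).astype(int)

# three clusterings of four points; labels are arbitrary per run
runs = [np.array([0, 0, 1, 1]),
        np.array([1, 1, 0, 0]),
        np.array([0, 0, 0, 1])]
B = coassociation_consensus(runs)
```

Note that the consensus is invariant to how each run labels its clusters; only co-membership matters.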

“Iterated Sparse Reconstruction for Activity Estimation in Nuclear Spectroscopy” by
Y. Sepulcre and T. Trigano

This paper presents an approach for sparse decomposition: apply LARS to solve the LASSO for a given regularization parameter, then decrease the parameter and repeat. Here, the authors are interested only in estimating the mean number of arrivals per unit time. Also, I see I should read T. Zhang, “Adaptive forward-backward greedy algorithm for learning sparse representations,” IEEE Transactions on Information Theory, vol. 57, no. 7, pp. 4689-4708, 2011.
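A rough sketch of the continuation idea: solve the LASSO at one regularization value, then decrease it and warm-start the next solve from the previous solution. I use ISTA in place of LARS here purely to keep the sketch short; the paper itself uses LARS.

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_path_ista(A, y, lambdas, n_iters=200):
    """Solve min_x 0.5||y - Ax||^2 + lam ||x||_1 for a decreasing
    sequence of lam, warm-starting each solve from the last."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    sols = []
    for lam in lambdas:
        for _ in range(n_iters):
            # gradient step on the quadratic, then shrink
            x = soft(x - A.T @ (A @ x - y) / L, lam / L)
        sols.append(x.copy())
    return sols

# toy example: recover a 2-sparse vector from noiseless measurements
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 10))
x0 = np.zeros(10); x0[2] = 1.5; x0[7] = -1.0
y = A @ x0
sols = lasso_path_ista(A, y, lambdas=[1.0, 0.1, 0.01])
```

As lam shrinks, the solution moves from heavily shrunk toward the least-squares fit on the active support.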

“An Analysis Prior Based Decomposition Method for Audio Signals” by O. Akyildiz and I. Bayram

This paper proposes an approach to decomposing an audio signal into transient and tonal components. Bayram makes his code available online. Aside from using the analysis formulation, it is essentially the same as in K. Siedenburg and M. Dörfler, “Structured sparsity for audio signals,” in Proc. Int. Conf. Digital Audio Effects (2011). It is also quite close to L. Daudet, “Sparse and structured decompositions of signals with the molecular matching pursuit,” IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 5, pp. 1808-1816, Sep. 2006.
Also quite close are: B. L. Sturm, J. J. Shynk, and S. Gauglitz, “Agglomerative clustering in sparse atomic decompositions of audio signals,” in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process., (Las Vegas, NV), pp. 97-100, Apr. 2008; and B. L. Sturm, J. J. Shynk, A. McLeran, C. Roads, and L. Daudet, “A comparison of molecular approaches for generating sparse and structured multiresolution representations of audio and music signals,” in Proc. Acoustics, (Paris, France), pp. 5775-5780, June 2008.
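For reference, plain matching pursuit, on which all of these molecular and structured variants build, fits in a few lines:

```python
import numpy as np

def matching_pursuit(x, D, n_atoms):
    """Plain matching pursuit: greedily pick the dictionary atom most
    correlated with the residual and subtract its contribution.

    D: (dim, n_dict) dictionary with unit-norm columns.
    Returns coefficients a and the final residual r.
    """
    r = x.astype(float).copy()
    a = np.zeros(D.shape[1])
    for _ in range(n_atoms):
        c = D.T @ r                  # correlations with the residual
        k = np.argmax(np.abs(c))     # best-matching atom
        a[k] += c[k]
        r -= c[k] * D[:, k]
    return a, r

# toy example with an orthonormal dictionary, where MP recovers exactly
rng = np.random.default_rng(2)
D = np.linalg.qr(rng.normal(size=(32, 32)))[0]
x = 2 * D[:, 3] - D[:, 10]
a, r = matching_pursuit(x, D, n_atoms=2)
```

The molecular and structured approaches above differ in how they select and group atoms (e.g., by time-frequency neighborhoods), not in this basic greedy loop.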

“Low Complexity Approximate Cyclic Adaptive Matching Pursuit” by A. Onose and B. Dumitrescu

I think this paper presents a sparse approximation algorithm, but it seems so strongly tied to estimating a slowly-varying FIR filter that it might not generalize. The paper cites the original cyclic matching pursuit work of Christensen and Jensen (2007), but does not say how the presented algorithm differs. This is probably stated in: A. Onose and B. Dumitrescu, “Cyclic adaptive matching pursuit,” in Proc. ICASSP, Kyoto, Japan, Mar. 2012.

“Audio Source Separation Informed by Redundancy with Greedy Multiscale Decompositions” by M. Moussallam, G. Richard and L. Daudet

The paper presents the “jointly adaptive matching pursuit” to decompose audio mixtures. Is it a generalized version of: R. Gribonval, “Sparse decomposition of stereo signals with matching pursuit and application to blind separation of more than two sources from a stereo mixture,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., vol. 3, (Orlando, FL), pp. 3057-3060, May 2002?

“Adaptive Distance Normalization for Real-time Music Tracking”
by A. Arzt, G. Widmer and S. Dixon

Combining spectral and onset features, with the appropriate changes to the distance measure, significantly helps music alignment and real-time tracking.

“Assessment of Subjective Audio Quality From EEG Brain Responses Using Time-Space-Frequency Analysis” by
C. Creusere, J. Kroger, S. Siddenki, P. Davis and J. Hardin

This idea is very intriguing. Forget asking people about perceptual quality; just have them bring their brains in for testing. The experiments appear to change the quality of the audio by either lowpass filtering or scaling something, after which EEG measurements are classified. After several passes, I still can’t understand what is happening; but music by Beethoven and Blondie is involved.

“Catalog-Based Single-Channel Speech-Music Separation with the Itakura-Saito Divergence” by
C. Demir, A. T. Cemgil and M. Saraclar

The catalog consists of the jingles that interfere with speech signals on, e.g., news channels. This approach can significantly decrease the word error rate of automatic speech recognition.

