Compressed Sensing and Audio Compression

I just received the following response from an anonymous reader:

I just read your “Solved Problems in Audio Speech Processing?” blog entry and I have some comments. Compressed sensing is not the answer — it does not offer efficient compression anywhere near what we already have, as the associated bitrate is far from the rate-distortion bound. It can only serve as an intermediate step between raw samples and reconstruction using a sparse representation. The sparse representation itself, however, offers a much more compact description, and the people who think compressed sensing is the answer to anything like this haven’t understood compression at all. … As for people complaining about compression artifacts, they are forgetting that any reconstruction of their work is subject to loss, including CDs and analog ones. What mp3 and AAC offer is more efficient use of the available bandwidth, and they can be scaled to yield the same as CD quality — if you are willing to pay the price. This is not a compression technology problem.
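To make the commenter's distinction concrete, here is a minimal sketch of compressed sensing as acquisition followed by sparse reconstruction. Everything in it is an assumption made for illustration: the signal is taken to be exactly sparse in the sample domain (real audio is not), the measurement matrix is random Gaussian, and the decoder is plain iterative soft-thresholding (ISTA) applied to an \(\ell_1\)-regularized least-squares problem. The names `ista`, `lam`, and `n_iter` are hypothetical.

```python
import numpy as np

def ista(A, y, lam=0.01, n_iter=2000):
    """ISTA for min_x 0.5*||y - A x||_2^2 + lam*||x||_1 (illustrative decoder)."""
    # Step size 1/L, where L is the squared spectral norm of A.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - step * (A.T @ (A @ x - y))                        # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(0)
n, m, k = 128, 64, 5                 # signal length, measurements, nonzeros (assumed)

# A k-sparse test signal with +/-1 entries on a random support.
x_true = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x_true[support] = rng.choice([-1.0, 1.0], size=k)

# Acquisition: m < n random linear measurements.
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true

# Reconstruction: the decoder solves a convex problem to recover x.
x_hat = ista(A, y)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(f"{m} measurements, {k} nonzeros, relative error {rel_err:.3f}")
```

Note that the `m` real-valued measurements in `y` still have to be quantized and entropy coded before they amount to a bitstream, whereas the sparse representation needs only `k` positions and `k` values. That gap is the commenter's point.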

I agree that compressed sensing is much more a method of acquisition than a method of compression. Furthermore, many audio signals have non-compressible elements, e.g., whispered speech, breathiness in the flute, the snare drum, etc. Thus, I don’t see compressed sensing providing compression that is competitive, either perceptually or in bitrate, with even \(\mu\)-law quantized, Huffman-encoded audio. That is before counting the embarrassingly high computational complexity at the decoder, which has to solve a convex optimization problem at a cost on the order of \(\mathcal{O}(1000 n\log n)\), and even that assumes the signal is sparse.
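For reference, the baseline mentioned above can be sketched in a few lines. This is a minimal illustration, assuming the continuous \(\mu\)-law companding curve with \(\mu = 255\) and uniform 8-bit quantization (not the exact segmented encoding of any telephony standard), followed by a textbook Huffman code built from the observed symbol counts. The function names and the decaying test tone are hypothetical.

```python
import heapq
import math
from collections import Counter

MU = 255  # assumed companding parameter for 8-bit codes

def mu_law_encode(x, mu=MU):
    """Compand x in [-1, 1] and quantize to an integer code in [0, mu]."""
    x = max(-1.0, min(1.0, x))
    y = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    return int(round((y + 1.0) / 2.0 * mu))

def mu_law_decode(code, mu=MU):
    """Map an integer code back to an approximate sample in [-1, 1]."""
    y = 2.0 * code / mu - 1.0
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def huffman_code_lengths(symbols):
    """Return {symbol: codeword length in bits} for a Huffman code."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (count, unique tie-breaker, symbols under this node).
    heap = [(c, i, [s]) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(counts, 0)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, s1 = heapq.heappop(heap)
        c2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:          # every merge adds one bit to these symbols
            lengths[s] += 1
        heapq.heappush(heap, (c1 + c2, tie, s1 + s2))
        tie += 1
    return lengths

# Hypothetical test signal: one second of a decaying 440 Hz tone at 8 kHz.
N = 8000
signal = [0.8 * math.exp(-3.0 * t / N) * math.sin(2 * math.pi * 440 * t / N)
          for t in range(N)]

codes = [mu_law_encode(x) for x in signal]
lengths = huffman_code_lengths(codes)
avg_bits = sum(lengths[c] for c in codes) / len(codes)
print(f"average Huffman codeword length: {avg_bits:.2f} bits/sample")
```

The decoder here is a table lookup plus one exponential per sample, which is the contrast with the iterative convex solver a compressed sensing decoder needs.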


3 thoughts on “Compressed Sensing and Audio Compression”

  1. Hi,
    I don’t understand what your point is. Do you mean that sparse coding, as a part of compressed sensing, is not compression?
    I agree with you about the efficiency of mp3, but from a compressed sensing point of view that is because the signal being compressed is a structure composed of sinusoids, and mp3 uses an MDCT basis as its dictionary.
    What about the compression of a Dirac (or an artifact)? We are faced with the Gibbs phenomenon, aren’t we?
    So the choice or learning of a dictionary is still a problem common to both compressed sensing and audio compression.
    Sorry for thinking that compressed sensing is the answer to anything like this; maybe I haven’t understood compression at all.
    Regards,
    Nico


  2. Hi Nicolas,
    The point of the anonymous commenter is that compressive sensing of an acoustic signal is not audio compression, and the two should not be compared or conflated.
    Cheers.
    -Bob.


  3. Compressed sensing of an audio signal is compression, but it is not optimal compared to the slew of methods that already exist.
    But I think compressed sensing and audio present the interesting challenge of looking for something other than what the industry has generally been looking for in the past.
    When you say that sensors are cheap, I cannot help but think of a way to use these microphones for something other than just capturing voice. It seems to me that having many of these sensors would provide a way to get information beyond just one or several voices. Sure, BSS (blind source separation) would be the next step, but I would go even further: with many sensors, one could conceivably produce an acoustic map of the scene of interest, as opposed to just a compressed view of certain sources.
    Igor.

