First, we looked at madmom tempo estimates of the field recordings in our collection and found impressive agreement with the “ground truth”. Then we looked at dynamic contrasts. Now we look at quantitative features describing the “spectral dissonance” of the music recordings in our collection.
We use the “dissonance” feature in the Essentia feature extraction library. This is not a musical descriptor, but rather a perceptual description of the relationships between spectral components in a 46 ms (at 44100 Hz sampling rate) or 43 ms (at 48000 Hz) frame of the recording. This feature is a number between 0 and 1, where 0 means the spectrum is totally “consonant”, and 1 means the spectrum is totally “dissonant”. The C++ code is here.
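As a rough sketch of how such a per-frame feature could be computed with Essentia's Python bindings: the two frame durations quoted above are both consistent with a 2048-sample frame, so that frame size is assumed here. The hop size, the Hann window, the default peak-picking settings and the function names are all assumptions for illustration, not necessarily the settings behind the plots in this post.

```python
def frame_duration_ms(frame_size, sample_rate):
    """Duration of one analysis frame in milliseconds."""
    return 1000.0 * frame_size / sample_rate


def dissonance_track(path, sample_rate=44100, frame_size=2048, hop_size=1024):
    """Return one dissonance value per frame (requires Essentia).

    frame_size is inferred from the quoted frame durations; hop_size and
    the window type are assumptions made for this sketch.
    """
    # Imported lazily so the helper above works without Essentia installed.
    import essentia.standard as es

    audio = es.MonoLoader(filename=path, sampleRate=sample_rate)()
    window = es.Windowing(type="hann")
    spectrum = es.Spectrum()
    # Dissonance expects spectral peaks sorted by ascending frequency.
    peaks = es.SpectralPeaks(sampleRate=sample_rate, orderBy="frequency")
    dissonance = es.Dissonance()

    values = []
    for frame in es.FrameGenerator(audio, frameSize=frame_size, hopSize=hop_size):
        freqs, mags = peaks(spectrum(window(frame)))
        values.append(dissonance(freqs, mags))
    return values


print(round(frame_duration_ms(2048, 44100), 1))  # 46.4
print(round(frame_duration_ms(2048, 48000), 1))  # 42.7
```

Calling `dissonance_track("recording.wav")` (a hypothetical file name) would give one value per frame, which can then be plotted against time alongside the waveform.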
We start with our recording of the 1988 performance of Phase II playing “Woman is Boss”. The whole 10 minute recording is analysed below.
The grey line is the time-domain waveform, and the black line is the spectral dissonance feature. We see the spectral dissonance is pretty much distributed around 0.4. There’s a brief decrease of dissonance around 345 seconds (s), which is when someone in the crowd whoops.
Now let’s look at our recording of the 1980 performance of the Trinidad All Stars (playing “Woman on the Bass”).
Here we see a mean spectral dissonance value a bit higher than in the previous recording. There are moments where the dissonance decreases, which seem to coincide with moments when the treble drops out, leaving only the bass line, e.g., at 355 s and 463 s. That is no surprise: the fewer instruments playing in a mixture, the more “consonant” we expect the spectrum to be.
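To make that intuition concrete, here is a toy sketch using a Sethares-style parameterisation of the Plomp-Levelt roughness curve, summed over all pairs of partials. This is not Essentia's implementation: the sum is unnormalised (Essentia's feature is scaled to lie between 0 and 1, so it need not behave identically), and the partial frequencies and amplitudes are made up for illustration.

```python
import math
from itertools import combinations


def pair_roughness(f1, a1, f2, a2):
    """Roughness of two partials (Sethares-style Plomp-Levelt curve)."""
    lo, hi = min(f1, f2), max(f1, f2)
    # Scale the frequency gap by the critical bandwidth near the lower partial.
    s = 0.24 / (0.021 * lo + 19.0)
    x = s * (hi - lo)
    return a1 * a2 * (math.exp(-3.5 * x) - math.exp(-5.75 * x))


def total_roughness(partials):
    """Sum pairwise roughness over a list of (frequency_hz, amplitude) pairs."""
    return sum(
        pair_roughness(f1, a1, f2, a2)
        for (f1, a1), (f2, a2) in combinations(partials, 2)
    )


# Hypothetical partials: a bass line alone, and the same bass plus treble.
bass = [(65.4, 1.0), (130.8, 0.5), (196.2, 0.3)]
treble = [(523.3, 1.0), (554.4, 0.8), (1046.5, 0.5)]

print(total_roughness(bass) < total_roughness(bass + treble))  # True
```

Every pairwise term is non-negative, so adding partials can only raise this unnormalised sum, which matches the expectation that a sparser mixture looks more “consonant”.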
Now let’s look at our recording of the 1994 performance of Desperadoes playing “Fire Coming Down”.
Not really interesting. Those little drops in dissonance correspond to brief moments in the recording where the audio drops out.
Here are the spectral dissonance features for Phase II Pan Groove playing “More Love” at the 2013 competition:
This one is a little more interesting, but we see again that this feature coincides with regions where fewer instruments are playing.
From all of these observations, one thing is clear: this feature does not seem relevant for our context, or really informative of anything else. What’s more, the term is easily misunderstood: it does not refer to “musical dissonance”, which is why I keep calling it “spectral dissonance”.