Last time, we looked at some of the recordings in our dataset and identified several peculiarities: spoken announcements and introductions at the beginning of recordings, sometimes lasting tens of seconds; crowd noise throughout, sometimes much more prominent than the music; differences in pitch shifting between recordings; recording effects like warble. Furthermore, there are not many well-defined markers of tempo save the countoff at the beginning of a tune. I find it very hard to tap to the beat when I select random starting positions in a recording. How will our feature extraction algorithms handle these conditions in our recording collection?
Let’s look at tempo this time. We use two tempo description algorithms. One is the QMUL Tempo and Beat Tracker Vamp plugin, which gives a new tempo estimate whenever it senses a change. The other is madmom tempo, which gives a single tempo estimate for the entire piece.
We start with our recording of the 1988 performance of Phase II playing “Woman is Boss”. The whole 10 minute recording is analysed. The black line shows the tempo estimates from the QMUL Tempo and Beat Tracker Vamp plugin, and the blue line shows the estimate from madmom tempo.
Using Tempo Tap, I estimate a tempo of about 140 beats per minute (bpm). madmom says it’s 139 bpm. Let’s go with madmom. For this recording, the first 40 seconds is an announcement. From then until just about the end is the music. Still, the QMUL tempo tracker is making an octave error for the majority of the recording.
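How Tempo Tap arrives at its number is not stated here, but a tapped tempo can be computed from the tap timestamps in a few lines. The following is only a minimal sketch of one common approach (the median inter-tap interval); the function name and the example timestamps are made up for illustration.

```python
from statistics import median

def bpm_from_taps(tap_times):
    """Estimate tempo in bpm from tap timestamps in seconds (hypothetical helper)."""
    if len(tap_times) < 2:
        raise ValueError("need at least two taps")
    # Time between each pair of consecutive taps.
    intervals = [b - a for a, b in zip(tap_times, tap_times[1:])]
    # The median is less sensitive than the mean to a single early or late tap.
    return 60.0 / median(intervals)

# Made-up taps spaced 3/7 of a second apart, which corresponds to 140 bpm.
taps = [i * 3 / 7 for i in range(9)]
print(round(bpm_from_taps(taps)))
```

A mean over the intervals would also work, but it is pulled around more by one badly timed tap, which matters when the beat is as hard to find as it is in these recordings.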
Now let’s look at our recording of the 1980 performance of the Trinidad All Stars (playing “Woman on the Bass”).
Again we see octave errors in the QMUL tracker. madmom estimates a tempo of 136 bpm. I estimate it to be around 135 bpm.
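An octave error means the tracker reports roughly half or double the perceived tempo. A rough sketch of how such errors could be flagged against a reference tempo follows. The function name and the 4% tolerance are assumptions, and the 68 and 270 bpm values are illustrative, not read off the plots.

```python
def octave_relation(estimate_bpm, reference_bpm, tol=0.04):
    """Classify an estimate as 'match', 'half', 'double', or 'other'.

    tol is an assumed relative tolerance, not a standard value.
    """
    ratio = estimate_bpm / reference_bpm
    for label, target in (("match", 1.0), ("half", 0.5), ("double", 2.0)):
        # Allow the same relative slack around each target ratio.
        if abs(ratio - target) <= tol * target:
            return label
    return "other"

print(octave_relation(136, 135))  # madmom vs. my tapped tempo for this recording
print(octave_relation(68, 135))   # illustrative half-tempo estimate
print(octave_relation(270, 135))  # illustrative double-tempo estimate
```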
Now let’s look at our recording of the 1994 performance of Desperadoes playing “Fire Coming Down”.
Here I estimate 131 bpm but madmom says 136.
Here’s Phase II Pan Groove playing “More Love” at the 2013 competition:
I estimate 121 bpm. madmom says 122 bpm.
The oldest recording in our collection is from the first Panorama in 1963. It features the Pan Am North Stars playing an arrangement of “Dan Is The Man”:
Our recording is old enough that it can be auditioned at CREM. Here’s the tempo:
Nice and calm for both of them at 113 bpm, which is what I count.
From what we have seen, it seems the madmom tempo is actually a reliable estimate of the tempo. Let’s look at the entire collection of tempo estimates:
Nearly all of the tempo estimates of our 93 recordings are between 115 and 140 bpm, but there are some that are suspiciously slow or fast. The slowest is the recording of the 1982 performance of Amoco Renegades playing “Pan Explosion”:
According to my tapulations, this is more like 137 bpm (our recording has a slightly slower speed and lower pitch than the video above).
The fastest tempo estimate of madmom is for the 1985 recording of the Trinidad All Stars playing “Soucouyant.” Here is a video where they start playing at a tempo of around 140 bpm but end at a tempo of around 145 bpm.
The performance in our recording is faster! I tapstimate it starts around 147 bpm and ends around 154 bpm. So, it seems madmom is not entirely incorrect with our recording, but our recording may not accurately reflect the performance.
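If the difference comes purely from playback speed (an assumption; the video and our recording could differ in other ways too), then tempo and pitch move together: a speed factor r scales the tempo by r and shifts the pitch by 12·log2(r) semitones. A small sketch using the tapped values above, with a hypothetical function name:

```python
import math

def implied_pitch_shift(tempo_ours, tempo_reference):
    """Return (speed factor, pitch shift in semitones).

    Assumes the whole tempo difference is caused by playback speed.
    """
    factor = tempo_ours / tempo_reference
    return factor, 12 * math.log2(factor)

# (our recording, video) tempo pairs at the start and end of the tune.
for ours, video in ((147, 140), (154, 145)):
    factor, semitones = implied_pitch_shift(ours, video)
    print(f"{ours} vs {video} bpm: x{factor:.3f}, {semitones:+.2f} semitones")
```

Under that assumption, both pairs imply a speed-up of roughly 5 to 6 percent, which would put our recording somewhere between 0.8 and 1.1 semitones sharp of the video.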
Of the other seven supposedly slow performances, I find four tempo estimates that are clearly wrong (the first four below):
CNRSMH_I_2011_042_001_02 madmom: 100 bpm, me: 136 bpm
CNRSMH_I_2011_045_001_02 madmom: 102 bpm, me: 138 bpm
CNRSMH_I_2011_041_001_02 madmom: 102 bpm, me: 137 bpm
CNRSMH_I_2011_042_001_03 madmom: 105 bpm, me: 143 bpm
CNRSMH_E_2016_004_193_001_03 madmom: 105 bpm, me: 106 bpm
CNRSMH_E_2016_004_193_001_01 madmom: 114 bpm, me: 114 bpm
CNRSMH_E_2016_004_194_001_06 madmom: 111 bpm, me: 110 bpm
For the other two supposedly fast performances, I find the tempo estimates are OK:
CNRSMH_E_2016_004_193_002_05 madmom: 146 bpm, me: 148 bpm
CNRSMH_E_2016_004_193_002_02 madmom: 143 bpm, me: 143 bpm
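The comparisons in the two lists above can be tabulated with a short script. This is only a sketch: the 5% tolerance for calling an estimate acceptable is an assumption, and the function name is made up.

```python
# (recording id, madmom bpm, tapped bpm), copied from the two lists above.
ESTIMATES = [
    ("CNRSMH_I_2011_042_001_02", 100, 136),
    ("CNRSMH_I_2011_045_001_02", 102, 138),
    ("CNRSMH_I_2011_041_001_02", 102, 137),
    ("CNRSMH_I_2011_042_001_03", 105, 143),
    ("CNRSMH_E_2016_004_193_001_03", 105, 106),
    ("CNRSMH_E_2016_004_193_001_01", 114, 114),
    ("CNRSMH_E_2016_004_194_001_06", 111, 110),
    ("CNRSMH_E_2016_004_193_002_05", 146, 148),
    ("CNRSMH_E_2016_004_193_002_02", 143, 143),
]

def compare(rows, tol=0.05):
    """Return (id, madmom/tapped ratio, ok) per row.

    ok is True when madmom is within the assumed relative tolerance of the tap.
    """
    out = []
    for rec_id, madmom_bpm, tapped_bpm in rows:
        ratio = madmom_bpm / tapped_bpm
        out.append((rec_id, ratio, abs(ratio - 1.0) <= tol))
    return out

for rec_id, ratio, ok in compare(ESTIMATES):
    print(f"{rec_id:30s} ratio {ratio:.2f} {'ok' if ok else 'CHECK'}")
```

With these numbers, the four flagged rows all have ratios between about 0.73 and 0.75. That is closer to three quarters of the tapped tempo than to half, so they do not look like plain octave errors.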
What about all the ones in the middle range? Should we verify all of them? Even if we do, what conclusions can we draw about the tempo conventions, considering that our recordings may not accurately reflect the practice?