Every year since 1963 (save one), the Panorama competition in Trinidad and Tobago has brought together the country’s steel bands to compete for the title of champion. As part of the DaCaRyH project, we assembled a collection of 93 recordings featuring the top one, two or three ranked Panorama performances since 1963. We are looking at this smallish corpus, which has a duration of about 14 hours, through the lens of automated feature extraction, followed by human verification. There are several things about this collection of which we must be aware.
Here’s the 1988 performance of Phase II playing “Woman is Boss”.
The video above starts around 62 seconds into our recording. The figure below shows the first 20 seconds of the waveform (mean across stereo channels) and sonogram of our recording (scaled to -80 to 0 dB). The first 12 seconds feature the announcer talking about the group. The countdown of the tune starts around 12.5 seconds. We see the waveform has a significant DC bias. We also see that the recording is bandlimited to 0–9 kHz. And there’s a strange varying notch around 1.8 kHz. Another thing we find is that our recording is slightly higher pitched than the YouTube video, by around 20 cents.
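The quantities mentioned above (the mean across stereo channels, the DC bias, and a sonogram scaled to a fixed dB range) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the code used to make the figure: the function names, the frame length of 1024 samples, the hop of 512 samples and the Hann window are all assumptions chosen for the example.

```python
import numpy as np

def to_mono(stereo):
    """Average the channels of a (num_samples, num_channels) array."""
    return np.asarray(stereo, dtype=float).mean(axis=1)

def dc_offset(x):
    """Estimate the DC bias as the mean sample value."""
    return float(np.mean(x))

def remove_dc(x):
    """Subtract the mean so the waveform is centred on zero."""
    x = np.asarray(x, dtype=float)
    return x - np.mean(x)

def spectrogram_db(x, frame_len=1024, hop=512, floor_db=-80.0):
    """Magnitude STFT in dB relative to its peak, clipped to [floor_db, 0].

    frame_len, hop and the Hann window are assumed values for illustration.
    Returns an array of shape (frequency bins, frames).
    """
    x = np.asarray(x, dtype=float)
    window = np.hanning(frame_len)
    num_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([
        x[i * hop : i * hop + frame_len] * window
        for i in range(num_frames)
    ])
    mag = np.abs(np.fft.rfft(frames, axis=1))
    ref = max(mag.max(), 1e-12)  # guard against an all-zero signal
    db = 20.0 * np.log10(np.maximum(mag, 1e-12) / ref)
    return np.clip(db, floor_db, 0.0).T
```

Calling `dc_offset(to_mono(stereo))` gives a single number for the bias, and passing `floor_db=-60.0` reproduces the narrower range used for some of the other figures in this post.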
So, our feature extraction pipeline should consider that the beginning of a recording could have narration. Since we are looking at recordings made over 50 years, we have to consider differences in recording technology and their impacts on the feature extraction. There’s also the problem of which recording version to trust. If we are going to look at tuning of the pans, we need a trustworthy reference. A difference of 20 cents is quite large, and casts doubt on the idea that we can extract tuning conventions from these recordings.
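To put the 20 cents in perspective, here is a small sketch of the standard conversion between a frequency ratio and cents (1200 cents per octave). The function names are made up for this example.

```python
import math

def cents(freq, ref_freq):
    """Interval from ref_freq up to freq, in cents (1200 per octave)."""
    return 1200.0 * math.log2(freq / ref_freq)

def ratio_from_cents(c):
    """Frequency ratio corresponding to an interval of c cents."""
    return 2.0 ** (c / 1200.0)

# A 20 cent offset corresponds to a frequency ratio of roughly 1.0116.
shift = ratio_from_cents(20.0)
print(f"ratio for 20 cents: {shift:.4f}")
print(f"A4 = 440 Hz shifted up 20 cents: {440.0 * shift:.1f} Hz")
```

So a note at 440 Hz in one recording would sit at about 445.1 Hz in the other. If the offset comes from a difference in playback or transfer speed (an assumption, we have not verified this), then the tempo of the two recordings would differ by the same factor of about 1.2%.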
In 1980, the Trinidad All Stars won with their performance of “Woman on the Bass”. The video above shows the winning performance. Below is a portion of the waveform and sonogram (scaled to -60 to 0 dB) of our recording of it. The YouTube recording starts around 41 seconds into our recording (the countdown can be seen at the left of the sonogram). The first 41 seconds of our recording feature an announcer introducing the band.
There doesn’t seem to be any major tuning discrepancy between these two recordings, but it is clear they were made in different locations at the competition. On the sonogram you can see at 60 seconds a rising chromatic pattern (around 15 seconds into the YouTube video). That dark frequency component that follows at around 900 Hz is someone “whoooooing” in the crowd close to the microphone of our recording. In fact, the sound of the crowd is much more present in this recording than the music of the band. I don’t hear any whooooooing in the YouTube video.
So, our analysis of the extracted features should take care in discriminating the effects of the crowd and the band. The sounds of the crowd are an important part of this live music experience, but they will have an impact on extracted music features.
In 1994, the band Desperadoes won with their performance of “Fire Coming Down”, which you can see above. Below are the first 30 seconds of our recording.
We seem to have a warble in the sound, which also exists in the video recording above. Furthermore, the recordings of the second and third place performances for that year feature the same warble. So it appears the same problems can occur over all recordings of a competition. This means a chronological analysis of this dataset will have to take care in separating the effects of each year’s recording setup from that year’s representation of the music practice.
What does the best of the 93 recordings look like? Here is the 2013 Panorama Champions Phase II Pan Groove performing “More Love” at the 2013 competition:
Here is a sonogram of the conclusion of our recording of the performance.
There’s a lovely moment of contrast in the dynamics around 502 seconds, crescendoing into a percussive conclusion at around 512 seconds. The crowd screams and whistles after that point. This recording sounds professionally made, but even so this kind of music is extremely noisy, and naturally “tinny.” It will be a challenge to make sense of the output of feature extraction routines that were tested on clean studio recordings of a few well-balanced instruments.
Next, we will take a look at some of the features we extract from the signals above.