We have started a real case study on how well current off-the-shelf music information retrieval tools and techniques can address a variety of interesting questions. Sonic Visualiser is an excellent tool with which to begin: it is free and accessible, provides clear visualisations, and gives access to many state-of-the-art analysis methods through Vamp plugins. It should be the ideal “turnkey system” for anyone wishing to analyse recorded music audio.
To begin, I used Sonic Visualiser to inspect the track “The Jellicle Ball” from the musical Cats (original cast recording). A screenshot is below: the waveform (top) and its time-frequency energy distribution, i.e., the spectrogram (bottom, in colour), with some of my annotations.
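As a rough illustration of what that lower pane shows: a time-frequency energy distribution comes from a short-time Fourier transform of the waveform. Here is a minimal numpy-only sketch (the function name, window length, and hop size are my own choices for illustration, not Sonic Visualiser's internals):

```python
import numpy as np

def stft_power_db(signal, frame_len=2048, hop=512):
    """Short-time Fourier transform power in dB: the kind of
    time-frequency energy distribution a spectrogram pane displays."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return 10 * np.log10(power + 1e-12)  # dB scale; small floor avoids log(0)

# Sanity check: a pure 440 Hz tone sampled at 22050 Hz should put most of
# its energy in the frequency bin nearest 440 Hz.
sr = 22050
t = np.arange(sr) / sr
S = stft_power_db(np.sin(2 * np.pi * 440 * t))
peak_hz = S.mean(axis=0).argmax() * sr / 2048
print(peak_hz)  # close to 440
```

Each row of the returned array is one time frame, each column one frequency bin; plotting it as an image gives essentially the picture in the screenshot.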
The music in this track begins at 1:10 with the ostinato F#-(FG). The video recording takes this music at a much slower tempo than the “original cast” recording does. This part begins at 3:30 in the sound recording.
Anyhow, you can see some of the events in the markup I made. (Human) voices only appear in the first minute and a half. Then there is some development (accompanying acrobatic cat dancing), and finally the “pop” section comes to the fore, with electric bass, a drum kit, staccato piano right hand, an electric guitar, and finally (something that sounds like) wailing cats, with a ritardando and a return to material from the introduction. Many half-step key changes are clearly visible in the pop section.
The next step is to run beat tracking, chord and key detection, segmentation, pitch detection, melody extraction, etc., and analyse the results. How much human intervention will be needed to clean them up?
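Before letting a beat tracker loose on the recording, it helps to keep in mind roughly what that step does. Below is a deliberately simple, numpy-only sketch of tempo estimation from an onset-strength signal (all names and parameters are mine; the Vamp beat-tracking plugins available in Sonic Visualiser are far more sophisticated than this):

```python
import numpy as np

def estimate_tempo(signal, sr, hop=441):
    """Toy tempo estimate: frame energies, onset strength from energy
    rises, then autocorrelation to find the dominant inter-beat lag."""
    n_frames = len(signal) // hop
    energy = np.array([np.sum(signal[i * hop:(i + 1) * hop] ** 2)
                       for i in range(n_frames)])
    onset = np.maximum(np.diff(energy), 0)   # keep only rises in energy
    onset = onset - onset.mean()
    ac = np.correlate(onset, onset, mode="full")[len(onset) - 1:]
    frame_rate = sr / hop                    # 50 frames/s at sr = 22050
    lo = int(frame_rate * 60 / 240)          # shortest lag: cap at 240 BPM
    hi = int(frame_rate * 60 / 40)           # longest lag: floor at 40 BPM
    lag = lo + np.argmax(ac[lo:hi])
    return 60 * frame_rate / lag             # beats per minute

# Synthetic click track at 120 BPM: one short click every 0.5 s for 10 s.
sr = 22050
clicks = np.zeros(10 * sr)
for beat in np.arange(0, 10, 0.5):
    start = int(beat * sr)
    clicks[start:start + 441] = 1.0
tempo = estimate_tempo(clicks, sr)
print(round(tempo))  # → 120
```

Even this toy version hints at what will need cleaning on real audio: octave errors (returning half or double the tempo), sensitivity to the BPM search range, and the tempo changes this track actually contains, which a single global estimate cannot capture.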
NB: My use of a track from Cats in no way reflects my musical tastes, or lack thereof. :)