An SQL error kept me locked out of my blog for the past month, but I can now finally post the final results of my experiments with OPF. Previously, I discussed how my reproduction of the optimum path forest approach to music genre recognition does not produce results near those reported unless I train and test on the same dataset. I have now run the same experiment using a partitioning of GTZAN that takes into account its duplication of artists and its faults. (My work on GTZAN is now available on arXiv.) I predicted earlier that the classification accuracy would drop “from 74 to at least 55”. Let’s see how I did!
First we look at the classification of all 23 ms segments. The results are quite poor by every measure.
(Columns are true classes. All values are percentages: precision is in the rightmost column, F-score is in the last row, recall runs along the diagonal, and accuracy is in the bottom-right corner.)
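For readers who want to build a table with this layout, here is a minimal sketch in Python with NumPy. It is not the code used to produce the results in this post. The function name `layout_table` and the toy counts are made up for illustration, and the sketch assumes every true class has at least one segment.

```python
import numpy as np

def layout_table(counts):
    """Build a percentage table from raw confusion counts.

    counts[i, j] = number of segments of true class j labeled as class i
    (true classes are columns). Hypothetical helper, for illustration only.
    """
    counts = np.asarray(counts, dtype=float)
    n = counts.shape[0]
    tp = np.diag(counts)              # correctly labeled segments per class
    col_sums = counts.sum(axis=0)     # segments per true class
    row_sums = counts.sum(axis=1)     # segments per predicted class

    recall = tp / col_sums            # assumes no empty true class
    precision = np.divide(tp, row_sums,
                          out=np.zeros_like(tp), where=row_sums > 0)
    denom = precision + recall
    f_score = np.divide(2 * precision * recall, denom,
                        out=np.zeros_like(tp), where=denom > 0)
    accuracy = tp.sum() / counts.sum()

    table = np.zeros((n + 1, n + 1))
    table[:n, :n] = 100 * counts / col_sums   # diagonal holds the recalls
    table[:n, n] = 100 * precision            # last column
    table[n, :n] = 100 * f_score              # last row
    table[n, n] = 100 * accuracy              # bottom-right corner
    return table

# Toy two-class example (invented numbers, not results from this post).
toy_counts = [[6, 1],
              [4, 9]]
toy_table = layout_table(toy_counts)
print(np.round(toy_table, 1))
```

Normalizing each column by the number of segments in its true class is what puts the recalls on the diagonal.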
So, the performance of OPF on GTZAN has gone from the 99.8% published at ISMIR 2011 (which comes from testing on the training data), to 76% without artist and fault filtering, to 47% with artist and fault filtering, and finally to 45% once the mislabelings in GTZAN are taken into account.
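To make the idea of artist filtering concrete, here is a rough sketch of one way to partition a dataset so that no artist appears in both the training and test sets. It is not necessarily the partitioning used for the numbers above. The function name `artist_filtered_split`, the `(track_id, genre, artist)` tuple format, and the toy tracks are all assumptions for illustration, and it assumes an artist label is known for every track.

```python
import random
from collections import defaultdict

def artist_filtered_split(tracks, test_fraction=0.5, seed=0):
    """Split tracks so that no artist ends up on both sides.

    tracks: list of (track_id, genre, artist) tuples (hypothetical format).
    Whole artists are assigned to one side, genre by genre, so the
    requested test_fraction is only met approximately.
    """
    rng = random.Random(seed)
    by_genre = defaultdict(lambda: defaultdict(list))
    for track in tracks:
        _, genre, artist = track
        by_genre[genre][artist].append(track)

    assigned = {}            # artist -> "train" or "test", shared across genres
    train, test = [], []
    for genre in sorted(by_genre):
        groups = by_genre[genre]
        artists = sorted(groups)
        rng.shuffle(artists)
        target = test_fraction * sum(len(g) for g in groups.values())
        n_test = 0
        for artist in artists:
            side = assigned.get(artist)
            if side is None:
                # Fill the test set first, then send the rest to training.
                side = "test" if n_test < target else "train"
                assigned[artist] = side
            if side == "test":
                test.extend(groups[artist])
                n_test += len(groups[artist])
            else:
                train.extend(groups[artist])
    return train, test

# Toy data with invented names, only to exercise the function.
toy_tracks = [
    ("t1", "rock", "artist_a"), ("t2", "rock", "artist_a"),
    ("t3", "rock", "artist_b"), ("t4", "rock", "artist_c"),
    ("t5", "jazz", "artist_d"), ("t6", "jazz", "artist_e"),
    ("t7", "jazz", "artist_a"), ("t8", "jazz", "artist_f"),
]
toy_train, toy_test = artist_filtered_split(toy_tracks)
train_artists = {artist for _, _, artist in toy_train}
test_artists = {artist for _, _, artist in toy_test}
print(sorted(train_artists & test_artists))  # no artist on both sides
```

Because whole artists move together, a genre dominated by one or two artists can end up with a lopsided split, which is one reason an artist-filtered partition is harder to build than a random one.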