Po’D: A Survey of Evaluation in Music Genre Recognition

Hello, and welcome to the Paper of the Day (Po’D): A Survey of Evaluation in Music Genre Recognition. Today’s paper is B. L. Sturm, “A Survey of Evaluation in Music Genre Recognition”, Proc. Adaptive Multimedia Retrieval, Copenhagen, Denmark, Oct. 2012.

This paper is best summarized by a particularly riveting line of section 2.2:

The most-used publicly available dataset in music genre recognition work is that produced in [378, 379], often called “GTZAN.” This audio dataset appears in more than 23% (96) of the references [5, 11, 14, 16, 18, 27, 33, 35-40, 53, 57, 58, 84, 91, 106, 107, 109, 114, 130, 131, 136, 138, 142, 143, 163, 164, 177, 182, 191, 199, 201, 202, 204-206, 208, 209, 212-215, 217, 218, 223, 236, 237, 240, 241, 246, 270, 272, 285-290, 314, 318, 319, 322, 323, 325, 331, 336, 337, 339-341, 344, 345, 362-366, 368, 371-374, 377-379, 398, 399, 402, 404, 405, 407, 411, 416].

The numbers just sort of roll off the tongue. I think I might approach the presentation of this paper like at a humanities conference, where I read it. Aloud. With no slides. It is really only 7 pages of text, and 14 pages of references. I can skip the references.


And in the style of Harvard author name and date referencing, here is the first line of my paper:

Despite much work [Abeßer et al., 2008, 2009, 2010, 2012, Ahonen, 2010, Ahrendt et al., 2004, 2005, Ahrendt, 2006, Almoosa et al., 2010, Anan et al., 2011, Andén and Mallat, 2011, Anglade et al., 2009a,b, 2010, Annesi et al., 2007, Arabi and Lu, 2009, Arenas-Garcia et al., 2006, Ariyaratne and Zhang, 2012, Aryafar and Shokoufandeh, 2011, Aryafar et al., 2012, Aucouturier and Pachet, 2002, 2003, Aucouturier and Pampalk, 2008, Aucouturier, 2009, Avcu et al., 2007, Bagci and Erzin, 2006, Bağci and Erzin, 2007, Balkema, 2007, Balkema and van der Heijden, 2010, Barbedo and Lopes, 2007, Barbedo, 2008, Barbieri et al., 2010, Barreira et al., 2011, Basili et al., 2004, Behun, 2012, Benetos and Kotropoulos, 2008, 2010, Bergstra et al., 2006, Bergstra, 2006, Bergstra et al., 2010, Bickerstaffe and Makalic, 2003, Bigerelle and Iost, 2000, Blume et al., 2008, Brecheisen et al., 2006, Burred and Lerch, 2003, Burred, 2004, 2005, Burred and Peeters, 2009, Casey et al., 2008, Cataltepe et al., 2007, Chai and Vercoe, 2001, Chang et al., 2008, 2010, Charami et al., 2007, Charbuillet et al., 2011, Chase, 2001, Chen et al., 2006, 2008, 2009, Chen and Chen, 2009, Chen et al., 2010, Chew et al., 2005, Cilibrasi et al., 2004, Cilibrasi and Vitanyi, 2005, Cornelis et al., 2010, Correa et al., 2010, Costa et al., 2004, 2011, 2012b,a, Craft et al., 2007, Craft, 2007, Cruz-Alcázar and Vidal, 2008, Dannenberg et al., 2001, Dannenberg, 2010, DeCoro et al., 2007, Dehghani and Lovett, 2006, Dellandrea et al., 2005, Deshpande et al., 2001, Dieleman et al., 2011, Diodati and Piazza, 2000, Dixon et al., 2003, 2004, 2010, Doraisamy et al., 2008, Doraisamy and Golzari, 2010, Downie et al., 2005, Downie, 2008, Downie et al., 2010, Draman et al., 2010, 2011, Esmaili et al., 2004, Ezzaidi and Rouat, 2007, Ezzaidi et al., 2009, Fadeev et al., 2009, Fernandez et al., 2011, Fernández and Chávez, 2012, Fiebrink and Fujinaga, 2006, Flexer et al., 2005, 2006, Flexer, 2006, 2007, Flexer and Schnitzer, 2009, 2010, Frederico, 2004, Fu et al., 2010a,b, 2011a,b, García et al., 2007, Garcia-Garcia et al., 2010, García et al., 2012, Gedik and Alpkocak, 2006, Genussov and Cohen, 2010, Gjerdingen and Perrott, 2008, Golub, 2000, Golzari et al., 2008a,c,b, González et al., 2010, Goto et al., 2003, Goulart et al., 2011, 2012, Gouyon et al., 2004, Gouyon and Dixon, 2004, Gouyon, 2005, Grimaldi et al., 2003, 2006, Grosse et al., 2007, Guaus, 2009, Hamel and Eck, 2010, Han et al., 1998, Hansen et al., 2005, Harb et al., 2004, Harb and Chen, 2007, Hartmann, 2011, Heittola, 2003, Henaff et al., 2011, Herkiloglu et al., 2006, de la Higuera et al., 2005, Hillewaere et al., 2012, Holzapfel and Stylianou, 2007, 2008a,b, 2009, Homburg et al., 2005, Honingh and Bod, 2011, Hsieh et al., 2012, Hu and Ogihara, 2012, Iñesta et al., 2009, ISMIR, 2004, ISMIS, 2011, Izmirli, 2009, Jang et al., 2008, Jennings et al., 2004, Jensen et al., 2006, Jiang et al., 2002, Jin and Bie, 2006, Lu et al., 2009, Jothilakshmi and Kathiresan, 2012, Ju et al., 2010, Kaminskas and Ricci, 2012, Karkavitsas and Tsihrintzis, 2011, 2012, Karydis, 2006, Karydis et al., 2006, Kiernan, 2000, Kim and Cho, 2011, Kini et al., 2011, Kirss, 2007, Kitahara et al., 2008, Kobayakawa and Hoshi, 2011, Koerich and Poitevin, 2005, Kofod and Ortiz-Arroyo, 2008, Kosina, 2002, Kostek et al., 2011, Kotropoulos et al., 2010, Krumhansl, 2010, Kuo and Shan, 2004, Lambrou et al., 1998, Lampropoulos et al., 2005, 2010, 2012, Langlois and Marques, 2009a,b, Lee and Downie, 2004, Lee et al., 2006, 2007, 2008, 2009b,a,c, 2011, Lehn-Schioler et al., 2006, de Leon and Inesta, 2002, de León and Iñesta, 2003, 2004, de Leon and Inesta, 2007, de Leon and Martinez, 2012, Levy and Sandler, 2006, Li et al., 2003, Li and Tzanetakis, 2003, Li and Ogihara, 2004, Li and Sleep, 2005, Li and Ogihara, 2005, 2006, Li et al., 2009, 2010, Li and Chan, 2011, Lidy and Rauber, 2003, Lidy, 2003, Lidy and Rauber, 2005, Lidy, 2006, Lidy et al., 2007, Lidy and Rauber, 2008, Lidy et al., 2010b,a, Lim et al., 2011, Lin et al., 2004, Lippens et al., 2004, Liu et al., 2007, 2008, 2009a,b, Lo and Lin, 2010, Loh and Emmanuel, 2006, Lopes et al., 2010, Lukashevich et al., 2009, Lukashevich, 2012, M. et al., 2011, Mace et al., 2011, Manaris et al., 2005, 2008, 2011, Mandel et al., 2006, Manzagol et al., 2008, Markov and Matsui, 2012, Marques and Langlois, 2009, Marques et al., 2010, 2011b,a, Matityaho and Furst, 1995, Mayer et al., 2008b, Mayer and Rauber, 2010a,b, Mayer et al., 2010, Mayer and Rauber, 2011, McKay and Fujinaga, 2004, McKay, 2004, McKay and Fujinaga, 2005, 2006, 2008, McKay, 2010, McKay and Fujinaga, 2010, McKay et al., 2010, McKinney and Breebaart, 2003, Meng et al., 2005, Meng and Shawe-Taylor, 2008, Mierswa and Morik, 2005, MIREX, 2005, 2007, 2008, 2009, 2010, 2011, 2012, Mitra and Wang, 2008, Mitri et al., 2004, Moerchen et al., 2005, 2006, Nagathil et al., 2010, 2011, Nayak and Bhutani, 2011, Neubarth et al., 2011, Neumayer and Rauber, 2007, Nie et al., 2009, Nopthaisong and Hasan, 2007, Norowi et al., 2005, Novello et al., 2006, Orio, 2006, Orio et al., 2011, Pampalk et al., 2003, 2005, Pampalk, 2006, Panagakis et al., 2008, 2009a,b, 2010a,b, Panagakis and Kotropoulos, 2010, Paradzinets et al., 2009, Park, 2009a,b, 2010, Park et al., 2011, Peeters, 2007, 2011, Iñesta and Rizo, 2009, Pérez et al., 2010, Pérez-Sancho et al., 2005, Pérez et al., 2008, Perez et al., 2008, 2009, Pérez, 2009, Pohle, 2005, Pohle et al., 2006, 2008, 2009, Porter and Neuringer, 1984, Pye, 2000, Rafailidis et al., 2009, Rauber and Frühwirth, 2001, Rauber et al., 2002, Ravelli et al., 2010, Reed and Lee, 2006, 2007, Rin et al., 2010, Ren and Jang, 2011, 2012, Ribeiro et al., 2012, Rizzi et al., 2008, Rocha, 2011, Rump et al., 2010, Ruppin and Yeshurun, 2006, Salamon et al., 2012, Sanden et al., 2008, 2010, Sanden and Zhang, 2011a,b, Sanden et al., 2012, de los Santos, 2010, Scaringella and Zoia, 2005, Scaringella et al., 2006, Schierz and Budka, 2011, Schindler et al., 2012, Schindler and Rauber, 2012, Seo and Lee, 2011, Seo, 2011, Serra et al., 2011, Seyerlehner, 2010, Seyerlehner et al., 2010, 2011, Shao et al., 2004, Shen et al., 2005, 2006, 2010, Silla et al., 2006, 2007, 2008a,b, Silla and Freitas, 2009, Silla et al., 2009, 2010, Silla and Freitas, 2011, Simsekli, 2010, Soltau, 1997, Soltau et al., 1998, Song et al., 2007, Song and Zhang, 2008, Sonmez, 2005, Sordo et al., 2008, Sotiropoulos et al., 2008, Srinivasan and Kankanhalli, 2004, Sturm and Noorzad, 2012, Sturm, 2012a,b, Sundaram and Narayanan, 2007, Happi Tietche et al., 2012, Tsai and Bao, 2010, Tsatsishvili, 2011, Tsunoo et al., 2009a,b, 2011, Turnbull and Elkan, 2005, Typke et al., 2005, Tzagkarakis et al., 2006, Tzanetakis et al., 2001, Tzanetakis and Cook, 2002, Tzanetakis, 2002, Tzanetakis et al., 2003, Umapathy et al., 2005, Valdez and Guevara, 2011, Vatolkin et al., 2010, 2011, Vatolkin, 2012, Völkel et al., 2010, Wang et al., 2008, 2009, 2010, Weihs et al., 2007, Welsh et al., 1999, West and Cox, 2004, 2005, West and Lamere, 2007, West, 2008, Whitman and Smaragdis, 2002, Wiggins, 2009, Wu et al., 2011, Wülfing and Riedmiller, 2012, Xu et al., 2003, Yang et al., 2011a,b, Yao et al., 2010, Yaslan and Cataltepe, 2006a,b, 2009, Yeh and Yang, 2012, Ying et al., 2012, Yoon et al., 2005, Zanoni et al., 2012, Zeng et al., 2009, Zhang and Zhou, 2003, Zhang et al., 2008, Zhen and Xu, 2010a,b, Zhou et al., 2012, Zhu et al., 2004], music genre recognition (MGR) remains a compelling problem to solve by a machine.
