Previously, I wrote about the acceptance of one of my ISMIR 2016 papers. The paper has now undergone revision guided by some very helpful peer-reviews. (Thank you Anonymous!) It’s got a shiny new title, two figures, one table, 10 footnotes, and 36 references.
B. L. Sturm, “Revisiting Priorities: Improving MIR Evaluation Practices”, in Proc. ISMIR 2016.
While there is a consensus that evaluation practices in music informatics (MIR) must be improved, there is no consensus about what should be prioritised in order to do so. Priorities include: 1) improving data; 2) improving figures of merit; 3) employing formal statistical testing; 4) employing cross-validation; and/or 5) implementing transparent, central and immediate evaluation. In this position paper, I argue that these priorities treat only the symptoms of the problem and not its cause: MIR lacks a formal evaluation framework relevant to its aims. I argue that the principal priority is to adapt and integrate the formal design of experiments (DOE) into the MIR research pipeline. Since the aim of DOE is to help one produce the most reliable evidence at the least cost, it stands to reason that DOE will make a significant contribution to MIR. Accomplishing this, however, will not be easy, and will require far more effort than is currently being devoted to it.
Email me if you would like a copy.