Last week at the conclusion of ICML 2015, I attended the day-long workshop Machine Learning for Music Discovery. The day started with a series of invited talks, followed by lunch, and continued with accepted presentations. It concluded with a great “happy hour”, with open tabs thanks to Pandora!
The first presenter was Brian McFee, who spoke on “The Role of Structure Analysis in Music Discovery”. He demonstrated his excellent interactive song structure visualisation tools, seymour and his Laplacian structure decomposition visualisation. I particularly liked his story about building such tools to study jazz music.
Philippe Hamel was next, and spoke about music recommendation for international users. The thought had not occurred to me before, but it is a big problem when your customer base is global yet the majority of your users come from only a few countries. How can one hope to build a recommendation engine that is useful everywhere?
Then Arthur Flexer made the case for being aware of “hubs” and ways of avoiding them. A hub is a data point that appears as a close neighbor of many other points, not because of genuine similarity but simply because the points live in a high-dimensional space. This is rather annoying in a recommendation system, because such points can be returned for a majority of queries. Flexer showed how the impact of hubs, which are a problem not tied to any particular dataset but to high-dimensional spaces in general, can be reduced by adaptively rescaling distances in the space.
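One family of rescaling methods in this spirit is “mutual proximity”: two points count as close only if each ranks the other as close, which suppresses points that are near everything. A minimal sketch, assuming a precomputed symmetric distance matrix (the function name and the empirical-rank variant are my own choices, not necessarily Flexer's exact method):

```python
import numpy as np

def mutual_proximity(D):
    """Rescale a symmetric distance matrix D so that i and j are
    considered close only if j is among i's nearest points AND
    i is among j's nearest points (empirical-distribution sketch)."""
    n = D.shape[0]
    MP = np.zeros_like(D, dtype=float)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # fraction of points farther from i than j is, and vice versa
            p_i = np.mean(D[i] > D[i, j])
            p_j = np.mean(D[j] > D[i, j])
            # small value = mutually close; hubs get pushed away
            MP[i, j] = 1.0 - p_i * p_j
    return MP
```

The rescaled matrix is symmetric and bounded in [0, 1], so it can be dropped into any nearest-neighbor retrieval step in place of the raw distances.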
Sander Dieleman then presented his work on using convolutional deep neural nets for music recommendation — a result of his internship at Spotify last year. A very interesting part of his work is in trying to determine what the filters in the various layers have learned. He has assembled playlists of test samples that produce particularly high activations in the learned filters, and then tries to infer what they have in common. (There has to be a better way to do this, since it is rather subjective. For instance, “filter 3” is highly activated by “Christian rock.”) Anyhow, Sander appears to be one of the few who are applying deep learning at scale for music.
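The playlist-assembly step can be sketched simply: given per-track activations for each learned filter, collect the tracks that activate each filter most strongly. A hypothetical helper (the function name, array shape, and interface are my assumptions, not Sander's code):

```python
import numpy as np

def top_examples_per_filter(activations, k=5):
    """Given per-track filter activations, shape (n_tracks, n_filters),
    return for each filter the indices of the k tracks that activate
    it most strongly, shape (n_filters, k)."""
    order = np.argsort(-activations, axis=0)  # descending per filter column
    return order[:k].T
```

Each row of the result is the “playlist” for one filter; listening to those tracks side by side is what suggests labels like “Christian rock” for a given filter.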
Then I spoke. I presented three case studies: 1) deep learning features may or may not be useful for MIR; 2) solving dataset classification problems may not be useful or relevant; 3) Potemkin villages and horses. My collaborators and I are trying to explain and resolve these issues using the formal design and analysis of experiments. Much more to come!
The last invited talk was by Geoffroy Peeters. He reviewed several of his past and current projects and made the case that feature design is still state of the art. He reminded me that I need to look at iVectors… which are not by Apple. And universal background modelling.
The accepted talks helped round out the topics of the day. Matthew Prockup gave a convincing and visually pleasing presentation about rhythm modeling. (His papers will appear at WASPAA 2015 and ISMIR 2015.) Dawen Liang gave a talk on combining tags and collaborative filtering for music recommendation. Thomas Wilmering discussed “legal bootlegs”, and extracting features from large live music recording archives. Cédric Mesnage brought Twitter to the fore as a creative way to push recommendations to the tail. Keunwoo Choi gave a straightforward talk about playlists, and the problems with evaluating them. Finally, Roderick Murray-Smith finished the day by demonstrating a visually stunning application for music browsing.
All in all, I think the event offered a great experience in a nice place, with excellent organisation. The opportunity to be social afterwards at L’Alchemist was a good touch. I thank the organisers for the invitation to present, and congratulate them on its success.