Weak contracts and machine learning: A presentation by Léon Bottou

These ICML 2015 slides by Léon Bottou, Two high stakes challenges in machine learning, make several great points. The first is that the train/test paradigm in machine learning/artificial intelligence produces systems with a weak “contract”. An example Bottou gives is an object recognition system that is advertised with some accuracy. If one submits data to that system that differs from the test set distribution, nonsense will result, and the system no longer works as advertised. Compare this with a sorting algorithm, which can sort any numerical data no matter its composition. The sorting algorithm thus does not have a weak contract. Answering “more data” to improve the performance of a system with a weak contract is empty, given the bias that seems to necessarily result.
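To make the contrast concrete, here is a minimal sketch in Python. It is my own toy illustration, not something from Bottou's slides: the one-dimensional threshold classifier, the Gaussian data, and the names `train_threshold`, `accuracy`, `draw`, and `shift` are all assumptions chosen for brevity. The classifier is trained and tested on one distribution, then fed data from a shifted one, while `sorted` is given the same shifted data.

```python
import random


def draw(rng, n, shift=0.0):
    """Draw n labelled points: class 0 centred at shift, class 1 at 4 + shift."""
    samples = []
    for _ in range(n):
        label = rng.randrange(2)
        samples.append((rng.gauss(4.0 * label + shift, 1.0), label))
    return samples


def train_threshold(samples):
    """Fit a toy classifier: the midpoint between the two class means."""
    class0 = [x for x, label in samples if label == 0]
    class1 = [x for x, label in samples if label == 1]
    return (sum(class0) / len(class0) + sum(class1) / len(class1)) / 2


def accuracy(threshold, samples):
    """Fraction of points whose side of the threshold matches their label."""
    hits = sum((x > threshold) == bool(label) for x, label in samples)
    return hits / len(samples)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
threshold = train_threshold(draw(rng, 2000))

# Same distribution as training: the advertised accuracy holds.
acc_same = accuracy(threshold, draw(rng, 2000))

# Shifted distribution: same labels, same classifier, different inputs.
shifted = draw(rng, 2000, shift=4.0)
acc_shifted = accuracy(threshold, shifted)

# Sorting makes no assumption about where its input comes from.
values = sorted(x for x, _ in shifted)
is_sorted = all(a <= b for a, b in zip(values, values[1:]))

print("accuracy, same distribution:   ", acc_same)
print("accuracy, shifted distribution:", acc_shifted)
print("shifted data sorted correctly: ", is_sorted)
```

In this toy setup the classifier's accuracy on shifted data should fall to roughly chance, because nearly every class 0 point now lands on the wrong side of the learned threshold. The sort is still correct. The accuracy figure only ever described behaviour under the distribution it was measured on.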

The second point Bottou makes is that machine learning/artificial intelligence is all three: exact science, experimental science, and engineering. It is necessary that it is all three; however, trouble can arise when the “genres” are mixed, for instance, when one claims that a system with some estimated test error proves something of an exact nature. Third, Bottou points out that the experimental science of machine learning/artificial intelligence has been “dominated” by the train/test experimental paradigm … and this is challenging “the speed of our scientific progress.”

Bottou motivates increasing the ambitions of machine learning/artificial intelligence from building systems with weak contracts (reproducing X amount of ground truth of a dataset), to building systems that learn concepts: “In fact, a system that recognizes a “concept” fulfils a stronger contract than a classifier that works well under a certain distribution.” Bottou also recognizes that such an increased ambition necessarily leads to evaluation that is not as convenient as comparing labels to ground truth.

Bottou’s presentation encompasses much of what I am saying in machine music listening. I think we all want systems to learn concepts. Measuring the amount of ground truth reproduced by a system is no relevant measure of that. The train/test paradigm must be replaced.

Composition is not research

This is hilarious — and well-argued at the same time.

“If Einstein had not existed, someone else would have come up with Relativity. If Beethoven had not existed, nobody would have written the Ninth Symphony. … Einstein corrects and supersedes Newton; Schoenberg does not correct and supersede Bach. One can understand Gauss’s flux theorem perfectly well never having read a word of Gauss; one cannot understand Debussy’s music without ever hearing a note of it. … The imagination needed for scientific and other research, and the occasional sense in art that there is something waiting to be discovered, should not blind us to this crucial difference.”

J. Croft, “Composition is not research”, Tempo 69(272), pp. 6-11, Apr. 2015.

Clever Hans, Clever Algorithms: Are your machine learnings learning what you think?

I had a nice time delivering a talk at the London Big-O Meet Up the other day, which can be seen at the Skills Matter website. The discussion afterward gave me some great perspectives from people in industry, such as Beautiful Destinations, Facebook, and a variety of data science start-ups… not to mention the fascinating world of Kaggle competitions. This meet up is going in my calendar!

My slides are here.

PhD Studentship in Intelligent Machine Music Listening

Please have a look here: http://www.eecs.qmul.ac.uk/phd/research-topics/funded#phd-studentship-in-intelligent-machine-music-listening.

Applications are invited for a fully-funded PhD studentship, to seek ways to exploit novel and holistic approaches to evaluation for building machine music listening systems (and constituent parts). A major emphasis will be on answering “how” systems work and “what” they have learned to do, in relation to the success criteria of real-world use cases. The research will involve working at the intersection of digital signal processing, machine learning, and the design and analysis of experiments.

All nationalities are eligible to apply for this studentship, which will start in Autumn 2015. The studentship is for three years, and covers student fees as well as a tax-free stipend of £15,863 per annum.

Candidates must have a first-class honours degree or equivalent, or a good MSc degree in Computer Science, Electronic Engineering, or Mathematics. Candidates should be confident in digital signal processing or machine learning, and have programming experience in, e.g., R, MATLAB, or Python. Experience in research and a track record of publications are very advantageous. Formal music training is also advantageous.

The PhD supervisors will be Dr. Bob L. Sturm (Machine Listening) and Dr. Hugo Maruri-Aguilar (Statistics). Please see http://www.eecs.qmul.ac.uk/~sturm for background. The project will be based in the School of EECS, and the student will become a member of the interdisciplinary Centre for Digital Music. Informal enquiries can be made by email to Dr. Sturm (b.sturm@qmul.ac.uk).

To apply, please follow the on-line process (http://www.qmul.ac.uk/postgraduate/apply) by selecting ‘Electronic Engineering’ in the ‘A-Z list of research opportunities’ and following the instructions on the right-hand side of the web page.

Please note that instead of the ‘Research Proposal’ we request a ‘Statement of Research Interests’. Your statement should answer two questions: (i) Why are you interested in the topic described above? (ii) What relevant experience do you have? Your statement should be brief: no more than 500 words or one side of A4 paper. In addition we would also like you to send a sample of your written work. This might be a chapter of your final year dissertation, or a published conference or journal paper. More details can be found at: http://www.eecs.qmul.ac.uk/phd/apply.php

The closing date for the applications is 1/05/15.

Interviews are expected to take place during the week of 15/06/15.