Going to use the Nottingham Music Database?

The “Nottingham Music Database” (NMD) has been appearing more and more in applied machine learning research and teaching over the past few years. It’s been used in tutorials on machine learning, and even educational books on deep learning projects. It’s been fun to generate music with computers for a very long time.

The music generation start-up company Jukedeck put some effort into cleaning an ABC-converted version of the database, offering it on GitHub. Most recently, NMD appears in this submission to ICLR 2019: HAPPIER: Hierarchical Polyphonic Music Generative RNN. Seeing how that paper uses NMD, and the conclusions it draws from the music generated by the models it creates, I am motivated to look more closely at the NMD, and to propose some guidelines for using it in machine learning research.

Here is the source page of the “Nottingham Folk Music Database” by Eric Foxley, which “contains about 1200 folk melodies, mostly British & American. They mostly come from the repertoire over the years of Fred Folks Ceilidh Band, and are intended as music for dancing.” It is a very personal collection, as Foxley describes: “Most tunes have been collected over a lifetime of playing (which started when I sat in at the back of many bands in the London area and elsewhere from the age of 12 onwards), and the sources from whom I learnt the tunes are acknowledged. These are all collected “by ear”, and details change over time. The arrangements, harmonies, simplifications are entirely mine. Where there is a known printed source, that is included. I apologise for any unknowing omissions of sources, and would be happy to add them.” Based on the date of Foxley’s website, this collection seems to have been assembled before 2001.

Foxley provides a description of the contents here:

  • “Jigs. This directory contains about 350 6/8 single (mostly “crochet-quaver” per half bar) and double jigs (mostly quavers).
  • Reels. 2/4 and 4/4. This includes about 460 marches, polkas, rants etc.
  • Hornpipes. These are played (but not written) dotted. We include about 70 hornpipes, schottisches and strathspeys. See the “Playing for Dancing” document for the distinction.
  • Waltzes. About 50 tunes with 3/4 time signature.
  • Slip jigs. These are jigs in 9/8 time.
  • Miscellaneous. This directory contains just a few tunes, which we play mainly for listening to, when dancers need a breather.
  • Morris. Just a sample few, about 30. They include some chosen for listening to, and some from the Foresters Morris Men’s repertoire.
  • Some Christmas ones (15).
  • About 45 tunes from the Ashover collection, provided by Mick Peat.”

Not listed there, but included in The Tunes, are tunes taken from Playford’s 1651 book, The Dancing Master.

Foxley provides a note on the distribution of the database: “We are happy for others to use tunes from our repertoire; after all, the tunes [we] use were picked up from others, and the traditional tunes are best! We just hope that you play them properly and carefully, not as streams of notes but as phrased music making folks want to dance.”

Foxley also provides a warning: “The melodies as stored are my interpretation of the essence of the tune. Obviously no respectable folk musician actually plays anything remotely like what is written; it is the ornamentation and variation that gives the tune its lilt and style.”

Foxley appears to have assembled his collection for a few different purposes: 1) a collection for his group’s own music practice playing for dances and other events (see this page of tunes for specific weddings); 2) as material for researching music analysis and search and retrieval by computers.

NMD is thus a personal collection of an English folk music enthusiast and computer scientist with decades of experience in playing and dancing to this kind of music. Much of the collection is focused on dance music (jigs, reels, hornpipes, waltzes, slip jigs, Morris, Playford’s), but some of it is specialised (Miscellaneous, Christmas). A small portion of the collection comes from another person (Mick Peat). While it is an extensive collection for a single person, it is not extensive for a tradition (compare to the Morris music collection at The Morris Ring). What Foxley says should be emphasised: NMD is his collection of rough transcriptions of tunes that should never be performed as written, but when performed well should make “folks want to dance”.

Here are the first three guidelines for using the NMD:

1. Do not believe that when you train a model on sequences from the NMD, your model is learning about music. Your trained model may show a good fit to held-out sequences in NMD. Do not believe that this means it has learned about the music represented by the NMD. Your model is learning about sequences in the NMD. Those sequences are not music, but impoverished, coarse and arbitrary representations of what one experiences when this particular kind of music is performed. Also, the music represented in NMD is not “polyphonic”. Each sequence of NMD provides a sketch of the melody (which all melody instruments play), and harmonic accompaniment (which is not always present).

2. If you are working with a generative model, your trained model may produce sequences that appear to you like the sequences in NMD. Do not convert those sequences to MIDI and then listen to an artificial performance of them to judge their success. Do not submit those synthetic examples together with synthetic examples of tunes from NMD to a listening test and ask people to rate how pleasant each is. Do not assume that someone with a high degree of musical training knows about the kind of music represented in the NMD.

3. Find an expert in the kind of music represented in the NMD and work with them to determine the success of your model. That means you should submit sequences generated by your model trained on NMD to these experts so that they can evaluate them according to performability and danceability.

Let’s have a look at a real example from NMD. I choose one at random among those I have experience playing. Here’s Foxley’s transcription of “Princess Royal” from what he says is the Abingdon Morris tradition:

title = "\f3Princess Royal\fP";
ctitle = "AABCBCB";
rtitle = "\f2Abingdon\fP";
timesig = 4 4;
key = g;
autobeam = 2;
bars = 33.

d^<'A' c^< |
b"G" a"D" g"G" d^< c^< |
b"G" a"D" g"G" g^ |
e^."C" d^< c^ e^ |
d^."G" c^< b d^ |
c^ "Am" b "g" a "f+" g "e" |
f<"D7" g< a< "c+" f< d "b" d^< "a" c^< |
b<"G" a< b< g< a"D7" f | g>"G" g :| \endstave.

e^.'B'"C" e^< e^ d^ | e^"C" f^"d" g^>"e" |
g^"C/e" f^"d" e^"c" d^"b" |
b<"G/d" a< g< b< a >"D7" |
g"G" g a."D7" a< |
b<"G" a< g g^. f^< | g^"G" d^ e^>"C" |
d^"G" b c^>"C" | \endstave.
\5,8 |! \continue.

d^ 'C' c^ |
b>"G" a>"D" |
g>"Em" d^"D7" c^ |
\-2 |
g>"Em" g^> | \endstave.
e^>."C" d^ |
c^>"C" e^> |
d^>."G" c^ |
\timesig = 2 4. b."G" d^< |
\timesig = 4 4. c^"Am" b "g" a "f+" g "e"|
\6,8 |! \endstave.


Here’s the ABC conversion from the Sourceforge NMD:

X: 20
T:Princess Royal
% Nottingham Music Database
d/2c/2|"G"B"D"A "G"Gd/2c/2|"G"B"D"A "G"Gg|"C"e3/2d/2 ce|"G"d3/2c/2 Bd|
"Am"c"g"B "f#"A"e"G|"D7"F/2G/2"c#"A/2F/2 "b"D"a"d/2c/2|\
"G"B/2A/2B/2G/2 "D7"AF|"G"G2 G:|
"C"e3/2e/2 ed|"C"e"d"f "e"g2|"C/e"g"d"f "c"e"b"d|"G/d"B/2A/2G/2B/2 "D7"A2|\
"G"GG "D7"A3/2A/2|"G"B/2A/2G g3/2f/2|
"G"gd "C"e2|"G"dB "C"c2|"Am"c"g"B "f#"A"e"G|\
"D7"F/2G/2"c#"A/2F/2 "b"D"a"d/2c/2|"G"B/2A/2B/2G/2 "D7"AF|"G"G2 G||
dc |"G"B2 "D"A2|"Em"G2 "D7"dc|"G"B2 "D"A2|"Em"G2 g2|"C"e3d|"C"c2 e2|"G"d3c|
"Am"c"g"B "f#"A"e"G|"D7"F/2G/2"c#"A/2F/2 "b"D"a"d/2c/2|\
"G"B/2A/2B/2G/2 "D7"AF|"G"G2 G||

Here’s the ABC from the Jukedeck NMD cleaned collection:

X: 20
T:Princess Royal
% Nottingham Music Database
d/2c/2|"G"B"D"A "G"Gd/2c/2|"G"B"D"A "G"Gg|"C"e3/2d/2 ce|"G"d3/2c/2 Bd|
"Am"cB AG|"D7"F/2G/2A/2F/2 Dd/2c/2|\
"G"B/2A/2B/2G/2 "D7"AF|"G"G2 G:|
"C"e3/2e/2 ed|"C"ef g2|"C/e"gf ed|"G/d"B/2A/2G/2B/2 "D7"A2|\
"G"GG "D7"A3/2A/2|"G"B/2A/2G g3/2f/2|
"G"gd "C"e2|"G"dB "C"c2|"Am"cB AG|\
"D7"F/2G/2A/2F/2 Dd/2c/2|"G"B/2A/2B/2G/2 "D7"AF|"G"G4||
zz dc |"G"B2 "D"A2|"Em"G2 "D7"dc|"G"B2 "D"A2|"Em"G2 g2|"C"e3d|"C"c2 e2|"G"d3c|
"Am"cB AG|"D7"F/2G/2A/2F/2 Dd/2c/2|\
"G"B/2A/2B/2G/2 "D7"AF|"G"G4||

There’s something unusual in the Jukedeck processing. First, there is an F section that does not appear in the others, but just acts to balance the 3-beat bar before. Second, many of the bass notes (specified by a lower case letter) have been stripped out. Anyhow, by and large Foxley’s version and the Sourceforge NMD appear the same.
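One way to see the second difference for yourself is to pull the quoted annotations out of each ABC string and compare them. Here is a minimal Python sketch, assuming (as in the tune above) that chord symbols start with an upper case letter and bass notes with a lower case one. The function names are mine and are not part of any NMD tooling.

```python
import re

def annotations(abc):
    """Return the text of every double-quoted annotation in an ABC string."""
    return re.findall(r'"([^"]*)"', abc)

def split_annotations(abc):
    """Split annotations into (chords, bass notes) by the case of the first letter."""
    found = annotations(abc)
    chords = [a for a in found if a[:1].isupper()]
    bass = [a for a in found if a[:1].islower()]
    return chords, bass

def strip_annotations(abc):
    """Remove all quoted annotations, leaving only the melody text."""
    return re.sub(r'"[^"]*"', '', abc)

# The same two bars of the A part, copied from the two versions above.
sourceforge = '"Am"c"g"B "f#"A"e"G|"D7"F/2G/2"c#"A/2F/2 "b"D"a"d/2c/2|'
jukedeck = '"Am"cB AG|"D7"F/2G/2A/2F/2 Dd/2c/2|'

print(split_annotations(sourceforge))  # (['Am', 'D7'], ['g', 'f#', 'e', 'c#', 'b', 'a'])
print(split_annotations(jukedeck))     # (['Am', 'D7'], [])
print(strip_annotations(sourceforge) == strip_annotations(jukedeck))  # True
```

For these two bars the melody text is identical once the annotations are removed; only the bass notes have gone missing.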

Let’s get a feeling for how this sequence becomes music, and how that functions together with a dancer. Below is the staff notation of the Abingdon version of Princess Royal (Foxley’s PDF resulting from his transcription) along with a video of a performance.

[Figure: staff notation of the Abingdon version of “Princess Royal”, from Foxley’s PDF]

There are several important things to notice here. 1) The written and performed melodies deviate in many places, just as Foxley says they should; 2) The accompanying harmony here is sometimes not what is notated; 3) The musician closely follows the dancer, allowing enough time for them to complete the steps (hops and such).

When it comes to the notated version of the sequence, look at how the parts are structured and how they relate to one another. In the A part, bars 5-8 relate to bars 1-4. Patterns in bars 3 and 4 mimic those in bars 2 and 3. The B part contrasts with A, but its conclusion echoes that of A. The first 7 bars of part C are the first four bars of part A with doubled note lengths; and its last four bars are the last four bars of part A. There’s a lot of structure there! And these kinds of structures and patterns exist throughout the sequences in NMD.
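The relationship between parts A and C can be checked mechanically. The sketch below is a rough Python illustration, not a full ABC parser: it assumes the excerpt has no accidentals, octave marks, ties or tuplets (true for these bars of the Sourceforge version), and the helper name `notes` is my own.

```python
import re
from fractions import Fraction

# A pitch letter, then an optional ABC length such as "2", "/2" or "3/2".
NOTE = re.compile(r'([A-Ga-gz])(\d*)(/?)(\d*)')

def notes(abc):
    """Return (pitch, length) pairs, ignoring chord annotations and bar lines."""
    melody = re.sub(r'"[^"]*"', '', abc)  # drop quoted chord and bass annotations
    result = []
    for pitch, num, slash, den in NOTE.findall(melody):
        # In ABC a bare "/" means "/2"; with no slash the denominator is 1.
        length = Fraction(int(num or 1), int(den or 2) if slash else 1)
        result.append((pitch, length))
    return result

# Pickup and first four bars of part A, and pickup and first seven bars of
# part C, copied from the Sourceforge ABC above.
a_part = 'd/2c/2|"G"B"D"A "G"Gd/2c/2|"G"B"D"A "G"Gg|"C"e3/2d/2 ce|"G"d3/2c/2 Bd|'
c_part = 'dc |"G"B2 "D"A2|"Em"G2 "D7"dc|"G"B2 "D"A2|"Em"G2 g2|"C"e3d|"C"c2 e2|"G"d3c|'

doubled_a = [(pitch, 2 * length) for pitch, length in notes(a_part)]
c_notes = notes(c_part)

# The C excerpt should match the start of part A with every length doubled.
print(doubled_a[:len(c_notes)] == c_notes)  # True
```

So, in the Sourceforge ABC, the pickup and first seven bars of part C are the opening of part A at double length. A check like this could be run on generated sequences too, though it is no substitute for guideline 3.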

Here are some more guidelines.

4. Check whether the sequences generated by your model trained on the NMD exhibit the same kinds of structures and patterns as the sequences in the NMD. Are there similar kinds of repetitions and variations? How do the sections relate together? If you don’t see any of these kinds of things, your model is not working. If you don’t know what to look for, see guideline 3.

5. Do not train your sequence model on a MIDI conversion of the NMD. They are not the same. (The MIDI file created by Jukedeck from the tune above also has the wrong structure: AAAABCBCB instead of AABCBCB. Other MIDI files there are sure to have similar problems.) Training on MIDI conversions of the NMD will also add a lot more complexity to your model, and make training less effective. The ABC notation makes sequences that are quite terse, so why not take advantage of that?

Now let’s have a look at one of the examples generated by the HAPPIER model:

[Figure: staff notation of one of the examples generated by the HAPPIER model]

The very first event shows something is very wrong. Overall, the chord progression makes no sense, the melody is very strange, and the two do not relate. There is none of the repetition and variation we would expect given the NMD. None of the four examples presented in the HAPPIER paper look anything like music from the NMD. There is some stepwise motion, so the HAPPIER model has that going for it; but it is clearly not working as claimed.

The HAPPIER paper claims the new model “generates polyphonic music with long-term dependencies compared to the state-of-the-art methods.” The paper says the HAPPIER models “perform better for melody track generation than the LSTM Baseline in the prediction setting” because their negative log likelihoods on sequences from NMD are lower. The paper also claims that the HAPPIER model “performs better in listening tests compared to the state-of-the-art methods”, and that “the generated samples from HAPPIER can be hardly distinguished from samples from the Nottingham dataset.” None of these claims are supported by the evidence.
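For readers less familiar with the metric, here is a minimal sketch of what a negative log likelihood comparison measures. The probabilities are invented for illustration and do not come from the HAPPIER paper or any real model.

```python
import math

def mean_nll(token_probs):
    """Average negative log likelihood (in nats) of the observed tokens."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

# Hypothetical probabilities that two models assign to each successive
# token of the same held-out sequence.
model_a = [0.50, 0.25, 0.40, 0.10]
model_b = [0.60, 0.30, 0.45, 0.20]

print(mean_nll(model_b) < mean_nll(model_a))  # True: model_b fits this sequence better
```

A lower value says only that a model assigns higher probability to the held-out sequences. As guideline 1 cautions, that is a statement about sequences, not about music.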

That brings up the final guideline.

6. If you are going to train a model on the NMD, or on this kind of melody-focused music, compare your results with folk-rnn. The code is freely available, it’s easy to train, and it works exceptionally well on this kind of music (when it is represented compactly, and not as MIDI). I have yet to see any model produce results that are better than folk-rnn in the context of this kind of music.


Machine Folk at ISMIR 2018

A group of us, called the Machine Folk, played through some folkrnn tunes at the ISMIR 2018 banquet on the Seine:

We also played the September 2018 machine folk tune of the month, The Silver Keyboard, but this was not captured on the video. Here’s my solo rendition on my mean green machine folk machine:

An experimental album of Irish traditional music and computer-generated tunes

For the past 6 months, the music album “Let’s Have Another Gan Ainm” has been distributed to reviewers and listeners in Europe and the USA as a new release of Irish traditional music. We are now publicly revealing that each track on the album includes computer-generated material, specifically material generated by our deep neural network folk-rnn.

Reviews of the album, both published and private, have been very positive. The album even received radio play. More information about our experiment and the music on the album (e.g., how each track came to be) can be found in our technical report. We show exactly what the computer generated and the changes that were made. More details about the reception of the album will be provided at a later time.

In the meantime, enjoy the album!

Result of the first folk-rnn Composition Competition

The winning piece in the first folk-rnn composition competition is Gwyl Werin for mixed quartet by Derri Joseph Lewis. He used a tune generated by folk-rnn as a basis for both melodic fragments and harmonic construction in his piece. He chose the model trained without the repeat signs, a 9/8 meter, C mixolydian mode, an initialisation of “D E F”, and a temperature of 1.07. This produced the output here.
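For those unfamiliar with the temperature parameter, the following is a generic sketch of temperature-scaled sampling from a model's output. It is not the folk-rnn code itself; the logits are made up, and the function names are hypothetical.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
rng = random.Random(0)    # seeded so that the draw is repeatable

print(softmax_with_temperature(logits, 1.0))
print(softmax_with_temperature(logits, 1.07))
print(sample_token(logits, 1.07, rng))
```

Temperatures above 1 flatten the distribution, so a setting like 1.07 makes the less likely tokens slightly more probable than they would be at 1.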

The judges found Lewis’ piece well balanced, using nice contrasts and a variety of textures and motives in its construction. The occasional solo moments in the piece echo aspects of the generated material, though it does not imitate it directly. This piece illustrates a further approach to utilising folkrnn as part of the creative process. (For a recent survey, see Sturm, Ben-Tal, et al., “Machine learning research that matters for music creation: A case study”, J. New Music Research 2018.) We look forward to hearing the piece played by the New Music Players in our upcoming concert in October at the O’Reilly AI conference in London.

Machine Learning Research that Matters for Music Creation: A Case Study

Our article, Sturm, Ben-Tal, Monaghan, Collins, Herremans, Chew, Hadjeres, Deruty and Pachet, “Machine Learning Research that Matters for Music Creation: A Case Study”, is now in press at the Journal of New Music Research. The accepted version can be found here: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-233627

My one-line precis: We take several music generation models, apply them to music creation culminating in a public concert (May 23 2017, videos are at The Bottomless Tune Box Youtube page), and finally reflect broadly on the experience about how it matters for machine learning research and vice versa.

We used four different machine learning methods to compose a variety of musical works performed at the concert. We discuss the various technical and creative decisions made in the composition of the pieces. Each of the composers/musicians then reflects on the experience, answering questions about what machine learning contributed to their work, the roles of human and machine creativity and how they matter for the audience. We then summarise responses of the audience. The fifth section reflects on our total experience aligned with Kiri Wagstaff’s principles of making applied machine learning research matter:

  1. measure the concrete impact of an application of machine learning with practitioners in the originating problem domain;
  2. with the results from the first principle, improve the particular application of machine learning, the definition of the problem, and the domain of machine learning in general.

The penultimate section identifies several ways our work contributes to machine learning research applied to music creation, or in general. In summary:

  1. Music creation via machine learning should be analysed along more varied dimensions than degrees of “success” or “failure”. A “successful” model (by quantitative measures of machine learning, e.g., cross-entropy) may still not generate interesting or useful music; and a “failing” model may result in creative opportunities. In any case, work with music experts/practitioners — it’s necessary and illuminating.
  2. A trained machine learning model that is useful and successful may still be totally naive of what it is doing. Work with music experts/practitioners to probe the “musical intelligence” of the model and its limits. This will reveal ways to improve the model, and make one’s discussion of the model more accurate and scientific.
  3. Music creators are particular and idiosyncratic. There is no “universal model” of music (just like there’s no “universal model” of dining). Hence, aim/expect to make machine learning models for music creation pliable enough for calibration to particular users. (How to calibrate a model without resorting to more data is an interesting research problem, and one of my goals in analysing the parameters of folkrnn models.)
  4. The data on which a model is trained does not necessarily limit its application. folk-rnn is trained on folk music of Ireland and the UK, but some of the music created with it doesn’t sound that way at all.
  5. In communities that design tools (software, hardware, analogue, etc.) for artists, it is probably well known that users will discover and exploit bugs and other unintended features in their work. (This fact motivates the requirement of backward compatibility of Csound in every update.) Expect unintended (mis)use of music creation models, and design them to encourage such opportunities.
  6. Music data is the remnants of a human activity. The machine learning researcher, in exploiting such data, has the responsibility to reflect on their use of it and its impact on the communities from which it comes. For instance, folk-rnn models are trained on thousands of transcriptions of folk music from Ireland. Our responsible use of that data involves appreciating and accurately portraying the living tradition, working with the data knowing that it is a deficient and distorted representation of the human experience, and working together with its experts and practitioners to assess the technical and ethical impacts of the research.

We hope that our article will serve as some sort of model for evaluating and thinking more broadly about applications of machine learning to music creation.