“Lisl’s Stis”: Recurrent Neural Networks for Folk Music Generation

Addendum (Sep 3 2018):

If you want to link to our work, please do not link to this page. Link to the folk-rnn project: https://github.com/IraKorshunova/folk-rnn

The following links show how this work has developed.

Sep. 22 2018 An experimental album of Irish traditional music and computer-generated tunes
Aug. 30 2018 Pre-print of “Machine Learning Research that Matters for Music Creation: A Case Study“
July 20 2018 launch of the machine folk session
May 15 2018 launch of folk-rnn composition competition
Apr. 26 2018 Machine folk session at QMUL
Feb. 1 2018 folkrnn.org application
Nov. 20 2017 “Music in the age of artificial creation — An illustrated concert” part of the 2017 Being Human Festival
June 28 2017 folk-rnn tunes performed at the QMUL Ideas Festival
June 16 2017 folk-rnn is the June Rhythm artist at loquantur rhythm
June 15 2017 On folk-rnn cheating and music generation
June 2 2017 article, “An A.I. in London is Writing Its Own Music and It Sounds Heavenly“
A full list of music created by and with folk-rnn is at the code repo here.
Partnerships concert showcasing works composed with folk-rnn — here is the YouTube channel of the concert
The Bottomless Tune Box YouTube Channel
Article in The Conversation, ‘Machine folk’ music composed by AI shows technology’s creative side (March 31, 2017)
The Drunken Pint, a folk-rnn original
Taking A Christmas Carol Toward the Dodecaphonic by Derp Learning
Deep learning for assisting the process of music composition (part 4).
The Endless Traditional Music Session
‘Tis the season for some deep carols
My composition, “Eight Short Outputs“

—-

Inspired by this blog post on recurrent neural networks (RNN), and by the wonder of reproducible research and the exceptional coding of Andrej Karpathy, today I dropped exactly everything I had planned to do and trained RNNs to generate “folk music”.

Here is the first folk piece, which the system names “Lisl’s Stis”:

Perhaps the pickup is more of a grace note, but it is clear that the 9/8 time signature is not correct. The key signature works, and the IV-V-I resolution is good with the octave jump down.
Here is another, named “Quirch cathp’3b (The Nille L’ theys Lags Bollue’s)”

Now we have a tune in 6/8, but the last measure is missing one eighth note and has an unnecessary natural. As “Lisl’s Stis”, “Quirch” begins and resolves to the tonic specified by the key signature. I like how it fiddles around in either ii or VI before resolving.

These are of course short tunes that I have hand selected from the virually unlimited output from sampling the RNN. Here is an example of what the raw output looks like:
T:Lat canny I. the dlas. M:C L:1/8 Q:1/2=100 K:D A>A|:F>DEA|F2dF|A/F/A/B/ AF|G/E/E FD |DDDG|Edec|defd |eged|fdgd|dcd2|| e|g2ef gef(e/c/)|ddfe fdAA|F3 A c4|efef g{e}d4 | gfga afgf|eggb ad'eg|fgdB edAB|BedA BABg|fdde ddd:|

This format of music notation is called ABC, and provides an extremely economic and interpretable format of music (monophonic typically, but polyphony is possible too). For instance, here is Volume 1 of “A Selection of Scotch, English, Irish and Foreign Airs adapted to the Fife, Violin, or German-Flute” published by James Aird in 1778. To create the training data for the RNN, I just combined all 1180 tunes in Aird’s six volumes — digitised by the great Jack Campin. I then trained a RNN with my CPU (on my slow MacBook Air) and the default parameters set by Andrej Karpathy:
-rnn_size size of LSTM internal state [100] -num_layers number of layers in the LSTM [2] -learning_rate learning rate [0.002] -decay_rate decay rate for rmsprop [0.95] -dropout dropout to use just before classifier. 0 = no dropout [0] -seq_length number of timesteps to unroll for [50] -batch_size number of sequences to train on in parallel [100] -max_epochs number of full passes through the training data [30] -grad_clip clip gradients at [5] -train_frac fraction of data that goes into train set [0.95] -val_frac fraction of data that goes into validation set [0.05] -seed torch manual random number generator seed [123] -print_every how many steps/minibatches between printing out the loss [1] -eval_val_every every how many iterations should we evaluate on validation data? [1000]

I sample the trained system using the CPU, some random seed and the default parameters:
-sample 0 to use max at each timestep, 1 to sample at each timestep [1] -length number of characters to sample [2000] -temperature temperature of sampling [1]

It is remarkable that the RNN has learned some of the rules of ABC. There are some errors however. For instance, the RNN produces the abc of “Lisl’s Stis” as
T:Lisl's Stis. M:9/8 L:1/8 Q:3/8=120 K:D g/|a>f) d2b |gfe dAB |G3 B2c|A2G FAc| d3 D3:|
Running abc2midi, produces the output:
writing MIDI file lislstis1.mid Warning in line-char 7-13 : Track 0 Bar 1 has 5 units instead of 9 Warning in line-char 7-25 : Track 0 Bar 2 has 6 units instead of 9 Warning in line-char 7-32 : Track 0 Bar 3 has 6 units instead of 9 Warning in line-char 7-40 : Track 0 Bar 4 has 6 units instead of 9 Warning in line-char 7-2 : Track 0 Bar 0 has 13/2 units instead of 9 in repeat Warning in line-char 7-13 : Track 0 Bar 1 has 5 units instead of 9 in repeat Warning in line-char 7-25 : Track 0 Bar 2 has 6 units instead of 9 in repeat Warning in line-char 7-32 : Track 0 Bar 3 has 6 units instead of 9 in repeat Warning in line-char 7-40 : Track 0 Bar 4 has 6 units instead of 9 in repeat
Clearly, the time signature should not be 9/8, but 6/8. The abc2midi tool gracefully fails, and fills in what was missing. Anyhow, most of the output of the RNN begins with the preface material, and ends with the music. Increasing the temperature beyond 1, or decreasing it below about 0.45 produces a lot of gibberish though.

Here is one piece it generates that is longer than those above, but at a sampling temperature of 0.45. This one I will add to my repertoire immediately: