Introducing “The Black Box”!


The Mean Green Machine Folk Machine (TMGMFM, pictured left) is a Saltarelle Rivage III tuned DG (the outermost row is in D, and the second row is in G) with a five-button row extending the chromatic possibilities of the treble side. It is an LMMH, meaning it has one set of low reeds, two sets of middle reeds, and one set of high reeds. It’s a “fourth button start” (meaning the tonic D or G is the fourth button down from the chin). The bass side has four extra buttons, giving F#/F and A/G. Here’s the layout chart:

[Image: TMGMFM layout chart]

Here’s a video showing some of the chromatic possibilities with TMGMFM:

TMGMFM is well worth what I paid for it at Hobgoblin Music in London in January 2018. It is the first accordion I have played that I don’t have to fight against to make music.

I took TMGMFM to the 2019 Joe Mooney summer school. (Here’s my relevant blogpost.) Of maybe 50 accordionists, I was the only one playing a DG. All the others were playing a BC or a C#D. During the week I learned which traditional ornaments are possible on TMGMFM. Cuts are no problem. Most triplets are fine. However, I found some rolls to be impossible. When I got back home I decided to study TMGMFM and see what was actually possible. Here’s a picture showing all pitches reachable by TMGMFM on the press (notes highlighted green) and on the draw (notes highlighted red):


The pitches covered in black do not exist on TMGMFM. There is no middle C natural (C4), or G below middle C (G3). The arrows point to pitches that can be rolled in a traditional Irish style. For instance, rolling A4 involves playing A4-B4-A4-G#4-A4 all in the same bellows direction. TMGMFM can roll G on both press and draw. It can also roll F#, which is not possible on the BC accordion but is on the C#D. One of the most common rolls, the high D roll, is not possible on TMGMFM: the rapid press-draw succession needed to get the right notes breaks the flow. This became clear in the first tune we learned at the summer school. All the other kids were happily producing their D rolls, while I had to settle for just the D, and at most a triplet.
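This feasibility condition can be sketched in a few lines of Python (a toy illustration with made-up pitch sets, not the actual reed layout; here I assume the upper neighbour is a whole tone above, though in some modes it is a semitone):

```python
def can_roll(pitch, reachable):
    # A traditional roll on `pitch` (a MIDI number) needs the pitch itself,
    # an upper neighbour a tone above, and a lower neighbour a semitone
    # below, all reachable in the same bellows direction.
    return {pitch, pitch + 2, pitch - 1} <= reachable

press = {68, 69, 71}          # G#4, A4, B4: enough to roll A4 (69)
print(can_roll(69, press))    # True
print(can_roll(71, press))    # False: no C#5 or A#4 on the press
```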

So I began searching for an instrument that could accommodate Irish ornamentation. I could go to the BC system, but I would have to relearn just about everything because the scales are quite different. Since I have spent many years learning to play the DG system (over 1000 hours on TMGMFM alone), I wanted to find a way to build from my hard-earned implicit knowledge.

A better possibility is the C#D system. I would already be familiar with the D row, knowing how to play in D and B minor. All the typical Irish rolls would be possible, but I would have to learn the other common modes, like G major, A dorian and E minor. So, I borrowed a C#D accordion for a month and immediately found all common rolls to be easily executable: the pattern stays the same on press and draw; one need only shift hand placement to roll different notes. But playing tunes that are not in D or B minor was difficult. Also, I found the harmonic possibilities on the treble side to be too limited, which makes it difficult to play English and Swedish traditional music.

I started to consider ordering a custom-built accordion, drooling over the possibilities with Castignari, Saltarelle, and other makes. But the cost would be about €4000. Then I realized something: in my collection (five accordions) I had a Hohner accordion with a lot of keys (pictured below). This Hohner has 7 more buttons on the treble side than TMGMFM, but 4 fewer on the bass side. It also has a flat keyboard. I love how it feels and sounds to play (it is an LMMMH), but it is a CF tuning, with a “club” system. I bought it used in 2011 in Copenhagen, not really understanding what it would entail to learn it. It moved to London with me, and then Stockholm; and at various points along the way I tried to sell it. Could modifying this accordion provide the solution I needed?


For a few weeks over summer I tried to figure out how to adapt the tuning of this box such that it preserves what I have learned on TMGMFM, but accommodates traditional Irish ornaments. Starting with the layout of TMGMFM, I determined that the new box should be a fifth button start. Otherwise the treble side would have some very high “squeaky” pitches. This makes it possible to add the low notes missing on TMGMFM, e.g., the middle C. The major requirements that the new design must meet are the traditional rolls. To make D rolls I need C# on the press. I also wanted a D5 on the draw.

I tried out a variety of layouts, printing them to paper and physically testing the mechanics of the rolls to see if they are possible and not awkward. I devised a way to describe the fingering pattern of each roll based on what was comfortable to execute (described below). Considering these patterns as I changed pitches here and there helped me aim for sensible patterns that are somewhat uniform for each pitch. After maybe four designs, I finally settled on the one below:

[Image: the new layout design]

The buttons in green are identical to those on TMGMFM (the bass side has no thirds). Most of the changes are in the third row. This seems to optimize the ornamentation possibilities and minimize the number of new things I need to learn. Here is a comparison of the pitches available with this system and the rolls theoretically possible (pointed to by arrows) with those of TMGMFM:



We can see that the maximum-length chromatic run on press or draw is now 11 notes instead of only five. And it can now reach all the traditional rolls, plus a few others that cannot be performed on either BC or C#D (e.g., rolls on G#, Eb, and Bb). All the low notes are there, but a squeaky F6 is not. All possible traditional rolls are tabulated below, with the different patterns involved in executing them.
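Finding the longest chromatic run in a pitch set is easy to mechanize; here is a small sketch (the pitch set is made up for illustration, using MIDI numbers):

```python
def longest_chromatic_run(pitches):
    # Longest run of consecutive semitones available in one bellows
    # direction, with pitches given as MIDI numbers.
    s, best = set(pitches), 0
    for p in s:
        if p - 1 not in s:      # p starts a run
            n = 1
            while p + n in s:
                n += 1
            best = max(best, n)
    return best

print(longest_chromatic_run({60, 62, 63, 64, 65, 66, 69}))  # 5 (the run 62..66)
```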

Pitch: buttons (pattern). Where a pitch has two entries, the press version is listed first, then the draw.
F#3 III3, II2, III1 (II3, I2, II1)
F#4 III6, III7, I5 (III2, III3, I1)
F#5 III9, III10, I8 (III2, III3, I1)
G3 II3, III4, III3 (I1, II2, II1)
G4 II6, III7, III6 (I1, II2, II1) III6, II6, II5 (II2, I2, I1)
G5 II9, III10, III9 (I1, II2, II1) III10, II10, II9 (II2, I2, I1)
G#3 II2, II4, II3 (I1, I3, I2) III2, II2, III1 (II2, I2, II1)
G#4 I6, II7, II6 (I1, II2, II1)
G#5 I9, II10, II9 (I1, II2, II1)
A3 III4, II4, II2 (II3, I3, I1) III3, II3, III2 (II2, I2, II1)
A4 III7, II7, I6 (III2, II2, I1)
A5 III10, II10, I9 (III2, II2, I1)
B3 II4, I4, III2 (II3, I3, III1) II3, III4, II2 (I2, II3, I1)
B4 III7, III8, I6 (III2, III3, I1)
B5 III11, III12, I10 (III2, III3, I1)
C4 I4, II4, II3 (I2, II2, II1)
C5 II7, I7, III7 (II1, I1, III1)
C6 I10, II11, II10 (I1, II2, II1)
C#4 III4, III5, I4 (III1, III2, I1)
C#5 III8, II8, II7 (II2, I2, I1)
C#6 III12, II12, II11 (II2, I2, I1)
D4 III5, III6, I4 (III2, III3, I1) II4, III5, III4 (I1, II2, II1)
D5 III8, III9, I7 (III2, III3, I1) I7, II8, III8 (I1, II2, III2)
D6 III11, III12, I11 (III1, III2, I1)
E4 III5, III6, I5 (III1, III2, I1)
E5 III9, III10, I8 (III2, III3, I1)
F5 I9, III10, III9 (I1, III2, III1)

Row I is closest to the bellows, and row III is furthest from the bellows. Button 6 is the 6th button down from the chin on row III, the 5th button down on row II, and the 3rd button down on row I. As an example, there are two ways to roll G4: with buttons II6, III7, III6 on the press or III6, II6, II5 on the draw. The patterns involved with the G4 rolls are (I1, II2, II1) on the press and (II2, I2, I1) on the draw, which are shown below (green for press and red for draw). The first element of the tuple is the position of the rolled note, or the “fulcrum”. The patterns for the G3 and G5 rolls on the press are the same as for G4 on the press. Most patterns that don’t fit within this configuration of buttons are not comfortable to perform.
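To make the notation concrete, a tiny helper (a hypothetical sketch, not something the layout work depended on) can split a label like III10 into its row and button number:

```python
import re

def parse_button(label):
    # 'III10' -> (3, 10): Roman-numeral row (I is closest to the bellows),
    # then the button number counted down from the chin.
    row, num = re.fullmatch(r"(I{1,3})(\d+)", label).groups()
    return len(row), int(num)

print(parse_button("III10"))  # (3, 10)
print(parse_button("II6"))    # (2, 6)
```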

[Image: the G4 roll patterns, green for press and red for draw]

Satisfied with the design, I took the box to the accordion specialist Eric Simmons at Stockholms Dragspelsservice and showed him what I was thinking. He said he could do the work, reusing many of the reeds in the box and adding a few others he had; but he looked quite confused by the layout of the pitches. He asked a few times whether I was absolutely sure.

He finished the work after a month and I rushed to the shop to try it out. From my very first squeeze of The Black Box (TBB), I knew this was going to be almost perfect. Everything I learned to play on TMGMFM was easily playable on TBB. Most of the ornaments I had learned were the same, and some were a little different, but nothing was difficult. And I could finally do the D rolls I had to skip over at the Joe Mooney summer school. Plus the bass side has some wonderful low notes! It’s got a big sound that really resonates in small rooms. The loss of the four extra bass buttons is not a major one.

All the roll patterns that I would have to learn with this layout are shown below (G3p means rolling G3 on the press). It looks like a lot of effort, but many patterns are quite similar. For instance, the patterns of the E4 and E5 rolls are only slightly different. Plus, I don’t need to learn all rolls in both press and draw. All pitches of the D and G scales can be rolled. I also list tunes that can feature some of these rolls.

Pattern Notes Observations
(I1, II2, II1) G3p, G4p, G5p, G#4p, G#5p, D4d Very easy; practice in The Shaskeen Reel; The Gold Ring (jig); Ballydesmond #2 (polka)
(III2, II2, I1) A4p, A5p Very easy; practice in Christmas Eve (reel); Harvest Home (hornpipe); Tom Billy’s (jig)
(III2, III3, I1) D4p, D5p, F#4p, F#5p, E5d, B4d, B5d Very easy; practice in Crossing the Shannon (reel); The Humours of Tulla (reel); The Five Servants (polka)
(III1, III2, I1) E4d, D6p, C#4d Very easy; practice in Drowsy Maggie (reel)
(II3, I3, I1) A3p Easy; practice in The Gold Ring (jig)
(II2, I2, I1) G4d, G5d, C#5d, C#6d Easy; practice in Christmas Eve (reel, part C)
(II1, I1, III1) C5d Awkward but doable
(II2, I2, II1) A3d, G#3d Easy
(I2, II3, I1) B3d OK
(II3, I2, II1) F#3p OK
(I1, I3, I2) G#3p OK
(I1, III2, III1) F5d Difficult, but possible
(I1, II2, III2) D5d Awkward, avoid; use (III2, III3, I1) on the press instead
(II3, I3, III1) B3p Awkward, avoid; use (I2, II3, I1) on the draw instead

I have been playing TBB for about two months now, and have found only three downsides. First, it is heavy! TMGMFM weighs 5.4 kilograms, but TBB is 7.7 kg. It has a heavier case and two more sets of reeds. This means there’s more mass I have to move when going from press to draw and vice versa. TMGMFM feels far more responsive and quick when I play it; but when TBB gets going, it sounds like a train, reminding me of the usual BC Paolo Soprani boxes. Addressing this downside involves some work in the gym, and maybe a bit of adjustment to the reeds to make them more responsive. The second downside is that TBB is physically big. This makes playing a little awkward at the low end in terms of wrist ergonomics. I have had to learn a different way of sitting with the instrument, watching for any pain in my right thumb and wrist and making corrections. Because the instrument is so heavy, I was at first relying on pressure from my right thumb to stabilize it when going from draw to press, which was causing pain. Some conscious effort not to apply too much pressure with my thumb, along with using my inner leg for leverage and stability, makes the experience more comfortable. Traveling with the instrument (as I did to the US for winter break) was tiring, but the overhead space in the large transatlantic airplane was just large enough. (I wonder what will happen going to Ireland this summer.) The third downside is that playing TBB involves different roll patterns. This just means I have to spend more time working to make them uniform. But keeping the DG system and building upon what I have already learned is worth it!

I was excited to show off TBB to my accordion teacher. He was amused, saying it looks big and heavy, and that having to learn the different patterns for each roll is something he wouldn’t want to do. But after a few lessons he mentioned that my playing on TBB sounds far more traditional than on TMGMFM, and it seems the extra weight is providing a taming influence on my playing. Here’s a video of me playing some tunes on TBB (keep in mind I’m still learning!):

I am extremely happy with TBB — but of course, I’m looking for the next accordion: one that is physically smaller with the same button layout as TBB, but having only three sets of reeds (MMM). This would make it lighter and easier to travel with.

First Call for Papers: 2020 Joint Conference on AI Music Creativity (CSMC + MuMe)


Oct 22-24 2020 @ KTH and KMH, Stockholm, Sweden

The computational simulation of musical creativity continues to be an exciting and significant area of academic research, and is now making impacts in commercial realms. Such systems pose several theoretical and technical challenges, and are the result of an interdisciplinary effort that encompasses the domains of music, artificial intelligence, cognitive science and philosophy. This can be seen within the broader realm of Musical Metacreation, which studies the design and use of such generative tools and theories for music making: discovery and exploration of novel musical styles and content, collaboration between human performers and creative software “partners”, and design of systems in gaming and entertainment that dynamically generate or modify music.

The 2020 Joint Conference on AI Music Creativity brings together for the first time two overlapping but distinct research forums: The Computer Simulation of Music Creativity conference (est. 2016) and The International Workshop on Musical Metacreation (est. 2012). The principal goal is to bring together scholars and artists interested in the virtual emulation of musical creativity and its use for music creation, and to provide an interdisciplinary platform to promote, present and discuss their work in scientific and artistic contexts.

The three-day program will feature two keynotes, research paper presentations, demonstrations, discussion panels, and two concerts. Keynote lectures will be delivered by Professor Emeritus Dr. Johan Sundberg (Speech, Music and Hearing, KTH) and Dr. Alice Eldridge (Music, Sussex University, UK).


We encourage submissions of work on topics related to CSMC and MuMe, including, but not limited to, the following:


  • systems capable of analysing music;
  • systems capable of generating music;
  • systems capable of performing music;
  • systems capable of (online) improvisation;
  • systems for learning or modeling music style and structure;
  • systems for intelligently remixing or recombining musical material;
  • systems in sound synthesis, or automatic synthesizer design;
  • adaptive music generation systems;
  • music-robotic systems;
  • systems implementing societies of virtual musicians;
  • systems that foster and enhance the musical creativity of human users;
  • music recommendation systems;
  • systems implementing computational aesthetics, emotional responses, novelty and originality;
  • applications of CSMC and/or MuMe for digital entertainment: sound design, soundtracks, interactive art, etc.


  • surveys of state-of-the-art techniques in the research area;
  • novel representations of musical information;
  • methodologies for qualitative or quantitative evaluation of CSMC and/or MuMe systems;
  • philosophical foundations of CSMC and/or MuMe;
  • mathematical foundations of CSMC and/or MuMe;
  • evolutionary models for CSMC and/or MuMe;
  • cognitive models for CSMC and/or MuMe;
  • studies on the applicability of music-creative techniques to other research areas;
  • new models for improving CSMC and/or MuMe;
  • emerging musical styles and approaches to music production and performance involving the use of CSMC and/or MuMe systems;
  • authorship and legal implications of CSMC and/or MuMe;
  • future directions of CSMC and/or MuMe.

Paper Submission Format

There are three formats for paper submissions:

  • Full papers (8 pages maximum, not including references);
  • Work-in-progress papers (5 pages maximum, not including references);
  • Demonstrations (3 pages maximum, not including references).

The templates will be released in early 2020, and the EasyChair submission link opened soon thereafter. Please check the conference website for updates.

Since we will use single-blind reviewing, submissions do not have to be anonymized. Each submission will receive at least three reviews. All papers should be submitted as complete works. Demo systems should be tested and working by the time of submission, rather than speculative. We encourage audio and video material to accompany and illustrate the papers (especially for demos). We ask that authors arrange web hosting for their audio and video files, and give URL links to all such files within the text of the submitted paper.

Accepted full papers will be published in proceedings with an ISBN. Furthermore, selected papers will be invited for expansion and consideration for publication in the Journal of Creative Music Systems.

Important Dates

Paper submission deadline: August 14 2020
Paper notification: September 18 2020
Camera-ready paper deadline: October 2 2020

Presentation and Multimedia Equipment

We will provide a video projection system as well as a stereo audio system for use by presenters at the venue. Additional equipment required for presentations and demonstrations should be supplied by the presenters. Contact the Conference Chair to discuss any special equipment and setup needs/concerns.


Registration

At least one author of each accepted submission should register for the conference by Sep. 25, 2020, and attend to present their contribution. Papers without a registered author will be withdrawn. Please check the conference website for details on registration.

About the Conference

The event is hosted by the Division of Speech, Music and Hearing, School of Electrical and Computer Engineering (KTH) in collaboration with the Royal Conservatory of Music (KMH).

Conference chair: Bob L. T. Sturm, Division of Speech, Music and Hearing, KTH
Paper chair: Andy Elmsley, CTO Melodrive
Music chair: Mattias Sköld, Institutionen för komposition, dirigering och musikteori, KMH
Panel chair: Oded Ben-Tal, Department of Performing Arts, Kingston University, UK
Publicity chair: André Holzapfel, Division of Media Technology and Interaction Design, KTH
Sound and music computing chair: Roberto Bresin, Division of Media Technology and Interaction Design, KTH

Questions & Requests

Please direct any inquiries/suggestions/special requests to the Conference Chair.


Prediction! “Banal commercial music”

Almost as an afterthought, Hiller and Isaacson write in the chapter “Some future musical applications” in their 1959 book Experimental Music (a book documenting their experiments in music generation by a computer, available online here):

It is also necessary to take note of one less attractive possibility [of applying computers to composing music], but one which must also at least be mentioned, since it is so often suggested. This is the efficient production of banal commercial music. … Belonging in a somewhat similar category is the frequently asked question of whether synthetic Beethoven, Bartók or Bach might also be produced by computers… The goal rather than the means appears objectionable here, however. The conscious imitation of other composers, by any means, novel or otherwise, is not a particularly stimulating artistic mission. Moreover, this type of study is, in the final analysis, a logical tautology, since it produces no information not present initially.

I don’t agree with those last two statements, but it is fun to read the musings of these two pioneers of computer-generated music 60+ years ago.


Reading List for FDT3303: Critical Perspectives on Data Science and Machine Learning (2019)

The course is fully booked, with 23 students and a few auditors. We have a very good crop of papers for this inaugural edition of my PhD course. Some of these are classic papers (bold). Some are very new ones (italic). All deserve to be read and critically examined!

Nov. 1: Questions of Ethics
J. Bryson and A. Winfield, “Standardizing ethical design for artificial intelligence and autonomous systems,” Computer, vol. 50, pp. 116–119, May 2017.

Nov. 13: Questions of Performance
E. Law, “The problem of accuracy as an evaluation criterion,” in Proc. Int. Conf. Machine Learning: Workshop on Evaluation Methods for Machine Learning, 2008.

F. Martínez-Plumed, R. B. C. Prudêncio, A. Martínez-Usó, and J. Hernández-Orallo, “Making sense of item response theory in machine learning,” in Proc. ECAI, 2016.

Nov. 15: Questions of Learning
D. J. Hand, “Classifier technology and the illusion of progress,” Statistical Science, vol. 21, no. 1, pp. 1–15, 2006.

E. R. Dougherty and L. A. Dalton, “Scientific knowledge is possible with small-sample classification,” EURASIP J. Bioinformatics and Systems Biology, vol. 2013:10, 2013

Nov. 20: Questions of Sanity
I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” in Proc. ICLR, 2015

S. Lapuschkin, S. Wäldchen, A. Binder, G. Montavon, W. Samek & K.-R. Müller, “Unmasking Clever Hans predictors and assessing what machines really learn” Nature 2019

Nov. 22: Questions of Statistics
C. Drummond and N. Japkowicz, “Warning: Statistical benchmarking is addictive. Kicking the habit in machine learning,” J. Experimental Theoretical Artificial Intell., vol. 22, pp. 67–80, 2010.

S. Makridakis, E. Spiliotis, V. Assimakopoulos, “Statistical and Machine Learning forecasting methods: Concerns and ways forward“, PLOS ONE 2018.

S. Goodman, “A Dirty Dozen: Twelve P-Value Misconceptions”, Seminars in Hematology, 2008.

Nov. 27: Questions of Experimental Design
C. Dwork, V. Feldman, M. Hardt, T. Pitassi, O. Reingold, and A. Roth, “The reusable holdout: Preserving validity in adaptive data analysis,” Science, vol. 349, no. 6248, pp. 636–638, 2015.

T. Hothorn, F. Leisch, A. Zeileis, and K. Hornik, “The design and analysis of benchmark experiments,” Journal of Computational and Graphical Statistics, vol. 14, no. 3, pp. 675–699, 2005.

Nov. 29: Questions of Data
M. J. Eugster, F. Leisch, and C. Strobl, “(Psycho-)analysis of benchmark experiments: A formal framework for investigating the relationship between data sets and learning algorithms,” Computational Statistics & Data Analysis, vol. 71, pp. 986–1000, 2014.

Luke Oakden-Rayner, Jared Dunnmon, Gustavo Carneiro, Christopher Ré, “Hidden Stratification Causes Clinically Meaningful Failures in Machine Learning for Medical Imaging” arXiv 2019

S. Tolan, “Fair and unbiased algorithmic decision making: Current state and future challenges,” JRC Technical Reports, JRC Digital Economy Working Paper 2018-10, arXiv 2018.

Dec. 4: Questions of Sabotage
J. Su, D. V. Vargas, and K. Sakurai, “One pixel attack for fooling deep neural networks,” arXiv, vol. 1710.08864, 2017

S. G. Finlayson, J. D. Bowers, J. Ito, J. L. Zittrain, A. L. Beam, I. S. Kohane, “Adversarial attacks on medical machine learning” Science 2019.

Dec. 6: Questions of Interpretability
Z. Lipton, “The mythos of model interpretability,” in Proc. ICML Workshop on Human Interpretability in Machine Learning, 2016

Malvina Nissim, Rik van Noord, Rob van der Goot, “Fair is Better than Sensational: Man is to Doctor as Woman is to Doctor” arXiv 2019.

Dec. 11: Questions of Methodology
Z. C. Lipton and J. Steinhardt, “Troubling trends in machine learning scholarship,” in Proc. ICML, 2018.

Meyer, Michelle N., “Two Cheers for Corporate Experimentation: The A/B Illusion and the Virtues of Data-Driven Innovation” 13 Colo. Tech. L.J. 273 (2015).

Dec. 13: Questions of Application
K. L. Wagstaff, “Machine learning that matters,” in Proc. Int. Conf. Machine Learning, pp. 529–536, 2012.

Cynthia Rudin, David Carlson, “The Secrets of Machine Learning: Ten Things You Wish You Had Known Earlier to be More Effective at Data Analysis” arXiv 2019.

M. Fernández-Delgado, E. Cernadas, S. Barro, and D. Amorim, “Do we need hundreds of classifiers to solve real world classification problems?,” Journal of Machine Learning Research, vol. 15, pp. 3133–3181, 2014

Machine Folk from GPT-2!

It was just a matter of time before someone retrained the massive GPT-2 language model (up to 1.5 billion parameters) on folk music. That’s what Gwern Branwen has done, described in this comment thread. I’m not sure which size model he has used, but for training he seems to have used a concatenation of four folk-rnn datasets. In this post I want to analyze some of the samples generated from the resulting model to help me determine whether there’s much difference in quality compared with transcriptions generated by folk-rnn models (100,000 examples here).

The file Gwern links to contains 12929 lines. I will generate 5 random numbers in [1, 12929] and then analyze the transcriptions closest to those line numbers.
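Drawing the five line numbers can be done like this (a sketch; the seed is arbitrary, and my actual picks landed near lines 2881, 8623, 7798, 1187 and 7929):

```python
import random

random.seed(0)  # arbitrary seed; any five draws would do
# Five distinct line numbers in [1, 12929]
picks = sorted(random.sample(range(1, 12930), 5))
print(picks)
```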

Here’s the transcription closest to line number 2881:

X: 47015
T: Money In Both Pockets
M: 4/4
L: 1/8
K: Gmaj
dggf gdBd|g2 bg agfg|dggf g2 bg|afge d2 Bc|
dggf gdBd|g2 bg agfg|dggf g2 bg|afge d2 dc||

Here is the tune notated:

[Image: notation of the generated “Money In Both Pockets”]

The transcription appears to be a novel one. The training data has a tune and settings with the same name, but these are quite different from the above. This generated tune has a conventional AABB structure (assuming implicit repeat signs). Within each part there is some repetition and variation consistent with the style. The B part goes higher in register than the A part, which is also stylistically consistent. Both parts lack a cadence, however, sticking to the dominant. It is very easy to play on the accordion:

Overall, this is a boring tune since there is too much repetition and not enough variation. But I would say it is plausible since there is nothing clearly wrong with it.

The two transcriptions generated by the model that follow the above are given the same title by the model:

X: 47016
T: Money In Both Pockets
M: 4/4
L: 1/8
K: Gmaj
|: G3A BGGB | dBGB AGFA | G3A BGBc | defd BGBc |
GFGA BGGB | dBGB AGFA | G3A BGBc | defd Bd d2 |
|g2 gf gdBd | (3efg fg agfg | gagf gabg | afge d2 (3Bcd |
g2 gf gdBd | (3efg fg agfg | gagf gabg | afge d3c ||

X: 47017
T: Money In Both Pockets
M: 4/4
L: 1/8
K: Gmaj
|: G3A BGGB | dBGB AGFA | G3A BGBc | defd BGBc |
GFGA BGGB | dBGB AGFA | G3A BGBc | defd Bcdf |
g2 gf gdBd | efge fgaf | gagf gabg | afge d2 Bc |
g2 gf gdBd | efge fgaf | gagf gabg | afge d3c ||

This behavior likely comes from the fact that many tunes in the dataset we created have “settings”, which are variations of a tune. So the titles repeat while the tunes vary. Let’s look at how these three transcriptions relate by notating them as three voices:

[Image: the three transcriptions notated as three voices]

We can see the model has actually generated two reasonable variations of the tune. This is very impressive behavior! Neither setting, however, improves on the first.

Here’s the generated transcription closest to line number 8623:

X: 33166
T: Breakfast Epiphanies
M: 4/4
L: 1/8
K: Dmaj
(3def|:gecA G2FE|DFAF GBAG|FDAD BDAB|FAAG (3ABA (3def|
Geed cAA2|dedA BAGB|Ad~d2 ed (3Bcd|(3efe (3dcB A2cd|
ea~a2 bgaf|gedB cAGB|Ad~d2 ed (3Bcd|1(3efe d2 d2 (3ABc:|2(3efe d2 d2|

I love that title, but it actually appears in the training data. Here are the dots for the generated transcription:

[Image: notation of “Breakfast Epiphanies”]

This tune again has an AABB structure, with the register of the B part going higher than the A part. We see variation and repetition. The A part starts in a very interesting way, going V to IV to I, but it has too much repetition. The B part frankly doesn’t make much sense, and doesn’t relate to the A part. I find this part so strange that I’m not going to bother learning to play this tune. I would say this transcription is not plausible.

Here’s the generated transcription closest to line number 7798:

X: 47628
T: Shoe The Donkey
M: 2/4
L: 1/8
K: Gmaj
|: G/ |B>c BA | AG E/F/G/A/ | B>c BA | d/e/d/B/ G>G |
B>c BA | AG E/F/G/A/ | Bd ef | gB G3/ :|
|: d/ |Bd dB | ce ed/c/ | Bd c/B/A/G/ | AG E2 |
Bd dB | ce e/f/e/d/ | Bd ef | gB G3/ :|

The model has again plagiarized this name from tunes in the training data, but it has created a good polka with coherence.

[Image: notation of “Shoe The Donkey”]

The B part resembles a part in The Banks of Inverness. The sixteenth-note pickup is kind of strange. Having the A and B parts end in the same way gives a nice coherence. And it’s pretty easy to play on the accordion:

The only changes I would make are to the pickups: raising the G to an A before bar 1 and at the end of bar 4, and lowering the D to a C before bar 9.

Here’s the generated transcription closest to line number 1187:

X: 97541
T:The Victory
|: E G | c G E G F A d c | B d g f e c G E | F E D C B, D G, B, | 
C E G c B G F E | D E F G A c B c | d c B A G F E D | 
C E D F E G c e | (3 d e d c B c 2 :| 
|: B c | d G e G f G e G | d G e d c B A G | e c f d g e c A | 
B c d e d 2 B c | d G e G f G e G | d G e d c B A G | 
f e d c B d c A | (3 B A G (3 F E D C 2 :|

Whereas all the transcriptions we have seen so far resemble the training data we used to create the folk-rnn v1 model, this one resembles the dataset we used to create the second version, in which we transposed and tokenized the transcriptions. Removing the spaces, rebeaming, and transposing to D produces this transcription:

[Image: notation of the transcription, cleaned up and transposed to D]
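The de-spacing step (only that step; the rebeaming and transposition I did in notation software) can be sketched like this:

```python
import re

def detokenize(tokens):
    # Collapse the space-separated token stream back to compact ABC,
    # then re-insert spaces around bar lines for readability.
    compact = "".join(tokens.split())
    return re.sub(r"(\|:|:\||\|)", r" \1 ", compact).strip()

print(detokenize("|: E G | c G E G F A d c |"))  # |: EG | cGEGFAdc |
print(detokenize("(3 d e d c B c 2 :|"))         # (3dedcBc2 :|
```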

This is different from The Victory in the training data. Again we see an AABB structure. There are cadences, but the one in the A part is unexpected because the part is by and large aimless. The B part is better, and is plausible. The two parts do not relate. I don’t want to spend time learning to play this.

Finally, here’s the transcription at line 7929:

[Image: notation of the transcription at line 7929]

This one is a total failure (and again the model has plagiarized the name). The counting is wrong in all bars except one. The melody doesn’t make a lick of sense.

So of the five transcriptions above, two are plausible. The polka is actually pretty good! All titles by GPT-2 are plagiarized, but I haven’t found much plagiarism in the tunes themselves.

In a future post I will select five transcriptions at random created by folk-rnn (v2) and perform the same kind of analysis. Will the quality of the transcriptions be as good as these ones created by GPT-2? What is gained by increasing the number of model parameters from millions to hundreds of millions, and using a model pretrained on written English text?