The First Machine Folk Music School!

I’m very excited to be offering the first “Machine Folk Music School” on Sep. 13 (15-16h CEST) as part of the 2020 Ars Electronica festival:

During this hour-long “school” (over zoom) I will teach participants a machine folk tune in the aural tradition. This involves me first performing the tune a few times through to give everyone a feeling for it. Then we work gradually phrase by phrase, playing slowly and with plenty of repetition. We link together the phrases to build up parts of the tune. Then we put all parts together and repeat the entire tune several times slowly. Finally, I discuss several possibilities for variation of the tune. Participants will also be given a short tunebook containing several other machine folk tunes.

Introducing Bosca Dubh!

The Bosca Dubh (circa 2020)

I’ve been living with Bosca Dubh for a month now and learning about its unique personality. Bosca Dubh is the next generation of The Black Box, which is essentially a D/G diatonic accordion redesigned to allow traditional Irish ornamentation. Bosca Dubh began its life as a club accordion, apparently designed by Excelsior. There are no identification numbers anywhere on the box or inside. I contacted Excelsior for information about this box, but they have no idea about it other than it could be one of a few specimens of a test model. This makes it even more unique and mysterious.

Compared to The Black Box, Bosca Dubh has two more treble-side buttons (on the inner-most row), and four more bass-side buttons. Here’s the layout:

The green buttons are identical to those of The Black Box. The other buttons are new. Here are the ranges of the two boxes, also showing the rolls that are possible:

The range of Bosca Dubh is one semitone lower than The Black Box, and has the high f natural. Bosca Dubh also can do rolls on F3 and G3. Neither box allows a traditional-style roll on F4 — but a triplet suffices. Everything else is the same on the treble side.

The bass side of Bosca Dubh is expanded. Like The Black Box, all thirds are removed. But Bosca Dubh now has A and F# on the press, and F and C# on the draw. These add lovely color to tunes in the keys/modes common to Irish traditional dance music. The C# is useful for tunes in A, and the F# is useful for tunes in D. The F adds a nice modal flavor to tunes in G.

Like The Black Box, the Bosca Dubh has LMMMH reeds (L = octave below; M = middle; H = octave above); but it has more couplers. Here’s a picture of all the voicings available on Bosca Dubh. (The dots and their position denote which reeds are active.)

The coupler pressed is “ALL REEDS GO!”. One can also choose just L, or just M, or just H. Then there are four couplers that choose combinations of these. There are three couplers choosing combinations of the middle reeds — let’s call them Ml (middle low), Mm (middle middle), and Mh (middle high). “Traditional” Irish tuning makes the Mm reeds right on concert pitch and then the Ml and Mh are detuned relative to that by up to 15 cents, lower or higher, respectively. This makes a “wet” or warbly sound. (Some traditional players, like Jackie Daly, use very little to no detuning of these reeds.)

The Ml+Mm coupler selects the middle reeds located in a cassotto chamber. This is another unique aspect of Bosca Dubh over The Black Box. Cassotto makes for a very mellow sound. Here’s a picture of the treble side showing the reed batteries in the cassotto chamber (top):

The oddest feature of Bosca Dubh is the coupler isolating Ml+Mh. The expert that converted this accordion (Erik Simons, highly recommended!) believes this is a mistake of the manufacturer. I’ve never seen such a coupler before. BUT, I love the sound. I call it the “circus setting”. This “mistake” lends credence to the theory that this box was a test model.

Now for a demonstration of Bosca Dubh, including its “circus” setting:

How does Bosca Dubh compare with the typical B/C accordions played in Irish traditional music? Below are the ranges of Bosca Dubh compared with the standard B/C layout of the popular Paolo Soprani boxes:

We can see several things. Bosca Dubh doesn’t go as low as the B/C, but it does go higher. The longest chromatic run of notes on the B/C is two on the press and four on the draw; but on Bosca Dubh it is ten on the draw and five on the press (in the middle of the range). Each roll on the B/C can only be performed in one direction of the bellows, but on Bosca Dubh many rolls can be performed in both directions. The only rolls Bosca Dubh cannot perform that the B/C can are E3 and F4. The B/C cannot perform rolls on C# or F#.

Another typical system in Irish traditional accordion is C#/D, which is just tuned a semitone higher than the B/C. Thus it shares much of the same characteristics:

As with the B/C, Bosca Dubh doesn’t go as low as the C#/D. As on the B/C, the longest chromatic run of notes on the C#/D is two on the press and four on the draw. Each roll on the C#/D can only be performed in one direction of the bellows, but on Bosca Dubh many rolls can be performed in both directions. Unlike the B/C system, the C#/D can perform rolls on C# and F#, but not on F natural.

A big advantage of Bosca Dubh and The Black Box over B/C and C#/D accordions is the expanded harmonic possibilities on the treble side. Many more note combinations are possible, which makes them versatile instruments for accompaniment. Here’s a table showing several of the chords that can be played on the treble side: “M” is major, “m” is minor, “M7” is major with raised 7th, and “dim” is minor with diminished fifth.


For a given root, the major 7 chord resolves to the IV, while the diminished chord resolves to the V. So in Irish traditional music, several of these wouldn’t be useful, e.g., C#dim, Adim, B7 and Bdim. More often, however, intervals of octaves, fifths and fourths are used on Irish accordion.

In conclusion, the expanded bass on the Bosca Dubh is the biggest and most useful change from The Black Box. The two additional buttons on the treble side aren’t really that useful. The cassotto on Bosca Dubh is very nice, as is the organ/melodeon sound. The circus setting is a fun unique one. If I were to look toward the next design, I might trade some of the high notes for the low ones available on B/C and C#/D, e.g., remove everything from the Eb6 up and add in the B2 to D3. This however would make the box have a sixth button start, shifting my hand position down one, which might actually be beneficial ergonomically.

Coming soon …

At left is The Black Box. It has been perfect! My design of the keyboard has worked out so well. The left hand side, however, is a bit limited. I really miss the F/F# that is on The Mean Green Machine Folk Machine (now sold to a new loving family). So, enter the box at right (so far unnamed). The treble side has two more buttons, but the bass side has four. It’s now in the shop being converted to my new design! Details coming soon…

An analysis of the 365 double jigs in O’Neill’s, pt. 10

This is part 10 of my live blogging analysis of the 365 double jigs in O’Neill’s 1001. In the last part, I revise and tune the procedure by which I extract time-pitch series from the collection, and then analyze several examples. The part before that reviews where I have been.

While reading O’Neill’s “Irish minstrels and musicians: with numerous dissertations on related subjects” (1918), I found the following quoted from “A History of Music in England” by English composer Ernest Walker (1907). I believe it really encapsulates implicit and explicit properties of Irish traditional music:

Few musicians have been found to question the assertion that Irish folk-music is, on the whole, the finest that exists; it ranges with wonderful ease over the whole gamut of human emotion from the cradle to the battlefield, and is unsurpassed in poetical and artistic charm. If musical composition meant nothing more than tunes sixteen bars long, Ireland could claim some of the very greatest composers that have ever lived; for in their miniature form the best Irish folk-tunes are gems of absolutely flawless lustre, and though of course some of them are relatively undistinctive, it is very rare to meet with one entirely lacking in character. (pg. 335)

I wonder if the convention of Irish tunes being sixteen bars long relates to physical limits of human memory, and the aural transmission of tunes?

Anyhow, in this part I investigate extracting a feature complementary to time-pitch series: one describing rhythmic aspects of a transcription. Let’s consider jig #201 (“Biddy’s wedding”):

Screen Shot 2020-03-27 at 18.08.18.png

We see notes of four different durations. From shortest to longest: semiquaver, quaver, dotted quaver, and crotchet. In the entire collection, there are notes of three other durations: triplet semiquaver, triplet quaver, and dotted crotchet. One way to describe the rhythm of any transcription in this collection is by expressing it as a sequence of encoded durations. Let’s try a simple one: 1 means a triplet semiquaver, 2 means a semiquaver, …, and 7 means a dotted crotchet. So the series describing “Biddy’s wedding” starts: 5, 2, 6, 4, 4, 4, 4, 4, 4, …

Even better would be an encoding that directly describes time. Dividing each quaver of a 6/8 measure into 6 segments (Fs = 6 segments/quaver) results in a triplet semiquaver lasting 2 segments, a semiquaver lasting 3, a triplet quaver lasting 4, a quaver lasting 6, a dotted quaver lasting 9, a crotchet lasting 12, and a dotted crotchet lasting 18. Hence, such a series extracted from “Biddy’s wedding” starts: 9, 3, 12, 6, 6, 6, 6, 6, 6, … In this way we can easily isolate measures, and accumulate the values of the series to create a series of onset times, i.e., for “Biddy’s wedding”: 0, 9, 12, 24, 30, 36, 42, 48, 54, 60, … Call this indexed series O.
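Here is a minimal sketch of that arithmetic. The names `SEGMENTS` and `onset_times`, and the note-name strings, are labels I made up for illustration; only the segment values and the opening durations come from the text above.

```python
from itertools import accumulate

FS = 6                   # segments per quaver, as in the text
QUAVERS_PER_MEASURE = 6  # a 6/8 measure

# Segments per note duration, using the values listed above.
SEGMENTS = {
    "triplet semiquaver": 2,
    "semiquaver": 3,
    "triplet quaver": 4,
    "quaver": 6,
    "dotted quaver": 9,
    "crotchet": 12,
    "dotted crotchet": 18,
}


def onset_times(durations):
    """Accumulate durations (in segments) into onset times, starting at 0."""
    # The last running total is where the final note ends, not an onset.
    return list(accumulate(durations, initial=0))[:-1]


# The opening durations of "Biddy's wedding" as listed in the text.
opening = ["dotted quaver", "semiquaver", "crotchet"] + ["quaver"] * 6
durations = [SEGMENTS[name] for name in opening]
onsets = onset_times(durations)

# Integer division by the measure length isolates the measures.
measures = [t // (FS * QUAVERS_PER_MEASURE) for t in onsets]
```

Because a full measure is 36 segments, the integer division in the last line gives the (zero-based) measure index of every onset.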

The problem with both of these approaches is that the length of a series is equal to the number of notes in a transcription. I want a feature that facilitates the comparison of transcriptions and their parts. This can be done by making all series the same length. Hence, I create the onset-time series of length Fs*6*8 = 288 according to the following:

o = 1_{O}(n) : n \in [1,288]

using the indicator function. So the onset-time series is just a series of 288 ones and zeros. For “Biddy’s wedding” the onset-time series looks like:


Each spike shows an onset. Those of the A part are shown in blue and orange, and those of the B part are shown in green and red. It’s hard to differentiate between the series, so let’s view these in an alternative way:


Series 1 and 2 are from part A and 3 and 4 are from part B. Time in each sequence is going along the x-axis, six steps for each quaver, six quavers for each measure, and 8 measures for each series. Series-time in each tune is going along the y-axis, where each part contributes two series since there are repetitions. There is a change in pixel value where an onset occurs.
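The indicator construction described above can be sketched as follows. This is only an illustration under one assumption of mine: I index the 288 steps from 0 so that an onset at time 0 lands on the first step. The function name `onset_indicator` and the example onsets are hypothetical.

```python
FS = 6                   # segments per quaver
QUAVERS_PER_MEASURE = 6  # 6/8 time
MEASURES = 8             # measures per series
LENGTH = FS * QUAVERS_PER_MEASURE * MEASURES  # 288 steps


def onset_indicator(onsets, length=LENGTH):
    """Return a list of 0/1 values with a 1 at every onset position."""
    onset_set = set(onsets)
    # Assumption: steps are indexed 0..length-1.
    return [1 if n in onset_set else 0 for n in range(length)]


# Hypothetical onset times (in segments) for the start of a series.
example_onsets = [0, 9, 12, 24, 30, 36, 42, 48]
o = onset_indicator(example_onsets)

# The first 36 steps cover the first measure.
first_measure = o[: FS * QUAVERS_PER_MEASURE]
```

Since every series has the same length, two series can be compared step by step no matter how many notes each contains.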

Let’s have a look at some others. Below is the onset-time series for jig #24 (“The maid at the well”):


This shows each part is built from two measures with the same rhythm: 10 quavers and a crotchet. Here’s the dots:

Screen Shot 2020-04-04 at 11.36.06.png

The onset-time series below shows a sequence of 22 quavers and a crotchet:


This is from jig #32 (“The basket of Turf”):

Screen Shot 2020-04-04 at 11.40.14.png

And here we see quavers all the way:


Those are the onset-time series of jig #125 (“Wasn’t she fond of me?”):

Screen Shot 2020-04-04 at 11.47.40.png

The pickup doesn’t appear in the onset-time series (or any of the other series) because it does not occur in any repetitions. Its existence is seen in the time-interval series, however (start of blue line):


Here’s a strange pattern of onset times:


That’s from jig #357 (“The Hibernian jig”). The dots show what is going on:

Screen Shot 2020-04-04 at 14.22.52.png

For some reason, O’Neill has explicitly notated an exaggerated jig rhythm. The transcription could also be notated with straight quavers and interpreted in the manner above.

How many jigs in this collection have such exaggerated jig rhythms? Only 14 of the 365 notate the dotted quaver semiquaver rhythm at least 8 times: #8, 76, 101, 148, 181, 201, 212, 222, 229, 256, 257, 294, 322, and 357. Jig #101 (“The idle road”) is one of these I looked at in part 5, which is played by Joe Burke with an unbroken rhythm.
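A count like this can be sketched on the duration series in segments, where a dotted quaver (9) followed by a semiquaver (3) marks one broken pair. The function names and the threshold argument are hypothetical, and this is not necessarily how the count above was produced.

```python
def count_broken_pairs(durations, long_steps=9, short_steps=3):
    """Count adjacent (dotted quaver, semiquaver) pairs in a duration series."""
    pairs = zip(durations, durations[1:])
    return sum(1 for a, b in pairs if (a, b) == (long_steps, short_steps))


def has_exaggerated_rhythm(durations, threshold=8):
    """True if the broken rhythm is notated at least `threshold` times."""
    return count_broken_pairs(durations) >= threshold


# Hypothetical duration series: one broken pair, then a run of quavers.
example = [9, 3, 12, 6, 6, 6, 6, 6, 6]
n_pairs = count_broken_pairs(example)
```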

Here’s another interesting one:


This is of jig #95 (“The sheep on the mountains”). This is the only jig in the collection with a structure different from all the others: ABAB, where each part is 16 measures long:

Screen Shot 2020-03-14 at 15.05.12.png

Here’s another, of jig #200 (“Daniel of the sun”):


The B part of this tune looks to be more syncopated than the other two parts. The transcription shows plenty of broken rhythms, including Scotch snaps, in these parts:

Screen Shot 2020-04-10 at 13.36.29.png

Below are the onset-time series for jig #226 (“Tim Hogan’s jig”):


It appears that the parts become more and more dense with notes. Here’s the transcription:

Screen Shot 2020-04-10 at 19.35.54.png

We see the crotchets of the A part are notated as trilled, or rolled. The B part features mostly quavers. Then the C part has semiquaver passing tones.

Here’s another unusual one:


That is for jig #361 (“The Drogheda weavers”). There appears to be a flourish of notes starting each four measure section of the B part. Here’s the transcription showing what is happening in that part.

Screen Shot 2020-04-10 at 19.40.52.png

By way of summary, let’s look at the ways in which we can describe the characteristics of a given transcription from O’Neill’s 1001. Let’s consider one of my favorite jigs, “Scatter the Mud” (#187). Here’s the ABC notation:

d|eAA B>(cB/A/)|eAA ABd|eAA B>(cB/A/)|dBG GBd|
eAA B>(cB/A/)|eAA AGE|GAB Bge|dBA A2:|
|:d|eaa egg|dBA ABd|eaa egg|dBG GBd|
ea^f ({a}g2)e|dBA AGE|GAB Bge|dBA A2:|

And here’s the dots:

Screen Shot 2020-04-12 at 09.08.49.png

Here’s the onset time series of “Scatter the Mud”:


Here’s the time-pitch series:


As is convention, the B part goes higher in pitch than the A part. And the two parts echo each other in the middle and end. Here’s the time-interval series:


And finally, here’s the circular autocorrelation of the time-interval series:


We see the A part has high self-similarity at lags of one measure, and the greatest value at four measures (other than zero lag). The B part has more self-similarity with a lag of two measures.
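For readers who want to try this, here is a minimal sketch of a circular autocorrelation in plain Python. Removing the mean first and leaving the result unnormalized are my assumptions, and the toy series is made up; only the step resolution (36 steps per measure) follows the text.

```python
STEPS_PER_MEASURE = 36  # 6 quavers x 6 steps per quaver


def circular_autocorrelation(x):
    """Correlate a series with every circular shift of itself."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]  # assumption: remove the mean first
    return [
        sum(centered[i] * centered[(i + lag) % n] for i in range(n))
        for lag in range(n)
    ]


def value_at_measure_lag(acf, measures):
    """Read the autocorrelation at a lag given in whole measures."""
    return acf[(measures * STEPS_PER_MEASURE) % len(acf)]


# Toy interval series: the same one-measure pattern repeated 8 times.
measure = [0] * 12 + [2] * 12 + [-3] * 12
series = measure * 8
acf = circular_autocorrelation(series)
```

A series that repeats exactly every measure has the same value at a lag of one measure as at zero lag, which is the kind of peak described above.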

In the next part, I think I will look at clustering transcriptions according to their onset time series.

Seán Ó Riada on the Accordion in Irish Traditional Music

Seán Ó Riada is one of the most important Irish composers of the 20th century, and a key figure in the revival of Irish traditional music. In 1960, he assembled a group of traditional Irish musicians, named “Ceoltóirí Chualann”, to present traditional music in a classical music concert setting. They gave several influential concerts, and the group is considered a precursor to one of the greatest modern Irish music groups, The Chieftains, who have had 18 Grammy Award nominations.

In 1963, Ó Riada recorded a series for Raidió Éireann called “Our Musical Heritage”, in which he introduces and discusses Irish traditional music and its elements. In one of these he discussed the button accordion. I can’t find any transcription of his commentary online, but I love it so much I will transcribe it here.

Ó Riada prefaces his commentary with the following:

First of all, it needs to be emphasized over and over again, that Irish traditional instrumental music is a very close relation of Irish vocal music; that is, sean-nós [old-style] singing. The instruments which suit Irish music best are therefore those that most closely approach the personal expression of the human voice.

The fiddle is ideal. The player is in contact – in complete contact – with his instrument. The notes do not exist until he makes them; and his tone is a completely individual thing, differing from another fiddle player’s tone as much as one voice differs from another. This is also true to varying extents of the uilleann pipes, the flute, and the whistle.

Irish music is entirely a matter of solo expression, and not of group activity. It is the direct expression of the individual musician or singer. It is again very much a matter of personality. Whether that personality exists or not outside the music. That is to say, a singer, a piper, or a fiddler may be quite an unpleasant person when not performing but when performing it is his music personality which counts, which impresses us – the direct expression of his musical personality. Everything that comes in the way of that direct expression beclouds and confuses it.

Now, the most direct means of expression in music is the human voice. Next, in varying degrees, as I said, come the uilleann pipes, fiddle, flute, and whistle. In each of these the player makes the notes himself. He is in control. The notes do not exist until he makes them. The fiddle player and the piper make the notes with their hands. The flute player and whistle player, with their mouth and hands. They are at all times directly in contact with the actual notes they make. And as a result, they are the masters of the notes. They control them. Varying their loudness and their softness. Their tone quality, and even their intonation.

Then Ó Riada is ready to render his judgement:

This, the accordion player cannot do. He does not make the notes – they are already there before him. Ready to sound at the pressing of a button, produced in an almost entirely mechanical fashion. Thus, he has not the control over his instrument that the others have. He has only to press a button and pull or push the bellows and the note sounds for him. The tone and even the intonation have already been decided for him by the maker. Because of this, individual musical expression becomes extremely difficult, if not impossible for him. For this reason, if not for any other, the use of the accordion as a solo instrument in Irish traditional music is to be greatly deplored.

Most accordion players are so hampered by their choice of instrument as to be unable to produce anything but a faint, wheezy imitation of what Irish music should be. And the most unfortunate part of it is, that this instrument, designed by foreigners for the use of peasants who had neither the time, inclination or application to learn a more worthy instrument – this instrument is not just losing favor, but gaining vast popularity throughout the country. The reason for this is mainly, I think, the laziness which afflicts us as a nation at the moment.

We would all like to be musicians, but we don’t want to take the trouble. It is easier to play notes which are already made for us, than to make our own notes. Accordions, bigger and better accordions, and eventually the greatest abomination of all – the piano accordion – nothing could be farther from the spirit of Irish traditional music.

However, I’m afraid this has been a rather long digression. As I said, very few accordion players in this country can surmount the difficulties inherent in their instrument. Most feel on the other hand that something must be done to enable them to produce more expression on the accordion. As this can’t be done by means of varying the tone, and so forth, they have turned to the one thing which it is possible to exploit, namely ornamentation. And it is precisely with regard to ornamentation that accordion players have committed their greatest crimes. In recent years, a technique and style of chromatic ornamentation, utterly alien from the spirit of Irish music, has grown up.

But before I describe it, let me mention briefly the two basic principles of ornamentation. And incidentally, I did not invent these principles. These principles are based on practice – the practice of the best players under the best circumstances. They are not invented principles, they are merely observed principles.

And the first is: generally speaking, no ornament should go outside the mode of the song or tune in which it occurs. And the second is: no ornament should, by its position, draw attention to an irrelevant note in the phrase in which it occurs. As by doing so it destroys the basic shape of the phrase.

At this point, Ó Riada uses the piano to illustrate permissible and impermissible ornamentation. He then caricatures the chromatic ornamentation he was hearing performed by the very influential Irish accordion players of the time, i.e., Paddy O’Brien and Joe Burke (though he does not name names). These players “throw in as many semitones” as they can. Eventually, Ó Riada renders a simple tonal phrase in the key of G into an unrecognizable chromatic mess [QED]. He continues:

The worst feature of it, to my mind, is not so much the incidental semitones, as is the dreadful habit they’ve got of using the downward semitone-inflected mordent, where you begin on a note, go to the semitone below and back to the note. Funnily enough, it is far more common than the upward-inflected mordent, where you begin on the note and go to the next note above it.

So the main downfall of the present day accordion players is the downward-semitone inflected mordent. This kind of thing is of course complete and utter rubbish; and it is up to the musical public to make their disapproval felt.

As I said, there are very few accordion players in this country who can sufficiently overcome the disabilities and limitations of their instrument. So as to make what they play sound like Irish music. But one of these few players is Sonny Brogan of Dublin. He is a man who understands the limitations of his instrument, but who strives to counteract these not in a mishmash of wrongly placed ornamentation, but by emphasizing the most traditional elements in the tunes he plays. His ornamentation is simple, usually confined to the single cut, or grace note, and the roll.

Ó Riada then plays recordings of Brogan playing the reels “Repeal of the Union”, “The hut in the bog” and “Gordon’s reel”, and finally the jig “Morrison’s”. He highlights Brogan’s use of variation.

To sum up then, the accordion has been played in this country – the two row button accordion, that is – for upwards of 40 years. And I’m afraid that it has come to stay. However, while I have emphasized its unsuitability for solo playing, it can be a most useful instrument in a band – something about which I am going to talk next week. As a proverb says, it’s an ill wind. If only most Irish accordion players would try to fit in with the tradition instead of flying in the face of it, something would be achieved.

And one last word about the accordion: I wish, and indeed I wish again, that all Irish accordion players would drown, muffle, destroy, subdue or in some other fashion, silence the bass of their instrument. I haven’t yet heard an accordion player who knew the right bass to play, and it’s far better to play no bass anyway. It only interferes with the tune and confuses it.

Ó Riada continues his programme by talking about the concertina, which he finds to be superior to the accordion for Irish traditional music (e.g., “it’s not one tenth as unwieldy as the accordion”), and laments its decline.

One repercussion of my research in applying AI to model transcriptions of Irish traditional dance music is that I have become a dedicated student of Irish accordion. But I take no offense to any of Ó Riada’s verdicts and criticisms. Some of them are clearly laughable, such as peasants too busy to learn a “more worthy” instrument, and his nation “afflicted” with laziness. Some are uncomfortably nationalistic, such as those instrument-making foreigners. Some are contradictory, such as when he lauds the concertina over the accordion while overlooking that concertinas and accordions were being made by the same foreigners, and that the concertina involves the exact same mechanics as the accordion. And some are curiously unfair, such as overlooking the great expression that can be accomplished with the bellows. At least the accordion can produce dynamics like the human voice, which is not possible on the uilleann pipes – a more “worthy” instrument for Ó Riada. I am however persuaded by his opinion on some approaches to playing bass on the accordion. I think sparse is the best approach, and only if it fits harmonically.

Ó Riada’s main argument with the fashion of accordion playing at his time is focused on music theory: the “great crime” of downward semitone-inflected mordents. Therein lies Ó Riada’s great crime: using a music theory that is in and of itself foreign to Irish traditional music to castigate contemporary practices of Irish traditional musicians.

I see Ó Riada’s programme on the accordion as a wonderful time capsule from just before Irish traditional music began its transformation into a major economic resource for Ireland – something that is due in large part to Ó Riada. The accordion would soon become a principal instrument of Irish traditional music. Controversy around the accordion would be replaced with controversy around the guitar and the bodhran, group playing, and eventually commercialization – the latter of which was as vigorously denounced by more modern “gate keepers” as Ó Riada denounces the accordion, e.g., Tony MacMahon in his wonderful 1996 essay, “The Language of Passion”.

An analysis of the 365 double jigs in O’Neill’s, pt. 9

This is part 9 of my live blogging analysis of the 365 double jigs in O’Neill’s 1001. The last part reviews where I have been. In this part I look at the time-pitch series of the collection. I create these series by single nearest neighbor regression on tuples of pitch and time observations extracted from a transcription. As an example, here is jig #201 (“Biddy’s wedding”):

Screen Shot 2020-03-27 at 18.08.18.png

Its four time-pitch series appear like so:

This feature has a clear relationship to the transcription because it shows which pitch occurs at what time over 8-measure segments. I can extract a time-interval series from these series by moving stepwise along time and finding and holding subsequent differences. The time-interval series for “Biddy’s wedding” appears like so:


The step of 5 semitones for series 2-4 comes from the G pitch at the end of each line. I am making the first interval of the first series always be zero.
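The two steps above (nearest neighbor regression onto a time grid, then holding pitch differences) can be sketched roughly as below. This is a simplified illustration, not my actual extraction code: I assume each note contributes an observation at its first and last step, I treat a single series in isolation, and the notes and function names are made up.

```python
def nearest_neighbor_series(observations, length):
    """1-nearest-neighbor regression of pitch onto an integer time grid."""
    series = []
    for n in range(length):
        # Take the pitch of the observation closest in time to step n.
        _, pitch = min(observations, key=lambda obs: abs(obs[0] - n))
        series.append(pitch)
    return series


def time_interval_series(pitches):
    """Hold the latest pitch change at every step; the first interval is zero."""
    if not pitches:
        return []
    intervals = []
    held = 0
    previous = pitches[0]
    for p in pitches:
        if p != previous:  # repeated pitches count as one
            held = p - previous
            previous = p
        intervals.append(held)
    return intervals


# Hypothetical notes as (MIDI pitch, duration in steps).
notes = [(67, 9), (69, 3), (71, 12)]

# Assumption: one observation at the first and last step of each note.
observations = []
start = 0
for pitch, steps in notes:
    observations += [(start, pitch), (start + steps - 1, pitch)]
    start += steps

pitch_series = nearest_neighbor_series(observations, start)
interval_series = time_interval_series(pitch_series)
```

With observations at both ends of each note, every grid step is closer to an observation of its own note than to any other, so the regression produces a step function of pitch over time.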

Here is the transcription I found at the center of a multidimensional scaling of the collection of transcriptions, jig #134 (“Young Tim Murphy”):

Screen Shot 2020-03-13 at 16.01.47.png

And here is its time-pitch series:


The two parts appear quite different save for the last two measures. Here is the time-interval series I extract from this:


In terms of intervals, we see the two parts are similar in measure 4 as well.

One difference between the two jigs above is the anacrusis. This in effect shifts to the right each series of #134 with respect to those of #201. If I am comparing only the series extracted from one transcription, there’s no problem since they all have the same shift. But if I want to compare series across transcriptions, some with an anacrusis and others without, I need to account for the shifts, i.e., align the measures. This will be important to consider when looking at tunes as sequences of measures.

The music21 library provides an easy way to detect an anacrusis, so I have rewritten my feature extraction code such that all series are aligned by measure. Let’s continue looking at the time-pitch series of the collection.
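As a rough illustration of what aligning by measure means, here is a sketch that does not use music21 at all. It assumes the length of the pickup in steps is already known (`anacrusis_steps` is a hypothetical parameter), and dropping the pickup is just one possible way to align, which may differ from what my rewritten code does.

```python
STEPS_PER_MEASURE = 36  # 6 quavers x 6 steps per quaver


def align_to_measures(series, anacrusis_steps):
    """Drop a leading pickup so index 0 falls on the first full downbeat."""
    return series[anacrusis_steps:]


def split_measures(series, steps=STEPS_PER_MEASURE):
    """Cut an aligned series into consecutive measure-length chunks."""
    return [series[i:i + steps] for i in range(0, len(series), steps)]


# Hypothetical series: a one-quaver pickup (6 steps), then two full measures.
raw = [62] * 6 + [64] * 36 + [60] * 36
aligned = align_to_measures(raw, anacrusis_steps=6)
bars = split_measures(aligned)
```

After this shift, measure k of one transcription lines up with measure k of any other, whether or not either began with an anacrusis.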

The dots of jig #17 (“The eavesdropper”) are:

Screen Shot 2020-03-27 at 12.39.51.png

and results in the following time-pitch series:


Note that middle C is pitch 60, but I have transposed all jigs in this collection to have a root of C. Here is the time-interval series I extract from the time-pitch series:

One major difference in extracting the time-interval series from the time-pitch series, compared with how I was doing it before, is that this new approach considers repeated pitches as one. So the run of B quavers in the first measure is grouped together in an interval of 4 semitones over 3 quavers. I think this is preferable from the standpoint of considering melody. Playing 3 quavers in place of a dotted crotchet does not change the melody other than its rhythmic characteristic.

(This motivates extracting a “time-duration” series from a transcription to describe its rhythmic characteristics. Instead of looking at what pitch is playing when, look at what duration is playing when. Ignoring graces, rolls, and trills, the collection has only pitches of seven durations. From shortest to longest these are triplet semiquaver, semiquaver, triplet quaver, quaver, dotted quaver, crotchet, and dotted crotchet. I will explore this additional feature at a later time… but keep in mind that the features I am extracting are not exemplary of how these tunes are experienced in performance. These are just the bones of the tune as it was in someone’s hand in the early 20th century, without any meat, flesh or movement.)

In the time-pitch series for “The eavesdropper”, we also see how its B part departs from the A part by going higher in pitch, and then descends back to join it. A typical feature of two-part jigs in this collection is that the B part sits above the A part in pitch. To get an idea of how typical it is, let us sum the set of differences between time-pitch series 3 and 1, and of 4 and 2 for each two-part jig in the collection (N=291), and make a histogram of them:

Screen Shot 2020-04-02 at 12.03.28.png

A positive difference means part B of a tune spends more time at pitches higher than part A. I find 268 of the 291 two-part jigs (>92%) have a positive difference. The two-part jig that has the largest difference is #190 (“O’Mahony’s frolics”):

Screen Shot 2020-04-02 at 12.18.09.png

Here are its time-pitch series:

Notice how the first ending of the B part stays high, and the second ending takes the melody down back home.
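The B-minus-A difference behind the histogram above can be sketched as follows. Adding the two sums into a single number per jig is my reading of the description, and the toy series and function name are made up for illustration.

```python
def part_difference(series):
    """Sum the stepwise pitch differences of part B over part A."""
    a1, a2, b1, b2 = series  # series 1 and 2 are part A, 3 and 4 are part B
    first = sum(b - a for a, b in zip(a1, b1))   # series 3 minus series 1
    second = sum(b - a for a, b in zip(a2, b2))  # series 4 minus series 2
    return first + second


# Toy two-part tune: part B sits a fifth (7 semitones) above part A.
toy = [
    [60, 62, 64],
    [60, 62, 60],
    [67, 69, 71],
    [67, 69, 67],
]
diff = part_difference(toy)

# Share of two-part jigs with a positive difference, from the counts above.
share_positive = 268 / 291
```

A positive value means part B spends more time above part A than below it.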

Of the 23 two-part jigs with a negative difference, the most negative one is #57 (“The blazing turf fire”):

Screen Shot 2020-04-02 at 12.20.04.png

Here are its time-pitch series:


What happens in jigs with more than two parts? Here’s the time-pitch series of the four-part jig #286 (“Strop the razor (2nd setting)”):


We see the melody goes highest in the penultimate part (series 5 and 6). We see the same in the three-part jig #320 (“The piper’s welcome”):

This is not the case in the three-part jig #344 (“The stolen purse”):


Another interesting feature I see in some tunes is contrary motion of the parts, e.g., jig #223 (“The rambler from Clare”):

Screen Shot 2020-04-02 at 13.52.29.png

The time-pitch series show this “mirror image” effect:


This is probably not an accidental feature, but done consciously or planned in composition. Jig #237 (“The Fardown farmer”) has the same kind of construction:


Here are its dots:

Screen Shot 2020-04-02 at 14.43.42.png

The A part of this jig and the A part of “The rambler from Clare” are so similar it makes me wonder if the Fardown farmer was that rambler from Clare.

Other tunes have similar intervalic motion in their parts. Here’s jig #249 (“The flitch of Bacon”):

Screen Shot 2020-04-02 at 14.51.22.png

And here’s the corresponding time-pitch series:

This also shows how I disregard rests in my extraction of the time-pitch series, just extending the duration of the pitch preceding it.






An analysis of the 365 double jigs in O’Neill’s, pt. 8

This is part 8 of my live blogging analysis of the 365 double jigs in O’Neill’s 1001. It’s time for a breather. Let’s have a review!

  1. Part 1 discusses O’Neill’s collection of jigs, and how I have normalized the transcriptions expressed with ABC notation. I use the normalized Damerau-Levenshtein distance (DL distance) to compare the transcriptions as strings, which locates some “duplicates” and variations, as well as several errors in the transcriptions. I find that the normalized DL distance provides sensible results.
  2. Part 2 looks at the similarity matrix created from the normalized DL distance between all pairs of transcriptions. I analyze some of the pairs that have very large distances. I also perform some multidimensional scaling of the collection with the similarity matrix and look at the transcriptions that are at the center of the cluster. Finally, I observe that applying string edit distances to ABC notation is musically naive, e.g., “DEFG2G” in C major and “DEFG2G” in C minor are different.
  3. Part 3 reduces the transcriptions to sequences of measure tokens and looks at the different measure structures present in the collection. This uncovers more errors in the transcriptions, and leads to further normalization of the collection. Performing multidimensional scaling on the reduced sequences creates sensible clusters.
  4. Part 4 converts each transcription into “time-interval series”, which describes the intervalic “profile” of the melody. I explore other series derived from this representation by integration, circular autocorrelation, and marginalization (integrating out time). It is clear that the transcriptions in this collection have a well-defined structure having sections of eight measures, which motivates comparisons of features extracted from these sections, and smaller subsections of 1, 2 and 4 measures.
  5. Part 5 inspects several 8-measure time-interval series in the collection, and gives a broad sense of the intervalic structures of the collection. I also find more transcription errors. I look at transcriptions with time-interval series that have specific statistical characteristics. I also look at the collection as a whole and find some interesting trends, e.g., time spent at pitches arrived at by a perfect fourth up is longer than vice versa.
  6. Part 6 looks at clustering the 1,712 8-measure time-interval series of the collection. I analyze the centroids, and the distributions of distances to these. I transform some centroids to transcription sequences, which do not resemble any of the tunes in the collection. I also begin to inspect the circular autocorrelation of the time-interval series, which I believe are more indicative of the melodic structure in a transcription, e.g., revealing repetitions within a series.
  7. Part 7 looks at clustering the 1,712 circular autocorrelations of the 8-measure time-interval series. I analyze the centroids, which make more musical sense to me than the centroids created from the time-interval series. The structure of a melody is more apparent in these representations, but there are some details that need to be worked out.
  8. Part 8 reviews where we have been, and some questions that remain open. I also look at the sensitivity of a time-pitch series to subtle transformations of the originating transcription.
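As a reminder of what the string comparison in parts 1 and 2 involves, here is a minimal sketch of a normalized Damerau-Levenshtein distance. It is not necessarily the variant used in part 1. It assumes the restricted form (optimal string alignment, where adjacent characters may be swapped once) and assumes normalization by the length of the longer string. The function names are hypothetical.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def normalized_dl(a: str, b: str) -> float:
    """Distance divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return osa_distance(a, b) / longest if longest else 0.0


print(normalized_dl("DEFG2G", "DEGF2G"))  # one swap of adjacent tokens
print(normalized_dl("DEFG2G", "DEFG2G"))  # identical strings
```

The second call also illustrates the naivety noted in part 2: the two strings are identical, so the distance is zero whether they are read in C major or C minor.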

I have a growing list of open questions:

  1. A multidimensional scaling of the transcriptions according to their normalized DL distances places a few transcriptions closest to the center of the cluster: jigs #134 (“Young Tim Murphy”) and #296 (“Barney O’Neill”). How stable is that position? What is the significance of those transcriptions in that position? What does that position mean musically speaking, if anything? (Perhaps this is not worth investigating given the lack of musical meaning of a string edit distance between ABC transcriptions.)
  2. A number of features have been proposed that express the musical content of a transcription in eight-measure sections: 1) (mean-centered) time-interval series; 2)  (normalized) circular autocorrelation of time-interval series; 3) integral of time-interval series; 4) time-marginalization of time-interval series; 5) histogram of time-marginalization of time-interval series. What about expressing the normalized melody (transposed to root C) as a time-pitch series? What is the musical significance of each of these features?
  3. K-means clustering of the circular autocorrelation of time-interval series shows some sensible results, e.g., finding eight-measure series that are structurally similar. What changes when we perform K-means clustering on normalized circular autocorrelations (that is, dividing each by the value at zero lag)?
  4. If we break the time-interval series into units of one-measure duration, how many unique units are there? How do they relate? Are there “prototype” measures? Might we see each eight-measure series as a concatenation of these “codebook” units?
  5. My explorations so far show how we can analyze a collection of transcriptions. Can these approaches be used to compare two collections of transcriptions? Say,  O’Neill’s collection with another collection of supposed jigs, say computer-generated, hmm? Hmmmm?

Near the conclusion of the last part, I noticed something that needs more thought. Let’s look at jig #201 (“Biddy’s Wedding”):

Screen Shot 2020-03-27 at 18.08.18.png

This is a very simple tune. Harmonically both parts are: I-I-I-V-I-I-IV-V. The A part is built from two-measure bits like so: abac. The B part is just a variation: a’b’a’c. “Filling in” crotchet-quaver pairs with passing tones or chord tones, or removing those, does not change the melody. But the time-interval series show these as major changes:

The c part in measures 7&8 is clearly identical. The b and b’ parts appear quite close as well, except for the big long-duration jump of 7 semitones in b’. However, the relationship between the a and a’ parts is not clear. Performing a correlation of these parts of the series would involve multiplication by a string of zeros, which would reduce its value.

The circular autocorrelation of the time-interval series of this tune suggests that its two parts are not closely related:

From looking at the transcription, I expect both parts of this tune to produce large peaks at a lag of 2 and 4 measures, which we see. But the half-measure peaks in the B part (lines 3&4) are curious, as are the small peaks for the A part at some other fraction of  a measure.
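The expectation of peaks at lags of 2 and 4 measures can be illustrated on a synthetic series. The sketch below is not the code behind the plots. It assumes 6 samples per quaver (36 per measure), mean-centers the series, scales by the series length, and keeps the lags from zero to half the series length, which gives 145 values for a 288-sample series. The scaling, the function name and the made-up two-measure "bit" are assumptions for illustration.

```python
SAMPLES_PER_MEASURE = 36  # 6 quavers per measure x 6 samples per quaver


def circular_autocorrelation(series, mean_center=True):
    """Circular autocorrelation at lags 0..len(series)//2 (assumed scaling: 1/N)."""
    n = len(series)
    mean = sum(series) / n if mean_center else 0.0
    x = [v - mean for v in series]
    return [
        sum(x[t] * x[(t + lag) % n] for t in range(n)) / n  # wrap around the end
        for lag in range(n // 2 + 1)
    ]


# Hypothetical two-measure bit of intervals, repeated four times (8 measures).
bit = [2] * 12 + [-2] * 12 + [0] * 12 + [5] * 18 + [-5] * 18
series = bit * 4
ac = circular_autocorrelation(series)

for measures in (0.5, 1, 2, 4):
    lag = int(measures * SAMPLES_PER_MEASURE)
    print(f"lag of {measures} measures: {ac[lag]:.2f}")
```

Because this synthetic series repeats exactly every two measures, its values at lags of 2 and 4 measures equal the value at zero lag. A real part with varied repeats would show smaller peaks there.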

Let’s do an experiment to see how robust these features are. I will slightly modify the transcription as below and recompute the time-interval series and its circular autocorrelation:

Screen Shot 2020-04-01 at 12.27.21.png

I have added an anacrusis to each part, and have filled in the crotchet-quaver pairs. Here’s the time-interval series for these parts:

Screen Shot 2020-04-01 at 12.29.37.png

The circular autocorrelations of these are:

Screen Shot 2020-04-01 at 12.31.23.png

The differences with the original features do not appear to be that great, which is a good sign. I still see that curious structure in the A part.

If we make the arpeggiation of I in measures 2&6 of the A part go downward like so:

Screen Shot 2020-04-01 at 12.39.33.png

the circular autocorrelations of the time-interval series become more similar:

Screen Shot 2020-04-01 at 12.40.46.png

I don’t think such a minor change to the transcription should result in a major change of high-level features extracted from it. This points to the fact that the time-interval series are too detailed to make meaningful comparisons of melodic structure.

I think I have to return to basics and look at representing the melody as a time-pitch series, and how this might be transformed into a feature that more clearly expresses  structure.

An analysis of the 365 double jigs in O’Neill’s, pt. 7

This is part 7 of my live blogging analysis of the 365 double jigs in O’Neill’s 1001. Part 1 is here, part 2 is here, part 3 is here, part 4 is here, part 5 is here, and here is part 6.

Now let us look at the results of K-means clustering of the circular autocorrelations of the 1,712 time-interval series. I start with a single cluster and look at the centroid and the distribution of distances to it. Here is the 145-dimensional centroid:

Screen Shot 2020-03-27 at 17.00.11.png

That looks pretty good. The high value at zero lag suggests this is a sequence with some large time-intervals. The peak at a lag of four suggests that half of the series strongly resembles the other half. The peak at two suggests that the series is built from a two-measure bit. And so on. Let’s look at the distribution of Euclidean distances to this centroid:

Screen Shot 2020-03-27 at 17.08.58.png

The median of this distribution is around a distance of 104. The largest Euclidean distance we see is about 996, and the smallest is 65. The series furthest from this centroid is in jig #257 (“The Morgan Rattler”), which we keep seeing is a very unique jig in this collection. The series closest to this centroid comes from jig #155 (“Jackson’s rambles”). Here’s its circular autocorrelation:


It looks like part A of this tune contributes the matching series. The dots below show this part has a four-measure structure, and some repetition of intervals at the two-measure level:

Screen Shot 2020-03-27 at 17.19.28.png

Here’re the centroids coming from K-means with two clusters:

Screen Shot 2020-03-27 at 17.29.12.png

And here are the distance distributions:

Screen Shot 2020-03-27 at 17.33.05.png

There are 1479 series in cluster 2, but only 233 in cluster 1. The Euclidean distance between the two centroids is about 200.

Let’s try four clusters. Here are the centroids (x-offset is just for display):

Screen Shot 2020-03-27 at 17.38.49.png

Now we can see centroid 1 (population is 209) has to do with time-interval series with similarities at the half-measure level, centroid 4 (pop. = 98) has to do with time-interval series with similarities at the measure level, centroid 2 (pop. = 202) has to do with time-interval series with similarities at the two-measure level, and centroid 3 (pop. = 1102) is perhaps something to do with similarities at the four-measure level.

Here are eight centroids:

Screen Shot 2020-03-27 at 17.45.40.png

And here are the distances within each cluster:

Screen Shot 2020-03-27 at 17.46.56.png

Cluster 8 is the most populated, with 697 series; but cluster 6 has only 4. Let me guess: those come from “The Morgan Rattler”… Indeed, I see series from #257. But also #154 (“The Antrim lasses”):

Screen Shot 2020-03-27 at 17.51.40.png

Here’s the autocorrelation of its time-interval series:


The B part of this tune shows the same structure we see in centroid 6.

There are 48 series in the cluster described by centroid 5 coming from 22 jigs: #6, 18, 23, 30, 56, 71, 82, 117, 125, 126, 127, 172, 178, 183, 186, 201, 204, 258, 274, 287, 291, 343. These should have sequences with repetition at the half measure. Let’s look at two. Jig #18 (“Saddle the pony”):

Screen Shot 2020-03-27 at 18.07.30.png

and jig #201 (“Biddy’s wedding”):

Screen Shot 2020-03-27 at 18.08.18.png

Looking at the autocorrelation of their time-interval series shows their similarity in this domain (the first is “Saddle the pony”):



Even the other two parts look related! So time-intervalically speaking, we can see why these sections would be grouped together. However, the melodies of these jigs are not very similar.

I have searched the web for people playing these tunes, but there appear to be none! All the performances I can find of “Saddle the pony” are actually the jigs “The Priest’s Leap” (#59) and “The Draught of Ale” (#156) in O’Neill’s 1001 (identical tunes). And “Biddy’s wedding” doesn’t appear to have been recorded anywhere. So learn to play them I have:

Here’s one time through “Saddle the pony” in O’Neill’s 1001 on The Black Box:

Here’s one time through “Biddy’s wedding” from O’Neill’s 1001 (but played in G):

And now I find something curious! “Saddle the pony” appears in O’Neill’s 1850 as two settings, both in A major. The second setting is the one appearing in O’Neill’s 1001, but with a dropped seventh (A mixolydian):

Screen Shot 2020-03-31 at 11.04.27.png

Why didn’t O’Neill include both settings in his 1001? And where did the G sharp go? I do think the flattened seventh sounds more Irish.

Update: 20200402

My teacher Paudie O’Connor says the G sharps might occur in Donegal, but that the 1001 version plays well as written. There is a four-part jig called “Langstrom’s Pony” that has as its first two parts this version. Here’s De Danann playing the tune:

An analysis of the 365 double jigs in O’Neill’s, pt. 6

This is part 6 of my live blogging analysis of the 365 double jigs in O’Neill’s 1001. Part 1 is here, part 2 is here, part 3 is here, part 4 is here, and part 5 is here.

Let’s do some clustering of the time-interval profiles in the dataset. I compute these profiles with a sampling rate of 6 samples per quaver, over 8-measure sections, which gives them length 6*6*8 = 288. Ima take all 1,712 series in \{-17,\ldots,21\}^{288} and cluster them. All of the time-interval series have this range (and they are all integers since the semitone is the smallest division of the octave in equal temperament). The jump up of 21 semitones (an octave and a major sixth) occurs only in jig #330 (“The queen of the fair”):

Screen Shot 2020-03-26 at 21.16.48.png

The largest leap down of -17 semitones occurs in three jigs, one of which is #36 (“Father Dollard’s favorite”):

Screen Shot 2020-03-26 at 21.18.07.png

In fact, 362 of the jigs in this collection of 365 have a section where all intervals lie within [-12,12]. The only three jigs that don’t are #36 above, #13 (“The humors of Bantry”):

Screen Shot 2020-03-26 at 21.45.16.png

and jig #355 (“The lasses of Dunse”):
Screen Shot 2020-03-26 at 21.45.47.png

If we were to treat these series as one cluster in \{-17,\ldots,21\}^{288}, which one lies closest to the centroid? Here’s what the centroid looks like (after projecting it to \{-17,\ldots,21\}^{288} by rounding each dimension):

Screen Shot 2020-03-26 at 21.57.31.png

This would turn into a rather boring melody, but it is interesting to note that the beginning of the series consists of ascending intervals, and the conclusion is descending intervals to unison.

What is the distribution of distances between all series to this centroid? Below is a histogram of the Manhattan distances of all the series to this centroid:

Screen Shot 2020-03-26 at 22.29.20.png

The numbers are so large because we are computing differences between intervals six times for every quaver, and there are 48 quavers in an 8-measure section. The series closest to this centroid, with a distance of 395 semitones, is in jig #97 (“The straw seat”):

Screen Shot 2020-03-26 at 22.35.18.png

97.png

I can see the resemblance of the first two sections to the centroid.

The jig with the section furthest away is jig #257 (“The Morgan Rattler”) which we saw last time has the largest variance in its time-interval series.
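The single-cluster analysis above amounts to averaging the series, rounding the average back onto the integer grid, and ranking the series by Manhattan distance. Here is a minimal sketch of those three steps on toy data. It is not the original analysis code; the function names are hypothetical, and measuring distances to the rounded centroid (rather than the unrounded one) is an assumption.

```python
def centroid(series_list):
    """Element-wise mean of equal-length series."""
    return [sum(column) / len(series_list) for column in zip(*series_list)]


def project_to_grid(values, low=-17, high=21):
    """Round each dimension and clip to {low, ..., high}.

    Note: Python's round() sends exact halves to the nearest even integer.
    """
    return [min(high, max(low, round(v))) for v in values]


def manhattan(a, b):
    """Sum of absolute differences, in semitones summed over samples."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_and_furthest(series_list):
    """Indices of the series closest to and furthest from the rounded centroid."""
    center = project_to_grid(centroid(series_list))
    distances = [manhattan(s, center) for s in series_list]
    nearest = min(range(len(distances)), key=distances.__getitem__)
    furthest = max(range(len(distances)), key=distances.__getitem__)
    return nearest, furthest, distances


# Three toy "series" of three samples each, standing in for 288-sample series.
toy = [[0, 2, 5], [0, 2, 0], [0, 2, 1]]
print(nearest_and_furthest(toy))
```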

What if we perform K-means with four clusters? Here are the resulting centroids:

Screen Shot 2020-03-27 at 09.02.38.png

These are more interesting in terms of intervalic content. (The numbering of the centroids is not important.) Let’s convert them into notation to get a better feeling of the melodic content:

Screen Shot 2020-03-27 at 10.10.33.png

None of these resemble in the least a jig. But nothing to fear: below we see the distributions of distances within each of these clusters.

Screen Shot 2020-03-27 at 09.04.10.png

All the tunes in the collection have time-interval series that are relatively far away from the centroids.

I can increase the number of clusters and see how the centroids and distributions of distances change, but what should I expect? Not “prototype” series that are musically meaningful. If I increase the number of clusters to 40, I begin to see clusters with only a few series in them. Increasing beyond that, the number of clusters consisting of only two series increases. At around 800 centroids, clusters of one series begin appearing.
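For readers who want to repeat the experiment of growing the number of clusters and watching the cluster populations, here is a minimal K-means sketch (Lloyd's algorithm) using only the standard library. It is not the implementation used for the results above: the random initialization from data points, the squared Euclidean distance, and the fixed seed are assumptions, and a library implementation would be the more practical choice for 1,712 series of length 288.

```python
import random
from collections import Counter


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100, seed=0):
    """Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumed initialization: k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Toy data: two well-separated groups of two-dimensional points.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centroids, labels = kmeans(points, k=2)
populations = Counter(labels)
print(populations)
```

Counting the entries of `populations` that equal 1 or 2 for increasing `k` is one way to see when very small clusters start to appear.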

It doesn’t make sense to cluster 8-measure time-interval series. Breaking the series into single measures and then clustering those smaller units makes more sense to me. As does clustering the circular autocorrelation of the 8-measure time-interval series. Then in some sense we are clustering series based on their time-interval structure, e.g., structures of 2, 4 or 8 measures.

Let’s have a look at some of the 145-dimensional autocorrelations of time-interval series. Here they are for jig #333 (“Miss Downing’s fancy”):


By far the largest peak is at zero. Some smaller peaks around 2 and 3 measures suggest some repetition of features of that length, but no direct repetition. Here’s the dots to see what is going on:

Screen Shot 2020-03-27 at 11.50.44.png

We see both parts feature repetitions of some measures, but with variations that make the structure of each more complex.

In contrast, here is the circular autocorrelation of the time-interval series for jig #344 (“The stolen purse”):

This suggests parts A and B in this tune are built from four-measure sections, but part C is an eight-measure section. The dots show this to be the case:

Screen Shot 2020-03-27 at 11.42.06.png

Here’s another for jig #17 (“The eavesdropper”):


I predict that the first part is built from an intervalic structure of single-measure length, but the second part has a structure that is four measures long. Here’s the dots confirming that prediction:

Screen Shot 2020-03-27 at 12.39.51.png

Here’s the autocorrelation for jig #56 (“The humors of Cappa”):


Both parts of this jig seem to be built from half-measure intervalic structures, but the first part more strongly so. The dots show this to be the case, taking into account the anacrusis:

Screen Shot 2020-03-27 at 12.51.50.png

The autocorrelation for jig #71 (“Courtney’s favorite”) shows quite a difference in the structures of its two parts:


As with the previous jig, this A part is built by repeating a structure of half a measure, and since the autocorrelation values are so large, I predict the intervals will be large. The B part has much smaller values, and a structure of two and four measures. Since its values are smaller, I predict the B part has smaller intervals. The dots confirm these predictions:

Screen Shot 2020-03-27 at 13.31.36.png

Jig #73 (“Con Casey’s jig”) shows another interesting structure:


The A part seems to have repetitions of material of one third of a measure. The dots show what is going on:

Screen Shot 2020-03-27 at 13.36.59.png

The first measure shows a repetition of two quavers, occurring again in measures three and five.

Having looked through all of these time-interval series autocorrelations, I have a better sense of what the values mean. The value at zero lag will always be positive, and it grows with the sizes and durations of the intervals in the time-interval series. The series with the largest value (~363) at lag zero is in jig #257 (“The Morgan Rattler”), which we continue to see is quite a unique one in the collection. The jig with the smallest value (~26) at zero lag is #313 (“The frost is all over”):

Screen Shot 2020-03-27 at 13.58.28.png

The B part consists of a lot of stepwise motion and unisons.

A comparison of these autocorrelations will thus be looking at both the structures of the series and the sizes of the intervals. If I normalize each autocorrelation by the value at zero lag, then I will in some sense be comparing structures independent of the size of the intervals. Let’s try clustering by K-means with the autocorrelation and the normalized autocorrelation and see what comes about…
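The effect of dividing by the zero-lag value can be checked on two synthetic series with the same shape but different interval sizes. This is a sketch under assumptions, not the code used in these posts: the autocorrelation is mean-centered and scaled by the series length, and the function names and toy series are hypothetical.

```python
def circular_autocorrelation(series):
    """Mean-centered circular autocorrelation at lags 0..len(series)//2."""
    n = len(series)
    mean = sum(series) / n
    x = [v - mean for v in series]
    return [
        sum(x[t] * x[(t + lag) % n] for t in range(n)) / n
        for lag in range(n // 2 + 1)
    ]


def normalize_by_zero_lag(autocorrelation):
    """Divide every lag by the zero-lag value (an all-zero input stays zero)."""
    zero_lag = autocorrelation[0]
    if zero_lag == 0:
        return [0.0] * len(autocorrelation)
    return [v / zero_lag for v in autocorrelation]


# Same two-measure shape repeated four times; `large` has intervals 3x as big.
small = ([1] * 18 + [-1] * 18 + [2] * 18 + [-2] * 18) * 4
large = [3 * v for v in small]

raw_small = circular_autocorrelation(small)
raw_large = circular_autocorrelation(large)
norm_small = normalize_by_zero_lag(raw_small)
norm_large = normalize_by_zero_lag(raw_large)

print("zero-lag values:", raw_small[0], raw_large[0])
print("largest normalized difference:",
      max(abs(a - b) for a, b in zip(norm_small, norm_large)))
```

The raw autocorrelations differ by the square of the scaling factor, while the normalized ones coincide, which is the sense in which normalization compares structure independent of interval size.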

An analysis of the 365 double jigs in O’Neill’s, pt. 5

This here is part 5 of my live blogging analysis of the 365 double jigs in O’Neill’s 1001. Part 1 is here, part 2 is here, part 3 is here, and part 4 is here. Today I will begin to look more closely at the time-interval series of the tunes in the collection.

I first plot all 1,712 8-measure time-interval series from this collection and just look at them to get a sense of what kinds of structures appear. I see some that look like that of jig #89 (“The boys of the town”):

89

The legend refers to the sections: 1 and 2 are the first and second repeats of the A part, and 3 and 4 are the first and second repeats of the B part. To help with readability I have added some slight offsets in x and y.

The first thing that comes to my mind is this:


I loved that gum when I was a kid. The first minute of each piece was glorious! That picture makes my mouth water.

Anyhow, the second thing that comes to my mind is the curious delay between the last two sections (red and green lines). Peeking at the underlying transcription shows how this delay arises:

Screen Shot 2020-03-24 at 22.37.02.png

All sections have an anacrusis, but the last measure of the first ending of the B part is a full measure. So the delay we see in the time-interval series comes from a counting mistake. We can correct it simply by removing the B quaver in that last measure. I find about 15 more of these counting mistakes, and so correct them as best I can, reprocess the data, recreate all the features, and plot again.
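A counting mistake of this kind can be flagged automatically. The sketch below is one hypothetical way to do it, not the procedure used here (those corrections were found by inspection). It assumes each section is given as a list of measures, each measure a list of note durations in quavers, with the anacrusis as the first entry, and it checks that the anacrusis and the final measure together fill exactly one 6/8 measure.

```python
QUAVERS_PER_MEASURE = 6  # double jigs are in 6/8


def counting_problems(measures):
    """Return messages for one section; an empty list means the counting is fine.

    `measures` is a list of measures, each a list of durations in quavers.
    The first entry may be a partial pickup measure (anacrusis).
    """
    totals = [sum(measure) for measure in measures]
    problems = []
    pickup, final = totals[0], totals[-1]
    if pickup > QUAVERS_PER_MEASURE:
        problems.append(f"first entry lasts {pickup} quavers")
    # Every entry between the first and the last must be a full measure.
    for number, total in enumerate(totals[1:-1], start=2):
        if total != QUAVERS_PER_MEASURE:
            problems.append(f"entry {number} lasts {total} quavers")
    # A pickup must be paid back by a correspondingly shorter final measure.
    if pickup < QUAVERS_PER_MEASURE:
        expected_final = QUAVERS_PER_MEASURE - pickup
    else:
        expected_final = QUAVERS_PER_MEASURE
    if final != expected_final:
        problems.append(
            f"final measure lasts {final} quavers, expected {expected_final}"
        )
    return problems


full = [1] * 6  # six quavers
with_mistake = [[1]] + [full] * 7 + [full]   # pickup, yet a full final measure
corrected = [[1]] + [full] * 7 + [[1] * 5]   # one quaver removed at the end

print(counting_problems(with_mistake))
print(counting_problems(corrected))
```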

Let’s have a look at some of the interesting time-interval patterns I see. Here’s the time-interval plot for jig #56 (“The humors of Cappa”):

This shows both parts of the tune share the same intervals in measures 3&4 and 7&8, but do something different in measures 1&2 and 5&6. Here’s the dots confirming that observation:

Screen Shot 2020-03-25 at 09.23.03.png

This kind of repetition results in a clear tune structure, and a strong coherence between the parts. If I were to render this as a poem, it would be:

research blogging, at home, on-line
research blogging, COVID-19
facebook tweeting, at home, on-line
facebook tweeting, COVID-19

Here’s the time-interval series for jig #69 (“Philip O’Neill”):

The two parts to this tune echo the same final two measures, and share a bit of the middle section, but otherwise do different things. Here’s the dots to confirm:

Screen Shot 2020-03-25 at 10.12.44.png

Here’s the time-interval series for jig #101 (“The idle road”):

Both parts share the last half, but are otherwise different. Look at all that bouncing up and down! Here’s the dots:

Screen Shot 2020-03-25 at 10.23.26.png

I imagine a fiddle player in a horse-drawn cart on a bumpy road. It’s curious that O’Neill has notated broken rhythms explicitly. Perhaps the player from whom he transcribed this exaggerated the jig rhythm there. In this classic recording of the tune, Joe Burke (accordion) ignores that and plays the jig quite evenly with the others following suit:

Jig #148 (“The Kinnegad slashers”) is a three-part jig with the following time-interval series:

We see a strong relationship between parts 1&2 (A) and 5&6 (C). The B part does something different until its last four measures. The B part also appears more constrained in its use of large intervals, except for the octave leap in its fourth measure. Here’s the dots to confirm:

Screen Shot 2020-03-25 at 10.40.19.png

My perusal of these time-interval series inspires a few questions.

What tune features a time-interval series that spends most of the time at zero? Apparently there are two: the A part of jig #69 (“Philip O’Neill”):

Screen Shot 2020-03-25 at 10.12.44.png

and the B part of jig #331 (“The foot of the mountain”):

Screen Shot 2020-03-25 at 13.55.20.png

Sorting the series according to the time spent on a zero interval results in the following graph:

Screen Shot 2020-03-25 at 13.58.05.png

I think the height of the stair steps comes from using a sampling rate of 6 samples per quaver. There are apparently several tunes that spend no time at zero intervals. One of these is jig #82 (“Doherty’s fancy”):

Screen Shot 2020-03-25 at 14.01.35.png

Another question to ask is what tune has a time-interval profile with the most positive mean? In other words, which tune spends most of its time at pitches arrived at by positive intervals? It appears to be jig #96 (“Our own little isle”):

Screen Shot 2020-03-25 at 14.27.54.png

The leap from the D quaver to the g dotted crotchet (an interval of 17 semitones) seems to be contributing a lot to this, even though most of the tune is going downwards.

I find 268 of the jigs in the collection feature a section with a positive mean time-interval profile, and 246 have a section with a negative mean time-interval profile. 110 jigs have a section with a mean time-interval profile exactly equal to zero. One is jig #17 (“The eavesdropper”):

Screen Shot 2020-03-25 at 15.21.09.png

Another question to ask is which tune has a time-interval profile with the smallest variance? That prize goes to the A part of jig #84 (“Wellington’s advance”):

Screen Shot 2020-03-25 at 15.22.31.png

There are several semitone intervals in the A part. The jig with the largest variance is #257 (“The Morgan Rattler”):

Screen Shot 2020-03-13 at 12.01.07.png

It’s easy to see why that’s the case.

Let’s picture all 1,712 time-interval series in the collection:

Screen Shot 2020-03-25 at 15.05.31.png
Here’s a plot showing this collapsed across the series:

Screen Shot 2020-03-25 at 15.31.12.png

We can see that the time spent at pitches arrived at by steps of ±2 semitones (major second) is greater than the time spent at pitches arrived at by ±1 semitone (which makes sense because most of the intervals in a scale are 2 semitones, and much of the melodic motion in these melodies is stepwise). We also see that the time spent after steps of -3 (minor third) and -4 (major third) semitones is greater than the time spent after steps of +3 and +4. However, more time is spent after an interval of +5 (perfect fourth) than -5 semitones. Spending time at pitches arrived at by intervals greater than a perfect fifth is rare, but if one is to find themselves at a pitch after an octave leap, expect to spend more time resting after a leap up than down.
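To make "time spent at pitches arrived at by an interval" concrete, here is a small sketch that builds a time-interval series from a list of notes and then collapses it over time. It is an illustration under assumptions, not the code behind the plot: notes are hypothetical (MIDI pitch, duration in quavers) pairs with no rests, the first note of a series is given an interval of zero, and the sampling rate is 6 samples per quaver.

```python
from collections import Counter


def time_interval_series(notes, samples_per_quaver=6):
    """Each sample holds the interval (in semitones) by which the current
    pitch was reached, for as long as that pitch lasts."""
    series = []
    previous = None
    for pitch, quavers in notes:
        # Assumption: the first note has no predecessor, so its interval is 0.
        interval = 0 if previous is None else pitch - previous
        series.extend([interval] * round(quavers * samples_per_quaver))
        previous = pitch
    return series


def time_at_intervals(series_list):
    """Collapse over time and series: samples spent at each interval value."""
    counts = Counter()
    for series in series_list:
        counts.update(series)
    return counts


# A D quaver leaping up to a g dotted crotchet (17 semitones), then two
# descending steps, as in the leap discussed for jig #96.
notes = [(62, 1), (79, 3), (77, 1), (76, 1)]
series = time_interval_series(notes)
marginal = time_at_intervals([series])

print(sorted(marginal.items()))
print("mean interval:", sum(series) / len(series))
```

In this toy measure the single leap up is held for three quavers, so the mean is strongly positive even though the melody then descends.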

This look at the collection raises an interesting question: What happens when we break the series into smaller pieces, e.g., units of one-measure length? In that case, we would have at most 13,696 time-interval series of dimension 36. How many unique units are there? How do they relate? Are there “prototype” measures? Might we see each series as a concatenation of these units?
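Counting the unique one-measure units is straightforward once each series is cut into blocks of 36 samples. The following is a minimal sketch with toy data, assuming 6 samples per quaver and series whose length is a whole number of measures. The function names and the toy abac series are hypothetical.

```python
from collections import Counter

SAMPLES_PER_MEASURE = 36  # 6 quavers per measure x 6 samples per quaver


def split_into_measures(series, samples_per_measure=SAMPLES_PER_MEASURE):
    """Cut a series into consecutive one-measure units (as hashable tuples)."""
    return [
        tuple(series[start:start + samples_per_measure])
        for start in range(0, len(series), samples_per_measure)
    ]


def count_units(series_list):
    """Count how often each distinct one-measure unit occurs."""
    counts = Counter()
    for series in series_list:
        counts.update(split_into_measures(series))
    return counts


# Toy 8-measure series with measure-level structure a b a c a b a c.
a, b, c = [2] * 36, [-2] * 36, [0] * 36
series = a + b + a + c + a + b + a + c
units = count_units([series])

print("units in total:", sum(units.values()))
print("unique units:", len(units))
print("upper bound for the collection:", 1712 * 8)
```

The most frequent keys of `units` would be natural candidates for "prototype" measures, and the mapping from each series to its sequence of unit labels is the codebook view of the series.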