In the construction of harmonic pitch class profiles (HPCP), detailed in Chapter 3 of E. Gómez, “Tonal description of music audio signals,” Ph.D. thesis, Music Technology Group, Univ. Pompeu Fabra, Barcelona, Spain, 2006, I am looking at the estimation of the reference tuning of the instruments in a piece of music, that is, the frequency of the pitch everyone tunes to in equal temperament. In the USA this is frequently 440 Hz. In Europe it can be as high as 444 Hz. In Bach’s time, it was around 415 Hz! Anyhow, this estimation is important because ultimately we need to fold each overtone series into a pitch class. Gómez describes the process, but in a way that confuses me. So here I take another route.

In the best of all possible worlds, the harmonic series of a pitch with fundamental frequency \(f_0\) is given by the integer multiples \(n f_0, n = 1, 2, \ldots\). So, given a set of frequencies \(\mathcal{G} = \{g_i : i = 1, 2, \ldots\}\) found from the energy peaks of a spectrum, we look for the smallest frequency that “explains” most of the others, i.e.,

$$ f_0 = \max_N \{g_j \in \mathcal{G} : n g_j \in \mathcal{G}, n = 1, \ldots, N\}.$$

Given the fundamental frequency of a pitch \(f_0\), and assuming the music is in the 12-tone equal tempered tuning system built upon the tuning frequency \(f_t\), we can describe \(f_0\) by

$$f_0 = f_t 2^{\beta\left(f_0; f_t\right)/12}$$

where \(\beta\left(f_0; f_t\right)\) describes where this pitch exists “on a piano,” so to say.

Solving for this number, we find

$$\beta\left(f_0; f_t\right) := 12 \log_2 \left(f_0/f_t\right).$$
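As a quick numerical check of this formula, here is a minimal Python sketch. The function name `semitone_offset` and the test frequency 261.63 Hz (middle C when A4 is 440 Hz) are my own choices for illustration, not anything taken from Gómez.

```python
import math

def semitone_offset(f0, ft=440.0):
    """Position of f0 relative to the tuning frequency ft, in semitones."""
    return 12 * math.log2(f0 / ft)

# Middle C against A4 = 440 Hz: very close to the integer -9.
print(semitone_offset(261.63))

# The same frequency against A4 = 444 Hz: roughly -9.16, no longer near an integer.
print(semitone_offset(261.63, ft=444.0))
```

The second value shows what a mismatched tuning frequency looks like: the pitch lands between two "piano keys."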

Our goal is to find \(f_t\).

For the fundamental frequency of a pitch in an equal tempered system, \(\beta\left(f_0; f_t\right)\) must be an integer. The reference frequency has no such restriction. Thus, we need only adjust \(f_t\) to make \(\beta\left(f_0; f_t\right)\) an integer. There are of course infinitely many solutions to this problem, namely \(f_t = f_0 2^{-k/12}\) for any integer \(k\), and not all of them are good if we are observing only one pitch. This situation can be ameliorated by, e.g., observing multiple pitches, or constraining the tuning frequency to a reasonable domain, e.g., \(f_t \in [410, 460]\) Hz.
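To see how much ambiguity remains after constraining the domain, the following Python sketch lists every tuning frequency in a given interval that puts a single observed fundamental exactly on an integer semitone. The function name `candidate_tunings` and its parameters are hypothetical; the default interval is the one suggested above.

```python
import math

def candidate_tunings(f0, lo=410.0, hi=460.0):
    """All f_t in [lo, hi] for which 12*log2(f0/f_t) is an integer k."""
    # f_t = f0 * 2**(-k/12), so the bounds on f_t translate into bounds on k.
    k_min = math.ceil(12 * math.log2(f0 / hi))
    k_max = math.floor(12 * math.log2(f0 / lo))
    return [f0 * 2 ** (-k / 12) for k in range(k_min, k_max + 1)]

# One pitch at 261.63 Hz is consistent with two tunings in [410, 460] Hz:
# roughly 440.0 Hz and roughly 415.3 Hz.
print(candidate_tunings(261.63))
```

Because the interval spans almost two semitones, a single pitch can leave two candidates, here a modern A and a baroque A. That is one reason observing several pitches helps.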

However, because we do not live in the best of all possible worlds, all sorts of non-linearities creep in to screw with the perfect harmonic series to produce *overtones* that are not exact integer multiples of the fundamental, which makes life “interesting” for those who stay positive.

This means that it is unlikely we will find perfect integer relationships between a fundamental and its overtones, even for such a perfect instrument as the hurdy gurdy.

But we can still find the fundamental that “explains” most of the others, by using instead

$$ f_0 = \max_N \{g_j \in \mathcal{G} : \exists\, g \in \mathcal{G} \cap [ (1-\delta) n g_j, (1+\delta) n g_j ], n = 1, \ldots, N \} $$

for some small \(\delta \ge 0\).
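Here is one way this search could be sketched in Python. I read the expression as "pick the peak whose harmonics \(n = 1, \ldots, N\) are matched by some peak for the largest \(N\)." The names `count_explained` and `estimate_fundamental`, the cap `max_harmonic`, and breaking ties in favor of the lowest frequency are all assumptions of this sketch.

```python
def count_explained(candidate, peaks, delta, max_harmonic):
    """Largest N such that every harmonic n = 1..N of `candidate`
    has a peak within the relative tolerance `delta`."""
    n_matched = 0
    for n in range(1, max_harmonic + 1):
        target = n * candidate
        if any((1 - delta) * target <= g <= (1 + delta) * target for g in peaks):
            n_matched = n
        else:
            break  # the run of consecutive harmonics ends here
    return n_matched

def estimate_fundamental(peaks, delta=0.02, max_harmonic=10):
    """Peak that explains the longest run of harmonics.
    max() keeps the first maximum, so sorting makes ties go to the lowest peak."""
    return max(sorted(peaks),
               key=lambda g: count_explained(g, peaks, delta, max_harmonic))

# Made-up, slightly inharmonic peaks around a 220 Hz fundamental.
peaks = [220.0, 441.0, 659.0, 882.5, 1101.0]
print(estimate_fundamental(peaks, delta=0.01))
```

Setting `delta=0` recovers the exact integer-multiple case. This sketch only considers observed peaks as candidates, so it will not find a missing fundamental.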

There may also be several pitches sounding at the same time, but this is all the better because then we can form multiple estimates of the tuning frequency and take their mean.
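A minimal Python sketch of this averaging follows. It resolves the ambiguity of each single-pitch estimate by assuming a nominal reference of 440 Hz and taking the solution closest to it, which is my assumption and limits each estimate to within half a semitone of the nominal value. The function names and the example fundamentals are made up for illustration.

```python
import math

def tuning_from_pitch(f0, nominal=440.0):
    """Tuning frequency nearest to `nominal` that puts f0 on an integer semitone."""
    beta = 12 * math.log2(f0 / nominal)
    deviation = beta - round(beta)  # in semitones, between -0.5 and 0.5
    return nominal * 2 ** (deviation / 12)

def mean_tuning(fundamentals, nominal=440.0):
    """Average the single-pitch tuning estimates."""
    estimates = [tuning_from_pitch(f, nominal) for f in fundamentals]
    return sum(estimates) / len(estimates)

# Three hypothetical fundamentals, each slightly off a grid built on 442 Hz.
print(mean_tuning([262.9, 331.0, 393.8]))  # close to 442 Hz
```

A plain mean is fine when the deviations are small. If the true tuning sat near half a semitone from the nominal value, the estimates could wrap to either side and a circular mean would be the safer choice.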

Finally, we can also form running estimates of the tuning frequency over time, with which we can watch an entire baroque ensemble playing on period instruments become more and more out of tune as it plays.

With regard to the method presented in Gómez 2006, unless I am completely wrong, we can avoid assuming that most of the overtones of a pitch are in tune with the reference frequency (they aren’t), and avoid building up histograms based on this assumption.