## Background

This is an article about octave bands, the preferred frequency bands used in acoustics, as well as some common weighting filters. It is mainly written for people interested in acoustics or in how sound is perceived. There is very little code in this article, so if you are just looking for the octave band implementation you can simply download the code and ignore the text.

## Introduction

Before I explain what an octave band is, I have to go back to the piano, and more specifically to its tempered tuning. You might know that a different tuning was used in the old days, but I am going to ignore that for now and focus on the current tuning rules.

The piano and its corresponding notes are as follows:

The first A above middle C is normally tuned to 440 Hz (442 Hz is also used, and was reportedly used by Beethoven and Mozart). You might notice that after 7 white keys and 5 black ones, 12 keys in all, the sequence repeats both upwards and downwards. The trick in the tuning is that the next A is tuned to 880 Hz, two times the original. It is the 8th white key counted from the original A, or an octave higher (octave comes from the Latin for eighth).

The notes in between are equally spaced on a modern piano, which is why the tuning is called equal temperament. The notes are given by the following formula, where N is the number of half steps from the A above middle C (440 Hz):

$f_N = 440 \cdot 2^{\frac{N}{12}}$

This means that A# is \(440 \cdot 2^{\frac{1}{12}}\), B is \(440 \cdot 2^{\frac{2}{12}}\) and so on. The frequency ratio between two neighbouring notes, \(2^{\frac{1}{12}} = 1.0595\), is called a half step. The first correct description of these frequencies was given by Mersenne (the same man the Mersenne primes are named after) in his treatise "Harmonie Universelle" in 1636, which was a tremendously popular and influential publication on music theory. Mersenne is actually considered by many to be the father of acoustics. I should perhaps also mention that Vincenzo Galilei (the father of Galileo Galilei) had published a similar value for the equal tempered tuning, but it was not as precise. Vincenzo is also a legend in his own way, as he is perhaps the first person in history to describe a non-linear physical property mathematically.
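The formula is easy to try out. The following is a minimal Python sketch, assuming the 440 Hz reference from the text; the function name `note_frequency` is made up for this illustration.

```python
A4_HZ = 440.0  # the A above middle C, as assumed in the text


def note_frequency(half_steps, reference_hz=A4_HZ):
    """Equal-tempered frequency `half_steps` half steps away from the reference A."""
    return reference_hz * 2.0 ** (half_steps / 12.0)


if __name__ == "__main__":
    # A, A#, B and the A one octave above
    for name, n in [("A", 0), ("A#", 1), ("B", 2), ("A, one octave up", 12)]:
        print(f"{name:18s} {note_frequency(n):8.2f} Hz")
```

Negative values of `half_steps` give the notes below the reference, so `note_frequency(-12)` is the A one octave down.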

The history of the equal tempered tuning is somewhat debated, as it is not entirely certain whether Johann Sebastian Bach used it. What is certain is that his "Well-Tempered Clavier" compositions from 1722 were tremendously influential in getting equal temperament accepted and used.

Before the tempered tuning, the norm was a system based on the knowledge of harmonies first investigated by Pythagoras of Samos. He found that two notes separated by certain intervals are especially pleasing to listen to. These intervals are the ratios 2:1 (octave), 3:2 (perfect fifth), 4:3 (perfect fourth) and 5:4 (major third). Tuning was therefore based on getting these intervals between notes, which meant using *just intonation*, where each interval can be expressed as a ratio of integers. This is the case with Pythagorean tuning (discovered by Ling Lun 1974).

| Key | Tempered tuning | Pythagorean intervals | Name | Just intonation |
|---|---|---|---|---|
| C#/Db | 2^(1/12) = 1.0595 | 256/243 = 1.0535 | Minor second | |
| D | 2^(2/12) = 1.1225 | 9/8 = 1.1250 | Major second | 9/8 |
| D#/Eb | 2^(3/12) = 1.1892 | 32/27 = 1.1852 | Minor third | |
| E | 2^(4/12) = 1.2599 | 81/64 = 1.2656 | Major third | 5/4 |
| F | 2^(5/12) = 1.3348 | 4/3 = 1.3333 | Fourth | 4/3 |
| F#/Gb | 2^(6/12) = 1.4142 | 729/512 = 1.4238 | Augmented fourth | |
| G | 2^(7/12) = 1.4983 | 3/2 = 1.5000 | Fifth | 3/2 |
| G#/Ab | 2^(8/12) = 1.5874 | 128/81 = 1.5802 | Minor sixth | |
| A | 2^(9/12) = 1.6818 | 27/16 = 1.6875 | Major sixth | 5/3 |
| A#/Bb/Hb | 2^(10/12) = 1.7818 | 16/9 = 1.7778 | Minor seventh | |
| H/B | 2^(11/12) = 1.8877 | 243/128 = 1.8984 | Major seventh | 15/8 |
| C | 2^(12/12) = 2 | 2/1 = 2 | Octave | 2 |

If one compares the Pythagorean scale with the tempered scale, one finds that they fit rather well, but the tempered scale has the additional benefit of being easy to transpose into other keys. The values used in just intonation explain why it is called tempered: the tempered values are slightly different from (i.e. they have been tempered away from) the just intonation. More information on tuning can be found here.

Given the musical history behind the name, one starts to understand why, in acoustics, the word octave is now synonymous with a doubling of frequency.

## The Octave band filters

The story behind the frequency bands is in reality a story of preferred numbers, first introduced by Charles Renard. Renard was asked by the army to reduce the number of ropes needed for its balloons, which at the time came in 425 individual sizes. He came up with a geometric series of the following form:

$R5_N = 10^{N/5}$

The original series was later named R5, and there are additional standard series such as R10 and R20. The series is very convenient because it is built on base 10. It is easy to calculate with, and its members can be combined very efficiently into any number you like, the reason being the spacing between the numbers, which is shown very neatly at this site. And indeed, the series became a great success, reducing the number of ropes needed from 425 to 17. It is named the Renard series in his honor.
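As a small illustration of the series, here is a Python sketch that evaluates \(10^{N/k}\) directly. The helper name `renard_series` is invented for this example, and the values it prints are the exact powers; the standardized Renard series round them to convenient figures.

```python
def renard_series(steps_per_decade, decades=1):
    """Exact values 10^(n / steps_per_decade) for n = 0 .. steps_per_decade * decades."""
    count = steps_per_decade * decades
    return [10.0 ** (n / steps_per_decade) for n in range(count + 1)]


if __name__ == "__main__":
    # One decade of R5 and R10, rounded for display only
    print("R5 :", [round(v, 2) for v in renard_series(5)])
    print("R10:", [round(v, 2) for v in renard_series(10)])
```

Every third step of the R10 series is \(10^{3/10} \approx 1.995\), which is very nearly a doubling.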

Historically, as the piano example shows, base 2 was far more popular, because there was no decimal system and a ratio could only be described with whole numbers. So we start by assuming a center frequency \(f_C\) with a corresponding bandwidth, i.e. the range of frequencies it lets through, bounded by the lowest pass frequency \(f_L\) and the highest pass frequency \(f_H\). We place two demands on it. The first is:

$f_C = \sqrt{f_L \cdot f_H}$

and that the bandwidth is:

$f_H = 2 \cdot f_L$

Since the ratio between the upper and lower frequency is 2, these are called octave band filters. The lower and upper band limits can be defined a little more generally, where N is the octave fraction (N = 1 for a full octave, N = 3 for a third octave):

$f_L = f_C \cdot 2^{-1/(2N)}$

and upper limit:

$f_H = f_C \cdot 2^{1/(2N)}$

The reason we like the 1/N fractional octave bands is that a coarser octave band can be created from the finer ones by simply adding together the values of the bands it contains.

In acoustics these values were never really used, thanks to a happy numerical accident:

$2^{\frac{10}{3}} = 10.079$

This meant that the octave band series would, very nearly, contain all the powers of 10, like 1, 10, 100, 1000, 10000 and 100000. Later the factor 2 was exchanged for base 10 directly, which slightly changes how the lower and upper frequencies of a band are defined:

$f_L = f_C \cdot 10^{-3/(20N)}$

and upper limit:

$f_H = f_C \cdot 10^{3/(20N)}$

For a full octave (N = 1) the ratio between the band limits remains practically 2, as in the previous base 2 definition:

$\frac{f_H}{f_L} = 10^{\frac{3}{10}} = 1.995 \approx 2$
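To see how close the two definitions are, the band limits can be computed side by side. This is a minimal Python sketch of the formulas above; the function names are made up for the illustration.

```python
import math


def band_edges_base2(fc, n):
    """Lower and upper band limits for a 1/n octave band, base 2 definition."""
    return fc * 2.0 ** (-1.0 / (2 * n)), fc * 2.0 ** (1.0 / (2 * n))


def band_edges_base10(fc, n):
    """Lower and upper band limits for a 1/n octave band, base 10 definition."""
    return fc * 10.0 ** (-3.0 / (20 * n)), fc * 10.0 ** (3.0 / (20 * n))


if __name__ == "__main__":
    for n in (1, 3):
        for label, edges in (("base 2", band_edges_base2),
                             ("base 10", band_edges_base10)):
            f_low, f_high = edges(1000.0, n)
            # The geometric mean of the limits gives back the center frequency
            center = math.sqrt(f_low * f_high)
            print(f"1/{n} octave, {label:7s}: {f_low:8.2f} .. {f_high:8.2f} Hz, "
                  f"ratio {f_high / f_low:.4f}, center {center:.1f} Hz")
```

In both definitions the center frequency is the geometric mean of the two limits, so only the width of the band changes slightly.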

The center frequencies are selected by choosing a reference frequency \(f_R\) and constructing the series from it. In acoustics, 1000 Hz is chosen as the reference. Given the fraction N (1 for a full octave, 3 for a third octave and so on) and an integer band index x, the center frequencies, written \(f_m\) here, are calculated differently for odd and even N:

$f_{m}(x) = f_R \cdot 10^{3x/(10 \cdot N)} \text{ if N is odd}$

$f_{m}(x) = f_R \cdot 10^{3(2x+1)/(20 \cdot N)} \text{ if N is even}$

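The code that makes all this happen is quite straightforward:

```csharp
// Band is the octave fraction N (1 = 1/1 octave, 3 = 1/3 octave, ...).
// Needs System (Math) and System.Collections.Generic (List<T>).
List<double> Result = new List<double>();
double fr = 1000;                       // reference frequency in Hz
bool IsEven = (Band % 2 == 0);
for (int i = -10 * Band; i < 10 * Band; i++)
{
    // Exponent of 10 for the center frequency; odd and even N differ
    double t;
    if (!IsEven)
    {
        t = 3 * (double)i / (10 * (double)Band);
    }
    else
    {
        t = 3 * (2 * (double)i + 1) / (20 * (double)Band);
    }
    double fm = fr * Math.Pow(10, t);
    // Upper and lower band limits around the center frequency
    double fH = fm * Math.Pow(10, 3 / (20 * (double)Band));
    double fL = fm * Math.Pow(10, -3 / (20 * (double)Band));
    // Keep only bands whose lower limit lies in the audible range
    if (fL >= 20 && fL <= 20000)
        Result.Add(fm);
}
```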

Within the audible range, the number of bands is always 10 times the fraction N. How the bands are distributed is a little hard to calculate in advance, because the audible range from 20 Hz to 20 000 Hz is not symmetric around the reference frequency of 1000 Hz, so the bands are not split evenly above and below it. To filter out bands outside the audible range, I use the lower band limit: any band whose lower limit is below 20 Hz is ignored, and so is any band that starts above 20 kHz. This leaves exactly the values we are interested in.
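The count of 10 bands per fraction step can be checked numerically. The following Python sketch is a transcription of the loop above, not the downloadable implementation; the function name and its keyword parameters are assumptions made for this example.

```python
def center_frequencies(n, f_ref=1000.0, f_min=20.0, f_max=20000.0):
    """Center frequencies of the 1/n octave bands whose lower limit is in [f_min, f_max]."""
    result = []
    for x in range(-10 * n, 10 * n):
        if n % 2 == 1:
            exponent = 3.0 * x / (10.0 * n)             # odd fraction
        else:
            exponent = 3.0 * (2 * x + 1) / (20.0 * n)   # even fraction
        fm = f_ref * 10.0 ** exponent
        f_low = fm * 10.0 ** (-3.0 / (20.0 * n))        # lower band limit
        if f_min <= f_low <= f_max:
            result.append(fm)
    return result


if __name__ == "__main__":
    for n in (1, 2, 3, 6, 12):
        bands = center_frequencies(n)
        print(f"1/{n} octave: {len(bands)} bands, "
              f"{bands[0]:.1f} Hz .. {bands[-1]:.1f} Hz")
```

For the full octave this gives calculated centers from about 31.6 Hz upwards, with 1000 Hz itself in the list.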

The formula works well, but the nominal (standardized) center frequencies of the 1/1 and 1/3 octave bands are a little different from the calculated values. Since these two octave bands are used frequently to calculate other values, like sound insulation, RC curves etc., they are precalculated so that plots of the values look nicer.

## Perceived loudness

Nearly all natural phenomena are non-linear, and so is the human ear and the way it perceives loudness. Research into sound levels took off in the 1920s, when electric microphones first made it possible to record sound. At the Bell research laboratory, Harvey Fletcher and his associates quickly found that humans can hear a vast range of sound pressures, from the barely audible \(20 \mu Pa\) at 1000 Hz up to about 20 Pa before it becomes unbearably painful to listen to. That is a factor of \(10^6\) in pressure, or \(10^{12}\) in sound power. To make the numbers more intuitive and readable, they started to use the bel, which is simply the base-10 logarithm of the ratio to a reference, here a reference pressure of \(20 \mu Pa\). To get a finer scale they multiplied the value by 10 (a deci), hence the name decibel (dB) for sound pressure levels. Sound power is proportional to the pressure squared, so the level is usually just written as \(20 \cdot \log_{10}(p_{rms} / p_{ref})\).
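A short Python sketch of the level calculation, assuming the \(20 \mu Pa\) reference pressure; the name `spl_db` is only illustrative.

```python
import math

P_REF = 20e-6  # reference pressure in pascal (20 micropascal)


def spl_db(p_rms):
    """Sound pressure level in dB for an rms pressure given in pascal."""
    return 20.0 * math.log10(p_rms / P_REF)


if __name__ == "__main__":
    # The reference pressure itself, 1 Pa, and 20 Pa
    for pressure in (20e-6, 1.0, 20.0):
        print(f"{pressure:10.6f} Pa -> {spl_db(pressure):6.1f} dB")
```

The reference pressure maps to 0 dB and a pressure a million times larger maps to 120 dB, which is why the logarithmic scale is so much easier to read.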

The experiments were done by playing a single sine wave at different amplitudes and registering the amplitude at which the sound was just barely audible. You would think that this starting point would be set at 0 phon, but it is in fact not: the lowest audible level is defined to be 3 phon, which is very confusing. A doubling of the perceived loudness corresponds to 10 phon more, which is a shortcoming of the unit, as one would have liked twice the phon value to mean twice the perceived level. In any case, the measurements were done at several different frequencies so that a curve of perceived sound level against frequency could be drawn (from Wikipedia):

The Fletcher-Munson curve is slightly different from the modern ISO 226 standard, but they generally describe the same shape. What is so important about this curve? It shows that simply adjusting the volume on your music player also changes how strongly you perceive the different frequencies, and this has wide-ranging consequences. If you want to find out how noisy it is in your apartment, it makes sense to weight the measured level of each frequency component, as we simply do not hear equally well across the audible range. These curves are, as one would expect, based on the hearing of a normal person; people with hearing disabilities would have a very different response at different levels. People with hearing disabilities usually have a narrower band in which they can hear sound, so their perception also changes much faster. This is typically true for the mid frequency range around 2000 Hz.

It is quite tedious to classify sound by phon levels, so filters were devised that adjust and summarize the sound levels according to a phon curve. The 40 phon curve was approximated and called the A-weighted sound level, and the 100 phon curve was used to create the C-weighting filter. There is also a B-weighting filter, which lies between the A and C weightings but is not much used anymore. A D-weighting also exists, which was used to measure loud aircraft noise.

The code for the A-weighting filter is:

```csharp
// A-weighting in dB at frequency f (Hz), normalized to 0 dB at 1000 Hz
public double AWeight(double f)
{
    return 20 * Math.Log10(RA(f)) - 20 * Math.Log10(RA(1000));
}

// Unnormalized magnitude response of the A-weighting filter
double RA(double f)
{
    double a1 = Math.Pow(f, 2) + Math.Pow(20.6, 2);
    double a2 = Math.Sqrt((Math.Pow(f, 2) + Math.Pow(107.7, 2)) * (Math.Pow(f, 2) + Math.Pow(737.9, 2)));
    double a3 = Math.Pow(f, 2) + Math.Pow(12200, 2);
    return (Math.Pow(12200, 2) * Math.Pow(f, 4)) / (a1 * a2 * a3);
}
```

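The code for the C-weighting filter is:

```csharp
// C-weighting in dB at frequency f (Hz), normalized to 0 dB at 1000 Hz
public double CWeight(double f)
{
    return 20 * Math.Log10(RC(f)) - 20 * Math.Log10(RC(1000));
}

// Unnormalized magnitude response of the C-weighting filter
double RC(double f)
{
    return Math.Pow(12200, 2) * Math.Pow(f, 2)
        / ((Math.Pow(f, 2) + Math.Pow(20.6, 2)) * (Math.Pow(f, 2) + Math.Pow(12200, 2)));
}
```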

The B and D curves, which are not really used today, are included in the code for completeness. As you can see, the A and C weighting curves are approximations for medium and high noise levels respectively. There are however some problems with them. For starters, they were derived using pure sinusoidal tones to measure the perceived noise level, and as researchers found in the 1960s and 1970s, the A-weighting curve (medium or normal sound levels) changes slightly if filtered white noise is used instead. The result of this research was the ITU-R 468 noise weighting curve (from Wikipedia):

The code to generate the curve:

```csharp
// ITU-R 468 weighting in dB at frequency freq (Hz)
public double ITU(double freq)
{
    return 18.2 + 20 * Math.Log10(RITU(freq));
}

double RITU(double freq)
{
    return 1.246332637532143e-4 * freq
        / Math.Sqrt(Math.Pow(h1(freq), 2) + Math.Pow(h2(freq), 2));
}

double h1(double freq)
{
    return -4.737338981378384e-24 * Math.Pow(freq, 6)
        + 2.043828333606125e-15 * Math.Pow(freq, 4)
        - 1.363894795463638e-7 * Math.Pow(freq, 2)
        + 1;
}

double h2(double freq)
{
    return 1.306612257412824e-19 * Math.Pow(freq, 5)
        - 2.118150887518656e-11 * Math.Pow(freq, 3)
        + 5.559488023498642e-4 * freq;
}
```

When you use these filters on an octave band filtered signal, the properly defined weighted sound pressure level, dB(A), might be different from the sum of the octave band levels that have been filtered and then weighted.
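The band-wise approach can be sketched as follows in Python. It evaluates the A-weighting formula from the code above only at each band's center frequency and then adds the bands on an energy basis, which is one reason the result is an approximation. The band levels in the example are invented numbers, and `weighted_total` is a hypothetical helper name.

```python
import math


def r_a(f):
    """Unnormalized A-weighting magnitude, same constants as the RA function above."""
    a1 = f ** 2 + 20.6 ** 2
    a2 = math.sqrt((f ** 2 + 107.7 ** 2) * (f ** 2 + 737.9 ** 2))
    a3 = f ** 2 + 12200.0 ** 2
    return (12200.0 ** 2 * f ** 4) / (a1 * a2 * a3)


def a_weight(f):
    """A-weighting in dB, normalized to 0 dB at 1000 Hz."""
    return 20.0 * math.log10(r_a(f)) - 20.0 * math.log10(r_a(1000.0))


def weighted_total(band_levels):
    """Energy sum of A-weighted band levels; band_levels holds (center Hz, level dB) pairs."""
    energy = sum(10.0 ** ((level + a_weight(fc)) / 10.0) for fc, level in band_levels)
    return 10.0 * math.log10(energy)


if __name__ == "__main__":
    # Invented octave band levels in dB, for illustration only
    bands = [(125.0, 60.0), (250.0, 58.0), (500.0, 55.0),
             (1000.0, 52.0), (2000.0, 48.0), (4000.0, 42.0)]
    print(f"A-weighted total: {weighted_total(bands):.1f} dB(A)")
```

The same energy sum without the weighting term is also how finer bands are combined into a coarser one.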

All of these weighting curves have one fundamental flaw: they do not account for what is called sound masking, often called auditory masking. This is the effect that is utilized in the MP3 format, where a sound played at one frequency at a certain level makes it impossible to hear adjacent frequencies unless they are above a certain level. There is also something called temporal masking, where a burst of sound makes other noise inaudible for a period of time after the burst.

## References

In addition to the numerous Wikipedia articles referred to in the text, there are two books from which a lot of the background information comes:

- "Handbook of Signal Processing in Acoustics, Vol. 1"
- "Acoustics: An Introduction to Its Physical Principles and Applications", Allan D. Pierce