Principles of Audio-Rate Frequency Modulation
Prof. Jeffrey Hass, Center for Electronic and Computer Music, Indiana University

Terms: audio rate, modulation, carrier, modulator, sidebands, reflected sidebands, Bessel function, carrier:modulator ratio, linear (Chowning) or exponential FM.

This article explains the phenomenon of audio-rate frequency modulation of sound, which was explored and used compositionally by John Chowning of Stanford University around 1970. His discoveries eventually led to the design and release of the Yamaha DX-7 family of instruments, one of the most successful synthesizers of all time.

sub-audio-rate frequency modulation

If the output of an oscillator is applied to control the frequency of another oscillator, frequency modulation (FM) will result. The oscillator providing the control source is referred to as the modulator; the oscillator providing the signal is referred to as the carrier. If the modulating oscillator is tuned below audio rate (below approximately 20 Hz), sub-audio frequency modulation, or vibrato, will result. Very simply put, as the modulating waveform rises (increases in amplitude), so too will the frequency of the carrier; as it falls, so too will the frequency of the carrier. The rate of the vibrato is determined by the modulator's frequency, the depth of the vibrato (or how far above and below its center frequency the carrier will be pushed) is determined by the modulator's amplitude, and the shape of the vibrato is determined by the modulator's waveform. It is important at the outset of the discussion to realize that the modulator is not part of the signal path--it is never heard directly, only its effect on the carrier frequency.

audio-rate frequency modulation

When the rate of the modulating oscillator is tuned above 20 Hz, or at an audio rate, very interesting things happen to the sound. Additional frequencies called sidebands appear symmetrically around the carrier frequency. Those above the carrier frequency are called upper sidebands and below, lower sidebands. In essence, as will be seen below, some of the energy of the carrier frequency is being stolen to create these additional frequencies.

'Chowning' FM

Both the exact frequencies and the relative strength of the sidebands are predictable using digital technology, where all parameters can be precisely controlled. The bulk of our discussion will deal with classic or Chowning FM, named after its greatest proponent. Simple Chowning FM uses the most basic sine wave, which produces no other frequencies apart from its fundamental, as both the carrier and modulating waveform. Indeed, one of the beauties of Chowning FM is its ability to do so much from two very simple waves. The other qualification of Chowning FM is that the modulation be linear, whereby the carrier is pushed an equal number of cycles per second above and below its center frequency. Exponential FM, where the carrier is pushed up and down an equal musical interval (therefore more Hz up than down), drifts upwards in pitch as the modulation depth is increased. Linear FM allows the strength of modulation to be increased without the perceived center frequency rising.
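To see the drift numerically, here is a minimal Python sketch. It is not taken from Chowning's work: the function names, the one-octave depth, and the use of the mean instantaneous frequency as a stand-in for the perceived pitch center are all assumptions made for illustration. It averages the carrier's instantaneous frequency over one cycle of a sine modulator for both kinds of modulation.

```python
import math


def mean_frequency_linear(carrier_hz, deviation_hz, steps=1000):
    """Average instantaneous frequency when the carrier moves +/- deviation_hz."""
    total = 0.0
    for k in range(steps):
        modulator = math.sin(2 * math.pi * k / steps)  # one full modulator cycle
        total += carrier_hz + deviation_hz * modulator
    return total / steps


def mean_frequency_exponential(carrier_hz, depth_octaves, steps=1000):
    """Average instantaneous frequency when the carrier moves +/- depth_octaves."""
    total = 0.0
    for k in range(steps):
        modulator = math.sin(2 * math.pi * k / steps)
        total += carrier_hz * 2 ** (depth_octaves * modulator)
    return total / steps


if __name__ == "__main__":
    # Linear: equal Hz up and down, so the average stays at the carrier.
    print(mean_frequency_linear(400, 100))
    # Exponential: an octave up adds more Hz than an octave down removes,
    # so the average sits above the carrier.
    print(mean_frequency_exponential(400, 1.0))
```

Under these assumptions the linear average stays at 400 Hz while the exponential average lands above it, and it climbs further as the depth grows.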

calculating sideband frequencies

Sideband frequencies can be calculated with the following formula, where Cƒ=carrier frequency, Mƒ=modulator frequency, and n=all non-negative integers (0, 1, 2, 3...):

Cƒ ± nMƒ   [n=0,1,2,3...]

OR (for those with math anxiety)
the carrier frequency (Cƒ) plus and minus all the integer multiples of the modulating frequency (Mƒ)
OR (for those with really serious math anxiety)
Cƒ, Cƒ + Mƒ, Cƒ - Mƒ, Cƒ + 2Mƒ, Cƒ - 2Mƒ, Cƒ + 3Mƒ, Cƒ - 3Mƒ, etc. to infinity and beyond

For example, a carrier frequency of 400 Hz and a modulating frequency of 50 Hz will produce a spectrum for
n=0 of 400 Hz (400 + (0 * 50))
n=1 of 450 Hz (400 + (1 * 50)) and 350 Hz (400 - (1 * 50))
n=2 of 500 Hz (400 + (2 * 50)) and 300 Hz (400 - (2 * 50))

[Figure: graph of this example's sideband pairs, with each pair shown in a matching color.]
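The same arithmetic can be sketched in a few lines of Python. This is only an illustration of the Cƒ ± nMƒ formula; the function name sideband_pairs and its arguments are made up for this example.

```python
def sideband_pairs(carrier_hz, modulator_hz, max_order):
    """Return (n, lower, upper) for n = 0..max_order using C - nM and C + nM."""
    pairs = []
    for n in range(max_order + 1):
        lower = carrier_hz - n * modulator_hz
        upper = carrier_hz + n * modulator_hz
        pairs.append((n, lower, upper))
    return pairs


if __name__ == "__main__":
    # The 400 Hz carrier / 50 Hz modulator example from the text.
    for n, lower, upper in sideband_pairs(400, 50, 3):
        print(f"n={n}: lower {lower} Hz, upper {upper} Hz")
```

Negative results are returned as they are; what to do with them is a separate question.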

Another way of calculating sidebands, useful when the carrier and modulator are maintaining a constant frequency relationship, is through a ratio of carrier to modulating frequency (C:M). For example, a Cƒ of 100 Hz and an Mƒ of 200 Hz would produce a C:M ratio of 1:2 (see "finding the C:M ratio normal form" at the end of this article for how to reduce C:M ratios to their "normal form"). We'll see below that integer ratios that have a carrier value of 1 have certain properties. We could calculate the upper sidebands for this ratio in relation to the carrier frequency as C+M, C+2M, C+3M, C+4M, etc. For our 1:2 example, the first upper sideband would be 1+2=3, the second would be 1 + (2 * 2)=5. If you worked this out in Hz, you would quickly come to the conclusion that these are the odd-numbered partials of the carrier frequency. We calculate the lower sidebands similarly as C-M, C-2M, C-3M, C-4M or, in our 1:2 example, -1, -3, -5, etc.

reflected sidebands

What to do with these negative values? Using our two methods of calculating sidebands, say we had a carrier ƒ of 200 Hz and a modulator ƒ of 400 Hz -- that would give us our 1:2 C:M ratio. If we calculate the n=1 pair, we get an upper sideband of 600 Hz, but a lower sideband of -200 Hz using our first method, or a relative frequency of -1 to the carrier using the second. These sidebands in the negative domain are called reflected sidebands because they bounce back from zero at their absolute value, 180 degrees out of phase with their sideband partners. So for both methods of calculating frequency, we simply remove the minus sign, expressed mathematically as the absolute value, |-200|. If these frequencies do not bounce back on top of other frequencies, then the phase reversal is inaudible. However, as is particularly true in harmonic spectra, when they do bounce back on top of other partials, phase cancellation or summation has a great impact on the timbre. For example, if you had a positive sideband at 400 Hz and a negative sideband at 400 Hz but half the strength of the positive one, only half the amplitude of the positive one will survive. If both were at equal strength, neither would be heard since they would completely cancel each other out. If they were both positive or both negative, they would be summed. In our example above of a Cƒ=200 Hz and Mƒ=400 Hz, the lower sideband of the n=1 pair would reflect back on the carrier frequency (n=0), or |C-M|=C (|-1|=1). Who will survive will be a mystery to be solved below when we can calculate the relative strength of each.
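The bookkeeping for reflections can be sketched as follows. This is a hedged illustration that models only what this section describes (take the absolute value and flag the phase reversal); it says nothing yet about strengths, and the function name and tuple layout are assumptions of the example.

```python
def reflected_sidebands(carrier_hz, modulator_hz, max_order):
    """List (order, side, frequency_hz, phase_inverted) for each component.

    A lower sideband that falls below 0 Hz is folded back to its absolute
    value and marked as phase-inverted, as described in the text.
    """
    components = [(0, "carrier", carrier_hz, False)]
    for n in range(1, max_order + 1):
        upper = carrier_hz + n * modulator_hz
        lower = carrier_hz - n * modulator_hz
        components.append((n, "upper", upper, False))
        components.append((n, "lower", abs(lower), lower < 0))
    return components


if __name__ == "__main__":
    # The 200 Hz : 400 Hz (1:2) example: the n=1 lower sideband lands on the carrier.
    for order, side, freq, inverted in reflected_sidebands(200, 400, 3):
        note = " (reflected, phase inverted)" if inverted else ""
        print(f"n={order} {side}: {freq} Hz{note}")
```

Running it on the 1:2 example shows every lower sideband folding back onto a frequency that is already occupied, which is why reflections matter so much in harmonic spectra.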

harmonic vs. inharmonic spectra and finding the fundamental frequency

If N is an integer greater than 1, a ratio of 1:N will be harmonic but missing the partial numbers which are multiples of N, as in our 1:2 example above, which was missing all the even-numbered partials. Theoretically, any C-to-M ratio that is reducible to integers will produce sidebands that can be seen as harmonically related. If the ratio of the carrier to the modulator frequency is an irrational number, then the spectrum will be inharmonic. Some ratios are technically rational but reduce only to large integers, such as Chowning's favorite 1:1.31, or 100:131 as integers. The result for the listener, who will not be able to fuse the sound into a harmonic one, will for practical purposes be inharmonic. The nature of these inharmonic spectra, which have at least twice the frequency components of the harmonic spectra with no phase cancellation, gives FM synthesis a wide palette of bright, vibrant timbres, including many bell-like possibilities. Many of these inharmonic spectra can have sidebands that reflect close to, but not on top of, existing sidebands, providing the opportunity for shimmering, chorusing-type effects with certain ratios. Below you can see that the reflected sidebands do not reflect to positions midway between the non-reflected sidebands, thereby creating an inharmonic spectrum. A little further tweaking of the C:M ratio below could put these reflected sidebands closer to, but not directly on top of, the non-reflected ones, creating a chorusing effect.

For harmonic spectra, there will usually be an implied fundamental frequency, though as we will see below, it may not always be audible. The carrier frequency is not necessarily the fundamental frequency. For the carrier to be the fundamental, M must be greater than or equal to 2 * C, or else be a 1:1 ratio. If, using the ratio method, C and M are integers that have no common factors (i.e. they have been reduced to their lowest form, 2:4 -> 1:2), then the fundamental frequency will be the carrier frequency (in Hz) / C, which should also equal the modulating frequency (in Hz) / M (for example, 100 Hz / 1 or 200 Hz / 2 will both give the fundamental of a 100 Hz:200 Hz or 1:2 C:M ratio).
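As a rough sketch of that calculation, the Python below reduces a pair of frequencies to lowest-form integers and divides the carrier frequency by C. The function names are invented for this example, and it assumes the frequencies are given exactly (as integers, strings such as "196.5", or Fraction objects) so that the reduction is not disturbed by floating-point rounding.

```python
from fractions import Fraction


def reduced_ratio(carrier_hz, modulator_hz):
    """Return (C, M) as integers with no common factors."""
    ratio = Fraction(carrier_hz) / Fraction(modulator_hz)
    return ratio.numerator, ratio.denominator


def implied_fundamental(carrier_hz, modulator_hz):
    """Carrier frequency divided by C, which also equals modulator frequency / M."""
    c, _m = reduced_ratio(carrier_hz, modulator_hz)
    return Fraction(carrier_hz) / c


if __name__ == "__main__":
    # 100 Hz : 200 Hz reduces to 1:2, with an implied fundamental of 100 Hz.
    print(reduced_ratio(100, 200), implied_fundamental(100, 200))
    # 150 Hz : 196.5 Hz is 1:1.31, which reduces only to 100:131.
    print(reduced_ratio(150, "196.5"), float(implied_fundamental(150, "196.5")))
```

For the 1:1.31 case the implied fundamental works out to 1.5 Hz, far below hearing, which is one way to see why the ear treats that spectrum as inharmonic.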

computing the sideband strengths • the modulation index (I)

As with ordinary complex waveforms, the timbre perceived by the listener is determined not only by the frequencies present, but also their relative strengths. The upper and lower sideband of each sideband pair has the same strength. In order to calculate the strength of each sideband pair relative to the others, we must first look at the prime factor which determines it. As we have seen above, when the carrier is modulated, its frequency rises and falls with the amplitude of the modulating wave. The greater the amplitude of the modulating wave at its peaks, the greater the maximum distance the carrier is pushed off its center frequency. At sub-audio rate, we would perceive this as the depth of a vibrato. When using a pure sine wave and linear modulation, these peaks will be an equal number of Hz above and below the carrier's center frequency. The number of cycles above or below the center frequency is called the peak deviation (or p.d., or Δƒ). As the amplitude of the modulating wave is increased or decreased by some means, perhaps using an envelope generator, so too does the peak deviation change. It is this parameter, the changing strength of the modulating wave, that allows us to create dynamic, time-varying spectra of a sort very different from subtractive filtering and one that can, under certain circumstances, mimic the complexity of real-world sound characteristics using only two oscillators.

To compute how the strengths of the sideband pairs change over time as the strength of the modulating wave is varied, we divide the peak deviation by the modulating frequency to produce a value called the modulation index or simply I (that is, I = peak deviation / Mƒ).
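A small Python sketch of this relationship follows. The function names are hypothetical, and the second function simply runs the definition backwards to show how far a given index pushes the carrier under linear modulation.

```python
def modulation_index(peak_deviation_hz, modulator_hz):
    """I = peak deviation / modulating frequency."""
    return peak_deviation_hz / modulator_hz


def frequency_swing(carrier_hz, index, modulator_hz):
    """Lowest and highest instantaneous carrier frequency for a given index.

    Assumes linear FM, so the swing is the same number of Hz up and down.
    """
    deviation = index * modulator_hz
    return carrier_hz - deviation, carrier_hz + deviation


if __name__ == "__main__":
    # A carrier pushed 800 Hz either side of center by a 400 Hz modulator.
    print(modulation_index(800, 400))
    # An index of 2 with a 100 Hz modulator swings a 1000 Hz carrier by 200 Hz.
    print(frequency_swing(1000, 2, 100))
```

Note that the same peak deviation gives a smaller index when the modulator is tuned higher, so I describes the modulation relative to the modulating frequency rather than in absolute Hz.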

If none of the modulating wave is permitted to reach the carrier, the peak deviation and the value of I will be zero, since no modulation will be taking place. As the amplitude of the modulating wave increases, the carrier is pushed farther and farther off its center frequency and the value of I also increases. The effect of an increasing I is different for each sideband pair. In our first formula for predicting sideband frequencies, we used all the integer values of n (Cƒ ± nMƒ). A sideband pair calculated with a particular value of n can be called a pair of the nth order.

As I increases, each sideband pair follows its own path of increasing and decreasing strength called a Bessel function. The Bessel function curves followed are different for each of the n-order sidebands--one of the things that makes frequency modulation so interesting. (To trig students, these are called Bessel functions of the first kind of order n; to non-trig students, it's more like Close Encounters of the Third Kind.) Below is a graph of the first seven orders (beginning with 0) of sideband pairs, showing their relative strength on the vertical axis as I increases on the horizontal axis.

Note that at I = 0 (i.e. no modulation), the carrier (red, n=0) is at full strength. As I increases, several things happen. Firstly, the carrier loses strength, and secondly, each additional order of sideband pairs begins to be heard one by one. A good rule of thumb for predicting how many sideband pairs (n) will be audible for a given value of I is: n = I + 1, with I being rounded off to the nearest integer. Each pair, after its initial peak, will decrease in strength and, at the value of I where it crosses zero, be inaudible. On the negative side of zero, it will be in reverse phase to any similar frequency on the positive side of zero.
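For readers who would like numbers rather than curves, here is a minimal Python sketch that evaluates Bessel functions of the first kind from their standard power series, plus the rule of thumb above. The function names and the fixed number of series terms are assumptions of this example; the series is adequate for the modest values of I discussed here, and a numerical library would be a better choice for serious work.

```python
import math


def bessel_j(order, x, terms=40):
    """Bessel function of the first kind, J_order(x), by its power series.

    Intended for non-negative integer orders and moderate x (roughly 0 to 15).
    """
    total = 0.0
    for k in range(terms):
        numerator = (-1) ** k * (x / 2) ** (2 * k + order)
        denominator = math.factorial(k) * math.factorial(k + order)
        total += numerator / denominator
    return total


def audible_pairs_estimate(index):
    """Rule of thumb from the text: n = I + 1, with I rounded to the nearest integer."""
    return int(index + 0.5) + 1


if __name__ == "__main__":
    index = 2.0
    for n in range(audible_pairs_estimate(index) + 1):
        print(f"order {n}: relative strength {bessel_j(n, index):+.4f}")
```

At I = 0 the order-0 value is 1 and all the others are 0, matching the unmodulated carrier. The order-0 curve first crosses zero near I = 2.405, the point at which the carrier momentarily vanishes.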

Here are two examples of the spectra produced for fixed values of I, computed by simply looking at the vertical example lines above. The first value of I is relatively low, so only a few sidebands are audible.

The second example shows a higher value of I, which also includes some negative strengths.

In general, as I increases, we can infer that more and more frequencies become audible. This can be a real problem for digital synthesis, where the upper sidebands may reach the Nyquist frequency (see digital audio) and alias. FM is not band-limited. For this reason, most digital synthesis will have a limit on the maximum value of I.
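One hedged way to estimate where trouble starts is to combine the n = I + 1 rule of thumb with the Nyquist limit. The sketch below does just that; the function names and the 44,100 Hz default sample rate are assumptions of the example, and since the rule of thumb is only approximate, the result should be read as a rough guide rather than a guarantee.

```python
def highest_estimated_sideband(carrier_hz, modulator_hz, index):
    """Highest upper sideband counted as audible by the n = I + 1 rule of thumb."""
    pairs = int(index + 0.5) + 1
    return carrier_hz + pairs * modulator_hz


def may_alias(carrier_hz, modulator_hz, index, sample_rate=44100):
    """True if that sideband would lie above the Nyquist frequency."""
    return highest_estimated_sideband(carrier_hz, modulator_hz, index) > sample_rate / 2


def max_safe_index(carrier_hz, modulator_hz, sample_rate=44100):
    """Largest I for which C + (I + 1) * M stays at or below Nyquist."""
    nyquist = sample_rate / 2
    return (nyquist - carrier_hz) / modulator_hz - 1


if __name__ == "__main__":
    print(highest_estimated_sideband(200, 400, 15))  # 200 + 16 * 400
    print(may_alias(200, 400, 15))
    print(may_alias(2000, 4000, 15))
    print(max_safe_index(200, 400))
```

The higher the carrier and modulator are tuned, the lower the ceiling on I becomes, which is why a fixed limit on I is a blunt but common safeguard.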

What happens to those mysterious lower sidebands that reflect at their absolute value 180 degrees out of phase? Well, take their order's Bessel function above and invert it. If its strength would normally be in the positive domain, it will be of equal value in the negative domain. This adds greatly to the interest of a dynamically changing harmonic spectrum where sidebands are likely to fold back on top of other sidebands because of the increased complexity of phase cancellation and summation.

Below is the same mod index as example 2, but with a carrier frequency plotted low enough so that the lower sidebands, starting with the n=4 pair, reflect back on top of existing sidebands. In this case, the lower n=2 (green) and n=4 (purple) will sum, n=1 (blue) and n=5 (orange) will just about cancel each other out, and n=0 (red) and n=6 (dark green) will fight it out; the stronger n=0 will be heard, but reduced by the value of n=6. The lower n=3 (yellow) plots out at 0 Hz and so is not heard at all.

two FM audio examples

Here are two audio examples. The first has a C:M ratio of 1:2, creating a harmonic spectrum; the second has a C:M ratio of 1:1.31, creating an inharmonic spectrum. In both examples, the modulation index is increased slowly over 10 seconds from 0 (no modulation) to 15. Play these several times and focus on different frequencies as they move through their Bessel functions. Start with the carrier frequency (the first frequency you will hear). Listen as it immediately begins to lose strength, completely disappears and then reappears. Listen to the effects of phase cancellation in the first harmonic example, which will have far fewer discrete frequencies than the inharmonic one.

Click here to play the harmonic example (C:M = 1:2, carrier frequency = 200 Hz)

Click here to play the inharmonic example (C:M = 1:1.31, carrier frequency = 150 Hz)
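Readers who want to generate comparable sounds themselves could start from a sketch like the one below. It is not the code used to make the examples above. It assumes the common sine-carrier, sine-modulator form sin(2π·Cƒ·t + I·sin(2π·Mƒ·t)), a straight-line ramp of I, a 44,100 Hz sample rate and 16-bit mono WAV output; the function and file names are made up for the example.

```python
import math
import struct
import wave


def fm_sweep(carrier_hz, modulator_hz, max_index, seconds, sample_rate=44100):
    """Return samples of simple FM with the index ramping from 0 up to max_index."""
    total = int(seconds * sample_rate)
    samples = []
    for i in range(total):
        t = i / sample_rate
        index = max_index * i / total  # linear ramp of the modulation index
        modulator = math.sin(2 * math.pi * modulator_hz * t)
        samples.append(math.sin(2 * math.pi * carrier_hz * t + index * modulator))
    return samples


def write_wav(path, samples, sample_rate=44100):
    """Write samples in the range -1..1 as a 16-bit mono WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        frames = b"".join(struct.pack("<h", int(s * 0.8 * 32767)) for s in samples)
        wav.writeframes(frames)


if __name__ == "__main__":
    # Harmonic example: C:M = 1:2 with a 200 Hz carrier.
    write_wav("fm_harmonic.wav", fm_sweep(200, 400, 15, 10))
    # Inharmonic example: C:M = 1:1.31 with a 150 Hz carrier.
    write_wav("fm_inharmonic.wav", fm_sweep(150, 150 * 1.31, 15, 10))
```

With these settings the strongest sidebands stay well below the Nyquist frequency even at I = 15, so aliasing should not color the result.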

some variations on FM

Many things can be done to create more complex spectra with FM. The DX-7 was built around the idea of double-carrier FM, in which a single modulator controls two carriers, tuned differently. This allows the creation of formant areas not possible with single FM. Also, stacks of modulators, where a modulator was itself modulated, could either produce wildly complex spectra if tuned inharmonically or produce weighted spectra, which could create a more realistic bass. This helped with one of FM's greatest drawbacks--the strengths of the upper and lower sidebands are equal, but human hearing requires more energy in the lower frequencies for them to be considered as equally loud as the higher frequencies. Therefore, single FM always seemed weighted to the treble, particularly at higher values of I. Another interesting idea is to modulate the modulation index itself, providing a rapid timbral shift, or to low-frequency modulate the modulator or carrier, changing the C:M ratio and therefore the frequencies of the sidebands for some very nice effects.

suggested listening examples

To hear audio-rate FM used with a high level of artistry, there can be no better source than the works of John Chowning himself. Highly recommended are Stria (1976), Turenas (1972) and Phoné (1981). Barry Truax was another pioneering FM composer with Arras, Androgyny, Wave Edge, Solar Ellipse, and Sonic Landscape No. 3.

suggested reading

J. Chowning, "The Synthesis of Complex Audio Spectra by Means of Frequency Modulation," Journal of the Audio Engineering Society 21(7), 1973; reprinted in Computer Music Journal 1(2), 1977; reprinted in Foundations of Computer Music, C. Roads and J. Strawn (eds.). MIT Press, 1985.
B. Truax, "Organizational Techniques for C:M Ratios in Frequency Modulation", Computer Music Journal, 1(4), 1978, pp. 39-45; reprinted in Foundations of Computer Music, C. Roads and J. Strawn (eds.). MIT Press, 1985.
C. Dodge and T. Jerse, Computer Music, 2nd Ed., Schirmer Books, 1997.
F. Richard Moore, Elements of Computer Music, Prentice Hall, 1990.

finding the C:M ratio normal form

The concept of the normal form for a C:M ratio has been used for a long time. It is useful for predicting which C:M ratios will produce the same sidebands, but it is not useful for predicting their relative strengths or phases. If the value of M in a ratio is less than twice the value of C, it is not in normal form, but can be reduced to normal form by applying the operation: C = |C - M|. What this means is that you subtract M from C (ignoring any minus sign) and treat the result as the new C value. You keep doing this (often several times) until the ratio satisfies the normal form criterion.
For example, take the C:M ratio of 3:2. Take 3 - 2 and get 1. That is the new value of C (keep the old value of M), so the new ratio will be 1:2. How is this possible--how can 3:2 produce the same sidebands as 1:2? Let's try it out with 300:200 Hz as our 3:2 ratio and 100:200 Hz as our 1:2 ratio.

       3:2 sidebands (300:200 Hz)    1:2 sidebands (100:200 Hz)
n=0    300                           100
n=1    100, 500                      |-100|, 300
n=2    |-100|, 700                   |-300|, 500
n=3    |-300|, 900                   |-500|, 700

So you can see they produce the same frequencies, but with sidebands of different orders and different reflections. Therefore, the way these frequencies react to changing values of I will be completely different. But some interesting things can be deduced using normal form. A C:M ratio is in normal form when the carrier is the fundamental in the spectrum it produces, as in our 1:2 example above -- 100 Hz is the fundamental. Harmonic normal form ratios are always of the type 1:N where N is an integer, and inharmonic ones aren't. For a much more detailed treatment of normal form, see Barry Truax's article listed under suggested reading.
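The reduction procedure described in this section can be sketched in Python as follows. The function name is an assumption of this example, and so is the decision to leave a 1:1 ratio untouched, since applying C = |C - M| to it would give a carrier of zero.

```python
def normal_form(c, m):
    """Reduce a C:M ratio by repeating C = |C - M| until M >= 2 * C."""
    if c <= 0 or m <= 0:
        raise ValueError("C and M must both be positive")
    while m < 2 * c:
        if c == m:
            break  # a 1:1-type ratio is left as it is (assumption of this sketch)
        c = abs(c - m)
    return c, m


if __name__ == "__main__":
    print(normal_form(3, 2))  # the 3:2 example from the text
    print(normal_form(2, 3))
    print(normal_form(5, 2))
```

Each pass through the loop makes C smaller while keeping it positive, so the reduction always finishes, though as noted above it may take several steps.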


This document is prepared and maintained by the Indiana University School of Music
Center for Electronic and Computer Music
Prof. Jeffrey Hass
Last updated: 10 November 2001
© Copyright 1995-2001, Jeffrey Hass and The Trustees of Indiana University