
What the FFT Actually Shows You

Every sound you hear is a sum of sine waves. The Fast Fourier Transform decomposes that sum. Here's what that means, how it works, and why a 60-year-old algorithm is still everywhere.

The Question

Play a chord on a piano — say, C and E together. Your ear hears one sound. But that sound is two frequencies superimposed: 261.6 Hz and 329.6 Hz. Your cochlea physically separates them — different hair cells resonate at different frequencies, sending distinct signals to your brain.

The Fast Fourier Transform does the same thing, but with numbers instead of hair cells. Give it a signal (a sequence of amplitude samples over time) and it returns a list of frequencies and their strengths. It answers: what frequencies are present, and how much of each?

What's Actually Happening

A signal sampled over time is a list of numbers: the amplitude at each sample point. A 1-second recording at 44,100 Hz is 44,100 numbers. These numbers describe the signal in the time domain — amplitude as a function of time.

The FFT converts this to the frequency domain — amplitude as a function of frequency. Same information, different representation. Like switching between Cartesian and polar coordinates: nothing is created or destroyed, only re-expressed.

The mathematical core: every periodic signal can be written as a sum of sine and cosine waves at different frequencies. This is Fourier's theorem (1807). The FFT computes the coefficients of that sum — how much of each frequency is in the signal.

Why "Fast"

The naive way to compute a Fourier transform requires N² operations for N samples. For 1024 samples, that's about 1 million operations. The Cooley-Tukey algorithm (1965) reduces this to N·log₂(N) — about 10,000 operations for the same input. A 100x speedup. For a million samples, the speedup is 50,000x.
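To see where the N² comes from, here is the naive direct transform: every one of the N output bins is a sum over all N input samples. A minimal sketch for illustration (this is not PinePaper's source, and the `dft` name is my own):

```javascript
// Direct DFT: for each of N output bins, correlate against all N
// input samples — N * N multiply-adds total.
function dft(signal) {
  const N = signal.length;
  const re = new Array(N).fill(0);
  const im = new Array(N).fill(0);
  for (let k = 0; k < N; k++) {      // each frequency bin...
    for (let n = 0; n < N; n++) {    // ...sums over every sample
      const angle = (-2 * Math.PI * k * n) / N;
      re[k] += signal[n] * Math.cos(angle);
      im[k] += signal[n] * Math.sin(angle);
    }
  }
  return { re, im };
}
```

Feed it a pure cosine that completes 4 cycles in 16 samples and everything lands in bin 4 (and its mirror) with nothing anywhere else; the FFT computes exactly the same result, just without the nested loop.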

The trick: split the N-point transform into two N/2-point transforms, recursively. This requires N to be a power of 2 (or you pad with zeros). Each split halves the problem. The "butterfly" operation combines the halves:

X[k]     = Even[k] + W · Odd[k]
X[k+N/2] = Even[k] - W · Odd[k]

Where W is a complex exponential (a rotation in the complex plane). The same pair of sub-results, Even[k] and Odd[k], yields two output points, so each half-size transform is used twice. That reuse, applied at every level of the recursion, is what makes the algorithm fast.
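The split-and-butterfly scheme fits in a short recursive function. This is a textbook sketch, not PinePaper's actual source; the `fft` name and the plain `{re, im}` array representation of complex numbers are my own choices:

```javascript
// Recursive radix-2 decimation-in-time FFT. N must be a power of 2.
// Takes real and imaginary parts as separate arrays, returns {re, im}.
function fft(re, im) {
  const N = re.length;
  if (N === 1) return { re: [re[0]], im: [im[0]] };
  // Decimate in time: split into even- and odd-indexed samples
  const evRe = [], evIm = [], odRe = [], odIm = [];
  for (let n = 0; n < N; n += 2) {
    evRe.push(re[n]);     evIm.push(im[n]);
    odRe.push(re[n + 1]); odIm.push(im[n + 1]);
  }
  const E = fft(evRe, evIm);  // N/2-point transform of the evens
  const O = fft(odRe, odIm);  // N/2-point transform of the odds
  const outRe = new Array(N), outIm = new Array(N);
  for (let k = 0; k < N / 2; k++) {
    // Twiddle factor W = e^(-2πik/N): a rotation in the complex plane
    const a = (-2 * Math.PI * k) / N;
    const wr = Math.cos(a), wi = Math.sin(a);
    const tr = wr * O.re[k] - wi * O.im[k];  // W · Odd[k]
    const ti = wr * O.im[k] + wi * O.re[k];
    outRe[k] = E.re[k] + tr;            // X[k]     = Even[k] + W·Odd[k]
    outIm[k] = E.im[k] + ti;
    outRe[k + N / 2] = E.re[k] - tr;    // X[k+N/2] = Even[k] − W·Odd[k]
    outIm[k + N / 2] = E.im[k] - ti;
  }
  return { re: outRe, im: outIm };
}
```

Production implementations (including, per the description above, PinePaper's) typically replace the recursion with an in-place iterative loop over bit-reversed indices, but the butterfly at the core is the same.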

PinePaper's implementation is a textbook Cooley-Tukey radix-2 DIT (decimation in time). 40 lines of JavaScript. We wrote it from scratch rather than importing a library because we wanted students to be able to read the source and understand every line.

What Those Bars Mean

When you see a spectrum analyzer — bars jumping to music — each bar represents a frequency bin. The height is the magnitude (strength) of that frequency in the current signal.

  • A pure sine wave produces one tall bar at its frequency and nothing else.
  • A square wave produces bars at the fundamental and every odd harmonic (3rd, 5th, 7th...), decreasing as 1/n. This is why square waves sound "buzzy" — they contain high-frequency energy that pure sines don't.
  • White noise produces bars of roughly equal height everywhere. On average, every frequency carries equal power.
  • A human voice produces a fundamental (the pitch you hear) plus formants — resonant peaks from the shape of your vocal tract that distinguish vowels.
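The square-wave claim is easy to check numerically. The sketch below measures individual bins by direct correlation (a single-bin DFT: slow, but self-contained); the signal, bin numbers, and function name are illustrative choices of mine:

```javascript
// Magnitude of one frequency bin, computed by correlating the signal
// with a cosine and a sine at that bin's frequency.
function binMagnitude(signal, k) {
  const N = signal.length;
  let re = 0, im = 0;
  for (let n = 0; n < N; n++) {
    const a = (-2 * Math.PI * k * n) / N;
    re += signal[n] * Math.cos(a);
    im += signal[n] * Math.sin(a);
  }
  return Math.hypot(re, im);
}

// Square wave: 8 full periods in 1024 samples → fundamental in bin 8
const N = 1024;
const square = Array.from({ length: N }, (_, n) => ((n % 128) < 64 ? 1 : -1));

const fundamental = binMagnitude(square, 8);  // tall bar
const third = binMagnitude(square, 24);       // 3rd harmonic: ~1/3 as tall
const second = binMagnitude(square, 16);      // even harmonic: absent
```

The fundamental comes out roughly three times the height of the third harmonic, and the even-harmonic bin is empty, matching the 1/n odd-harmonic pattern described above.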

Windowing: Why the Edges Matter

There's a catch. The FFT assumes the signal repeats forever. But our sample is finite — it starts and stops. If the signal doesn't happen to be at zero at both endpoints, the abrupt cutoff creates artificial high-frequency content. This is called spectral leakage.

The fix: multiply the signal by a window function that tapers smoothly to zero at the edges. Common windows:

  • Hann (cosine bell): good general purpose, loses some frequency resolution
  • Hamming: similar to Hann but doesn't reach zero at edges, slightly better sidelobe suppression
  • Blackman: wider main lobe, much stronger sidelobe suppression, loses more frequency resolution
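As a sketch, here is the standard Hann formula applied to a signal before transforming (PinePaper's windowing code may differ in detail; the function names are my own):

```javascript
// Hann window: a raised cosine that tapers to zero at both edges,
// so the implicit repetition of the sample has no abrupt jump.
function hannWindow(N) {
  return Array.from({ length: N },
    (_, n) => 0.5 * (1 - Math.cos((2 * Math.PI * n) / (N - 1))));
}

// Element-wise taper: multiply each sample by its window coefficient
function applyWindow(signal, window) {
  return signal.map((s, n) => s * window[n]);
}
```

The windowed signal starts and ends at zero regardless of where the original waveform was cut, which is exactly what suppresses the leakage.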

The choice is always a tradeoff between frequency resolution (how precisely you can identify a frequency) and spectral leakage (how much energy bleeds into neighboring bins). There is no perfect window. This is a consequence of the uncertainty principle — you cannot have arbitrarily precise knowledge of both time and frequency simultaneously.

Where the FFT Lives

You interact with FFT results constantly:

  • MP3 and AAC compression: transform audio to frequency domain, discard frequencies below the hearing threshold, compress what remains. The transform is the entire basis of lossy audio compression.
  • JPEG compression: the 2D version (DCT) transforms 8×8 pixel blocks to frequency domain, quantizes high-frequency components. That's why JPEG artifacts appear as blocks.
  • WiFi and 5G: OFDM encoding splits data across many frequency sub-carriers. The FFT converts between time-domain transmission and frequency-domain data symbols.
  • MRI imaging: the raw signal from an MRI scanner is in frequency space. The inverse FFT reconstructs the spatial image. Literally: every MRI you've ever seen is an inverse Fourier transform.
  • Shazam: computes the spectrogram (FFT over sliding windows), extracts peaks, matches the pattern against a database. The FFT is the first step in recognizing every song.

A 60-year-old algorithm, in your pocket, running billions of times per day.


Try It

Open PinePaper, select the Spectrum Analyzer generator. Generate a square wave. Look at the bars — you'll see the odd harmonics falling off as 1/n. Switch to a sawtooth — now all harmonics are present, falling off as 1/n. Switch to noise — flat spectrum, every frequency equally likely.

Change the window function. Watch how Hann smooths the spectrum at the cost of wider peaks. Switch to Blackman: even lower sidelobes, at the cost of still wider peaks.

You're not reading about the FFT. You're measuring signals and observing what the transform reveals. That's the difference between knowing and understanding.

References

  • Brigham, E.O. (1988). The Fast Fourier Transform and Its Applications. Prentice Hall.
  • Cooley, J.W. & Tukey, J.W. (1965). An Algorithm for the Machine Calculation of Complex Fourier Series. Mathematics of Computation, 19(90), 297-301.
  • Fourier, J. (1822). Théorie analytique de la chaleur. Paris: Firmin Didot.
  • Harris, F.J. (1978). On the Use of Windows for Harmonic Analysis with the Discrete Fourier Transform. Proceedings of the IEEE, 66(1), 51-83.
  • Oppenheim, A.V. & Schafer, R.W. (2009). Discrete-Time Signal Processing (3rd ed.). Prentice Hall.
  • Shannon, C.E. (1949). Communication in the Presence of Noise. Proceedings of the IRE, 37(1), 10-21.
  • Smith, S.W. (1997). The Scientist and Engineer's Guide to Digital Signal Processing. California Technical Publishing.
  • Wang, A., et al. (2003). An Industrial-Strength Audio Search Algorithm. Proceedings of ISMIR 2003. (Shazam's audio fingerprinting algorithm.)
  • Wallace, G.K. (1991). The JPEG Still Picture Compression Standard. Communications of the ACM, 34(4), 30-44.

PinePaper's FFT is a Cooley-Tukey radix-2 implementation with Hann, Hamming, and Blackman windowing, plus low-pass and high-pass filters. Try it free at pinepaper.studio/editor.

Ready to create?

Start making animated GIFs, videos, and graphics — free, no signup.

Open PinePaper Editor