What the FFT Actually Shows You
Every sound you hear is a sum of sine waves. The Fast Fourier Transform decomposes that sum. Here's what that means, how it works, and why a 60-year-old algorithm is still everywhere.
The Question
Play a chord on a piano — say, C and E together. Your ear hears one sound. But that sound is two frequencies superimposed: 261.6 Hz and 329.6 Hz. Your cochlea physically separates them — different hair cells resonate at different frequencies, sending distinct signals to your brain.
The Fast Fourier Transform does the same thing, but with numbers instead of hair cells. Give it a signal (a sequence of amplitude samples over time) and it returns a list of frequencies and their strengths. It answers: what frequencies are present, and how much of each?
What's Actually Happening
A signal sampled over time is a list of numbers: the amplitude at each sample point. A 1-second recording at 44,100 Hz is 44,100 numbers. These numbers describe the signal in the time domain — amplitude as a function of time.
The FFT converts this to the frequency domain — amplitude as a function of frequency. Same information, different representation. Like switching between Cartesian and polar coordinates: nothing is created or destroyed, only re-expressed.
The mathematical core: every periodic signal can be written as a sum of sine and cosine waves at different frequencies. This is Fourier's theorem (1807). The FFT computes the coefficients of that sum — how much of each frequency is in the signal.
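To make "how much of each frequency" concrete, here is the transform computed straight from its definition: the slow O(N²) version, as a sketch in JavaScript (an illustration, not PinePaper's source).

```javascript
// Naive discrete Fourier transform: O(N^2) operations, for illustration.
// Returns the magnitude (strength) of each frequency bin of a real signal.
function dftMagnitudes(signal) {
  const N = signal.length;
  const mags = new Array(N);
  for (let k = 0; k < N; k++) {
    let re = 0, im = 0;
    for (let n = 0; n < N; n++) {
      const angle = (-2 * Math.PI * k * n) / N;
      re += signal[n] * Math.cos(angle);
      im += signal[n] * Math.sin(angle);
    }
    mags[k] = Math.hypot(re, im);
  }
  return mags;
}

// A pure sine completing exactly 5 cycles over 64 samples...
const N = 64;
const signal = Array.from({ length: N }, (_, n) =>
  Math.sin((2 * Math.PI * 5 * n) / N)
);

// ...shows up as a single spike at bin 5 (plus its mirror at bin N - 5,
// which is how real-valued signals always appear in a full DFT).
const mags = dftMagnitudes(signal);
const peak = mags.indexOf(Math.max(...mags));
console.log(peak, mags[peak].toFixed(1)); // bin 5 (or 59), magnitude 32.0
```

The FFT computes exactly these numbers, just far faster.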
Why "Fast"
The naive way to compute a Fourier transform requires N² operations for N samples. For 1024 samples, that's about 1 million operations. The Cooley-Tukey algorithm (1965) reduces this to N·log₂(N) — about 10,000 operations for the same input. A 100x speedup. For a million samples, the speedup is 50,000x.
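The arithmetic behind those figures, as a quick sketch (counts are order-of-magnitude; real implementations differ by constant factors):

```javascript
// Rough operation counts: O(N^2) for the naive transform,
// O(N log2 N) for the FFT. Constant factors are ignored.
function speedup(n) {
  const naive = n * n;
  const fft = n * Math.log2(n);
  return { naive, fft, ratio: naive / fft };
}

console.log(speedup(1024).ratio);    // ≈ 102
console.log(speedup(1 << 20).ratio); // ≈ 52429 for a million samples
```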
The trick: split the N-point transform into two N/2-point transforms, recursively. This requires N to be a power of 2 (or you pad with zeros). Each split halves the problem. The "butterfly" operation combines the halves:
X[k] = Even[k] + W · Odd[k]
X[k+N/2] = Even[k] - W · Odd[k]
Where W is a complex exponential (a rotation in the complex plane). The same two sub-results produce two output points, so every computation is used twice, and that reuse compounds across all log₂(N) levels of recursion. This is what makes the algorithm fast.
PinePaper's implementation is a textbook Cooley-Tukey radix-2 DIT (decimation in time). 40 lines of JavaScript. We wrote it from scratch rather than importing a library because we wanted students to be able to read the source and understand every line.
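For comparison, here is a minimal recursive radix-2 sketch of the same idea, written to mirror the butterfly equations above (a teaching illustration, not PinePaper's actual source):

```javascript
// Recursive radix-2 Cooley-Tukey FFT, decimation in time.
// re, im: real and imaginary parts of the input, length N (a power of 2).
function fft(re, im) {
  const N = re.length;
  if (N === 1) return [re.slice(), im.slice()];

  // Decimate: split into even- and odd-indexed samples, transform each half.
  const evenRe = [], evenIm = [], oddRe = [], oddIm = [];
  for (let n = 0; n < N; n += 2) {
    evenRe.push(re[n]);    evenIm.push(im[n]);
    oddRe.push(re[n + 1]); oddIm.push(im[n + 1]);
  }
  const [eRe, eIm] = fft(evenRe, evenIm);
  const [oRe, oIm] = fft(oddRe, oddIm);

  // Butterfly: X[k] = E[k] + W·O[k] and X[k+N/2] = E[k] - W·O[k],
  // where W = exp(-2πik/N) is the "twiddle factor" rotation.
  const outRe = new Array(N), outIm = new Array(N);
  for (let k = 0; k < N / 2; k++) {
    const a = (-2 * Math.PI * k) / N;
    const wRe = Math.cos(a), wIm = Math.sin(a);
    const tRe = wRe * oRe[k] - wIm * oIm[k]; // W · Odd[k], complex multiply
    const tIm = wRe * oIm[k] + wIm * oRe[k];
    outRe[k] = eRe[k] + tRe;         outIm[k] = eIm[k] + tIm;
    outRe[k + N / 2] = eRe[k] - tRe; outIm[k + N / 2] = eIm[k] - tIm;
  }
  return [outRe, outIm];
}

// Sanity check: a unit impulse contains every frequency equally.
const [Re, Im] = fft([1, 0, 0, 0, 0, 0, 0, 0], new Array(8).fill(0));
console.log(Re); // [1, 1, 1, 1, 1, 1, 1, 1]
```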
What Those Bars Mean
When you see a spectrum analyzer — bars jumping to music — each bar represents a frequency bin. The height is the magnitude (strength) of that frequency in the current signal.
- A pure sine wave produces one tall bar at its frequency and nothing else.
- A square wave produces bars at the fundamental and every odd harmonic (3rd, 5th, 7th...), decreasing as 1/n. This is why square waves sound "buzzy" — they contain high-frequency energy that pure sines don't.
- White noise produces bars of roughly equal height everywhere. On average, every frequency carries equal power.
- A human voice produces a fundamental (the pitch you hear) plus formants — resonant peaks from the shape of your vocal tract that distinguish vowels.
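The square-wave claim is easy to verify numerically. A sketch using direct DFT summation (O(N²), fine at this size) on one period of a ±1 square wave:

```javascript
// Magnitude of frequency bin k for a real signal, by direct summation.
function binMagnitude(signal, k) {
  const N = signal.length;
  let re = 0, im = 0;
  for (let n = 0; n < N; n++) {
    const angle = (-2 * Math.PI * k * n) / N;
    re += signal[n] * Math.cos(angle);
    im += signal[n] * Math.sin(angle);
  }
  return Math.hypot(re, im);
}

// One period of a square wave: +1 for the first half, -1 for the second.
const N = 256;
const square = Array.from({ length: N }, (_, n) => (n < N / 2 ? 1 : -1));

// Even harmonics vanish; odd harmonics fall off roughly as 1/n.
const h1 = binMagnitude(square, 1);
console.log((binMagnitude(square, 2) / h1).toFixed(3)); // ≈ 0 (even harmonic)
console.log((binMagnitude(square, 3) / h1).toFixed(3)); // ≈ 1/3
console.log((binMagnitude(square, 5) / h1).toFixed(3)); // ≈ 1/5
```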
Windowing: Why the Edges Matter
There's a catch. The FFT treats your finite sample as one period of a signal that repeats forever. If the sample doesn't contain a whole number of cycles, the end of one copy doesn't line up with the start of the next, and that artificial discontinuity smears energy into frequency bins where none exists. This is called spectral leakage.
The fix: multiply the signal by a window function that tapers smoothly to zero at the edges. Common windows:
- Hann (cosine bell): good general purpose, loses some frequency resolution
- Hamming: similar to Hann but tapers to 0.08 instead of zero at the edges; lower first sidelobe than Hann, though its far sidelobes decay more slowly
- Blackman: much stronger sidelobe suppression, at the cost of a wider main lobe and therefore more lost frequency resolution
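All three windows are one-line formulas. A sketch of the standard symmetric-form coefficients (the constants are the commonly published ones; exact variants differ slightly between references):

```javascript
// Standard window coefficient formulas, symmetric form, for n = 0..N-1.
const hann = (n, N) =>
  0.5 - 0.5 * Math.cos((2 * Math.PI * n) / (N - 1));

const hamming = (n, N) =>
  0.54 - 0.46 * Math.cos((2 * Math.PI * n) / (N - 1));

const blackman = (n, N) =>
  0.42 -
  0.5 * Math.cos((2 * Math.PI * n) / (N - 1)) +
  0.08 * Math.cos((4 * Math.PI * n) / (N - 1));

// Applying a window is just element-wise multiplication:
const windowed = (signal, win) =>
  signal.map((x, n) => x * win(n, signal.length));

// Hann tapers to exactly zero at the edges; Hamming stops at 0.08.
console.log(hann(0, 64), hamming(0, 64)); // 0 and ≈ 0.08
```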
The choice is always a tradeoff between frequency resolution (how precisely you can identify a frequency) and spectral leakage (how much energy bleeds into neighboring bins). There is no perfect window. This is a consequence of the uncertainty principle — you cannot have arbitrarily precise knowledge of both time and frequency simultaneously.
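The tradeoff is visible numerically. A sketch comparing no window (rectangular) against Hann for a sine that falls halfway between bins, the worst case for leakage (the specific sizes and bin numbers here are illustrative choices):

```javascript
// Magnitude spectrum by direct DFT summation (O(N^2), fine for a demo).
function spectrum(signal) {
  const N = signal.length;
  return Array.from({ length: N }, (_, k) => {
    let re = 0, im = 0;
    for (let n = 0; n < N; n++) {
      const angle = (-2 * Math.PI * k * n) / N;
      re += signal[n] * Math.cos(angle);
      im += signal[n] * Math.sin(angle);
    }
    return Math.hypot(re, im);
  });
}

const hann = (n, N) => 0.5 - 0.5 * Math.cos((2 * Math.PI * n) / (N - 1));

// 10.5 cycles over 64 samples: the frequency sits between bins 10 and 11,
// so the implied repetition has a discontinuity and leaks everywhere.
const N = 64;
const sine = Array.from({ length: N }, (_, n) =>
  Math.sin((2 * Math.PI * 10.5 * n) / N)
);
const tapered = sine.map((x, n) => x * hann(n, N));

const raw = spectrum(sine);
const win = spectrum(tapered);

// Far from the peak (bin 25), Hann suppresses the leaked energy by
// orders of magnitude relative to the peak region (bin 10).
console.log((raw[25] / raw[10]).toExponential(1));
console.log((win[25] / win[10]).toExponential(1)); // much smaller
```

The windowed peak is also wider than the raw one, which is the resolution cost the tradeoff describes.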
Where the FFT Lives
You interact with FFT results constantly:
- MP3 and AAC compression: transform audio to the frequency domain, discard components the ear won't notice (too quiet, or masked by louder neighbors), compress what remains. The transform is the entire basis of lossy audio compression.
- JPEG compression: a close 2D relative, the discrete cosine transform (DCT), converts 8×8 pixel blocks to the frequency domain and quantizes away high-frequency components. That's why JPEG artifacts appear as blocks.
- WiFi and 5G: OFDM encoding splits data across many frequency sub-carriers. The FFT converts between time-domain transmission and frequency-domain data symbols.
- MRI imaging: the raw signal from an MRI scanner is in frequency space. The inverse FFT reconstructs the spatial image. Literally: every MRI you've ever seen is an inverse Fourier transform.
- Shazam: computes the spectrogram (FFT over sliding windows), extracts peaks, matches the pattern against a database. The FFT is the first step in recognizing every song.
A 60-year-old algorithm, in your pocket, running billions of times per day.
Try It
Open PinePaper, select the Spectrum Analyzer generator. Generate a square wave. Look at the bars — you'll see the odd harmonics falling off as 1/n. Switch to a sawtooth — now all harmonics are present, falling off as 1/n. Switch to noise — flat spectrum, every frequency equally likely.
Change the window function. Watch how Hann smooths the spectrum at the cost of wider peaks. Switch to Blackman — wider peaks still, but much lower sidelobes.
You're not reading about the FFT. You're measuring signals and observing what the transform reveals. That's the difference between knowing and understanding.
References
- Brigham, E.O. (1988). The Fast Fourier Transform and Its Applications. Prentice Hall.
- Cooley, J.W. & Tukey, J.W. (1965). An Algorithm for the Machine Calculation of Complex Fourier Series. Mathematics of Computation, 19(90), 297-301.
- Fourier, J. (1822). Théorie analytique de la chaleur. Paris: Firmin Didot.
- Harris, F.J. (1978). On the Use of Windows for Harmonic Analysis with the Discrete Fourier Transform. Proceedings of the IEEE, 66(1), 51-83.
- Oppenheim, A.V. & Schafer, R.W. (2009). Discrete-Time Signal Processing (3rd ed.). Prentice Hall.
- Shannon, C.E. (1949). Communication in the Presence of Noise. Proceedings of the IRE, 37(1), 10-21.
- Smith, S.W. (1997). The Scientist and Engineer's Guide to Digital Signal Processing. California Technical Publishing.
- Wallace, G.K. (1991). The JPEG Still Picture Compression Standard. Communications of the ACM, 34(4), 30-44.
- Wang, A., et al. (2003). An Industrial-Strength Audio Search Algorithm. Proceedings of ISMIR 2003. (Shazam's audio fingerprinting algorithm.)
PinePaper's FFT is a Cooley-Tukey radix-2 implementation with Hann, Hamming, and Blackman windowing, plus low-pass and high-pass filters. Try it free at pinepaper.studio/editor.
Ready to create?
Start making animated GIFs, videos, and graphics — free, no signup.
Open PinePaper Editor