The quiet revolution
in how machines think.
For decades the CPU sat at the center of computing. A second processor, born for graphics, has shifted that center of gravity. This is the field guide.
One core, in a hurry.
Many cores, in unison.
Both processors are computing the same 2048×2048 matrix multiplication. The CPU walks the grid one tile at a time. The GPU lights every tile at once. Press play.
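The contrast in the animation can be sketched in a few lines of plain Python. This is a minimal illustration, not the code behind the demo: the function names, the 8×8 size (a small stand-in for 2048×2048), and the thread pool are all assumptions made for the example. Python threads do not give real hardware parallelism here. The point is only that every output tile is independent, so the tiles can be walked one at a time or handed out all at once and the result is the same.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def matmul_tile(a, b, n, tile, ti, tj):
    """Compute one tile-by-tile block of C = A @ B whose top-left corner is (ti, tj)."""
    block = {}
    for i in range(ti, min(ti + tile, n)):
        for j in range(tj, min(tj + tile, n)):
            block[(i, j)] = sum(a[i][k] * b[k][j] for k in range(n))
    return block


def assemble(blocks, n):
    """Copy the finished blocks into a full n-by-n result matrix."""
    c = [[0] * n for _ in range(n)]
    for block in blocks:
        for (i, j), value in block.items():
            c[i][j] = value
    return c


def matmul_sequential(a, b, n, tile):
    """'One core, in a hurry': visit the output tiles one after another."""
    origins = product(range(0, n, tile), repeat=2)
    return assemble((matmul_tile(a, b, n, tile, ti, tj) for ti, tj in origins), n)


def matmul_concurrent(a, b, n, tile, workers=4):
    """'Many cores, in unison': submit every output tile at once to a worker pool."""
    origins = list(product(range(0, n, tile), repeat=2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        blocks = list(pool.map(lambda o: matmul_tile(a, b, n, tile, *o), origins))
    return assemble(blocks, n)


# Hypothetical small inputs; integers keep the comparison exact.
n, tile = 8, 4
a = [[i + j for j in range(n)] for i in range(n)]
b = [[i - j for j in range(n)] for i in range(n)]

c_seq = matmul_sequential(a, b, n, tile)
c_par = matmul_concurrent(a, b, n, tile)
print(c_seq == c_par)
```

No tile reads another tile's output, which is why the work can be split this way. On a real GPU each tile would typically map to a block of hardware threads rather than to a Python worker.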
Seven sections. Pick where to start.
Each section is its own deep dive. The explainer is the gentlest entry point; the hardware taxonomy and the benchmarks chart are the most opinionated.
What is accelerated computing?
Four questions, answered honestly: what it is, why a CPU is not enough, whether it is just GPUs, and when it stops being worth it.
Anatomy of an accelerator
An interactive diagram of the host CPU, the PCIe link, the on-device scheduler, the streaming multiprocessors, HBM, and the L2 cache.
Where it shows up
Six fields that look different now than they did a decade ago: AI training, simulation, graphics, genomics, finance, and edge inference.
GPU vs TPU vs NPU vs FPGA
A side-by-side taxonomy of the four accelerator families, ranked on flexibility, throughput, and power.
The lines that diverge
Eighteen years of FP16 throughput on CPUs and GPUs. CPU throughput has roughly tripled. Accelerator throughput has improved by three orders of magnitude.
Twenty-five years, in seven beats
From programmable shaders to NPUs in every laptop and phone: the history of how the second processor took center stage.
The terms
The vocabulary an engineer uses to talk about accelerated systems, kept short and operational. CUDA, HBM, MFU, tensor cores, systolic arrays, and the rest.