Denoising Diffusion Probabilistic Models
Ho, Jain, Abbeel (2020)
Why It Matters
Generates samples by iterative denoising through a learned reverse diffusion process. Foundation of later diffusion-based image generators such as Stable Diffusion and DALL-E 2, and reportedly Midjourney.
Key Ideas
- Define generation as reversing a gradual noising process rather than sampling all structure in one shot.
- Train the model to predict and remove noise, turning generation into a sequence of denoising steps.
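- Give up fast sampling (the paper uses T = 1000 sequential denoising steps) in exchange for strong sample quality and a stable training objective derived from a variational bound on the likelihood.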
- Establish the core recipe that later diffusion image, audio, and video models scaled up.
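The noising process and the noise-prediction objective described in the ideas above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it uses plain Python lists instead of tensors, 0-indexed timesteps, and the linear beta schedule (1e-4 to 0.02 over 1000 steps) reported in the paper. The function names and the `eps_model(x_t, t)` callable are assumptions made for this example.

```python
import math
import random

def linear_beta_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances, one per timestep (0-indexed here)."""
    step = (beta_end - beta_start) / (T - 1)
    return [beta_start + i * step for i in range(T)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t ~ q(x_t | x_0) in closed form; returns (x_t, eps)."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps

def simple_loss(eps_model, x0, alpha_bars, rng):
    """One Monte Carlo sample of the simplified loss ||eps - eps_model(x_t, t)||^2.

    eps_model is a hypothetical callable standing in for the neural network.
    The squared error is averaged over the dimensions of x0.
    """
    t = rng.randrange(len(alpha_bars))          # uniform random timestep
    xt, eps = q_sample(x0, t, alpha_bars, rng)  # noised input and its noise
    pred = eps_model(xt, t)                     # model's guess at the noise
    return sum((p - e) ** 2 for p, e in zip(pred, eps)) / len(eps)
```

In a real training loop the loss would be averaged over a minibatch and minimized by gradient descent on the network parameters; that part is omitted here.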
Notes
- The key insight is that hard generation can be decomposed into many easier denoising problems.
- Later diffusion work mostly improved speed, conditioning, and scale on top of this framework.
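To make the first note concrete, here is a hedged sketch of the reverse (ancestral) sampling loop: start from pure Gaussian noise and apply one small denoising step per timestep. It assumes the linear beta schedule from the paper, 0-indexed timesteps, and the variance choice sigma_t^2 = beta_t (one of the options the paper discusses). The names `make_schedule`, `sample`, and `point_mass_eps_model` are hypothetical; the last one is a toy stand-in for a trained network, valid only when the data distribution is a single point `c`.

```python
import math
import random

def make_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Return (betas, alpha_bars) for a linear schedule, 0-indexed."""
    betas = [beta_start + i * (beta_end - beta_start) / (T - 1) for i in range(T)]
    alpha_bars, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        alpha_bars.append(prod)
    return betas, alpha_bars

def sample(eps_model, dim, betas, alpha_bars, rng):
    """Ancestral sampling: x_T ~ N(0, I), then one denoising step per t."""
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in reversed(range(len(betas))):
        beta, a_bar = betas[t], alpha_bars[t]
        eps_hat = eps_model(x, t)                 # predicted noise in x_t
        coef = beta / math.sqrt(1.0 - a_bar)
        # Posterior mean: (x_t - coef * eps_hat) / sqrt(alpha_t), alpha_t = 1 - beta_t
        mean = [(xi - coef * e) / math.sqrt(1.0 - beta) for xi, e in zip(x, eps_hat)]
        sigma = math.sqrt(beta) if t > 0 else 0.0  # no noise on the final step
        x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
    return x

def point_mass_eps_model(c, alpha_bars):
    """Toy exact noise predictor when all data equals the point c (assumption)."""
    def eps_model(x, t):
        a_bar = alpha_bars[t]
        return [(xi - math.sqrt(a_bar) * ci) / math.sqrt(1.0 - a_bar)
                for xi, ci in zip(x, c)]
    return eps_model
```

With the toy predictor, `sample(point_mass_eps_model(c, alpha_bars), len(c), betas, alpha_bars, random.Random(0))` should land on `c` up to floating-point rounding, since the final step removes exactly the remaining noise. Each individual step only has to undo a small amount of noise, which is the decomposition the note refers to.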