ImageNet Classification with Deep CNNs (AlexNet)
Krizhevsky, Sutskever, Hinton (2012)
Why It Matters
Won ILSVRC-2012 by a wide margin: 15.3% top-5 test error versus 26.2% for the second-best entry. Proved that deep CNNs trained on GPUs could outperform hand-engineered features, and triggered the deep learning revolution.
Key Ideas
- Show that deep convolutional networks trained on large labeled datasets with GPUs can drastically outperform hand-engineered vision systems.
- Combine ReLUs, dropout, data augmentation, and GPU training into a recipe that made large-scale deep vision practical.
- Demonstrate that scale in model size, data, and compute can unlock representations traditional pipelines could not learn.
- Mark the turning point where deep learning became the default path in computer vision.
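Three ingredients of the recipe above (ReLU, dropout, and crop-and-flip augmentation) are simple enough to sketch in a few lines of NumPy. This is an illustrative sketch, not the authors' GPU implementation: the function names are hypothetical, and the dropout shown is the now-common "inverted" variant that rescales at training time, whereas the paper scaled outputs by 0.5 at test time. The 256-to-224 crop size and the 0.5 dropout rate follow the paper.

```python
import numpy as np


def relu(x):
    # Rectified linear unit: pass positives through, clamp negatives to zero.
    return np.maximum(x, 0.0)


def dropout(x, rate, rng, training=True):
    # Inverted dropout (an assumption; the paper rescales at test time instead).
    # Each unit is zeroed with probability `rate`; survivors are scaled up so
    # the expected activation is unchanged.
    if not training or rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)


def random_crop_flip(image, crop, rng):
    # Take a random crop x crop patch from an H x W x C image, then mirror it
    # horizontally with probability 0.5.
    h, w = image.shape[:2]
    top = rng.integers(0, h - crop + 1)
    left = rng.integers(0, w - crop + 1)
    patch = image[top:top + crop, left:left + crop]
    if rng.random() < 0.5:
        patch = patch[:, ::-1]
    return patch


rng = np.random.default_rng(0)

# A stand-in 256 x 256 RGB image; the crop yields a 224 x 224 training patch.
image = rng.random((256, 256, 3))
patch = random_crop_flip(image, 224, rng)

# A stand-in batch of fully connected activations with dropout rate 0.5.
hidden = dropout(relu(rng.standard_normal((4, 8))), 0.5, rng)
```

Calling `dropout(..., training=False)` returns the input unchanged, which is why the inverted variant needs no extra scaling at test time.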
Notes
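- AlexNet was both an algorithmic and a systems breakthrough: the network was split across two GTX 580 GPUs so that it would fit in memory and train in a practical amount of time.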
- Its historical importance is less about the exact architecture and more about proving the deep-learning recipe at ImageNet scale.