Convolutional Neural Networks

What

Neural nets designed for grid-structured data (images). Use small learnable filters that slide across the input to detect patterns.

Key ideas

Convolution layer

A small filter (e.g., 3×3) slides across the image, computing a dot product at each position → produces a feature map.

Early layers detect edges, textures
Deeper layers detect shapes, objects
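
The sliding dot product can be sketched in plain Python. This is an illustrative sketch, not how a framework implements it: `conv2d_valid`, the toy image, and the hand-picked vertical-edge kernel are all assumptions made for the example (in a real CNN the kernel values are learned). It uses stride 1 and no padding, and, like deep learning frameworks, it does not flip the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product between the kernel and the image patch under it.
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy 4x5 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

# Hand-picked filter that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[0, 3, 3], [0, 3, 3]]
```

The output is 0 over the flat dark region and 3 wherever the window straddles the edge, which is the sense in which a filter "detects" a pattern.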

Pooling layer

Downsamples feature maps (e.g., 2×2 max pooling halves the height and width). Reduces computation and adds a small degree of translation invariance: shifting a feature by a pixel often leaves the pooled output unchanged.
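
As a rough sketch of 2×2 max pooling, the helper below keeps the largest value in each non-overlapping window. The name `max_pool2d` and the sample values are assumptions for illustration only.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling: window = stride = `size`."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            # Collect the values inside the current size x size window.
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


fm = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]

pooled = max_pool2d(fm)
print(pooled)  # [[4, 2], [2, 8]]
```

A 4×4 map becomes 2×2, so each later layer processes a quarter as many positions.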

Architecture pattern

[Conv → ReLU → Pool] × N → Flatten → [FC → ReLU] × M → Output

In PyTorch

import torch.nn as nn
 
class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),   # 3 channels in, 32 out
            nn.ReLU(),
            nn.MaxPool2d(2),                               # halve spatial dims
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, 128),                    # assumes 32×32 input: two 2× pools → 8×8
            nn.ReLU(),
            nn.Linear(128, 10),
        )
 
    def forward(self, x):
        x = self.features(x)
        return self.classifier(x)
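
The `64 * 8 * 8` in the first linear layer depends on the input size. A small sketch of the arithmetic, assuming 32×32 inputs (e.g., CIFAR-10-sized images, which the notes do not state explicitly) and using the standard output-size formula floor((n + 2p - k) / s) + 1. The helper name `conv_out_size` is made up for this example.

```python
def conv_out_size(n, kernel_size, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer along one dimension."""
    return (n + 2 * padding - kernel_size) // stride + 1


size = 32  # assumed input height and width
for _ in range(2):
    # 3x3 conv with padding=1 keeps the spatial size unchanged.
    size = conv_out_size(size, kernel_size=3, padding=1)
    # 2x2 max pooling (stride 2) halves it.
    size = conv_out_size(size, kernel_size=2, stride=2)

flat_features = 64 * size * size  # 64 channels after the second conv
print(size, flat_features)  # 8 4096
```

For a different input resolution, the first `nn.Linear` input size has to change accordingly (224×224 inputs would give 64 * 56 * 56).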

Famous architectures (historical progression)

Year	Model	Innovation
2012	AlexNet	Deep CNNs on GPU, ReLU, dropout
2014	VGG	Very deep, small 3×3 filters
2015	ResNet	Skip connections → 100+ layers
2019	EfficientNet	Compound scaling of depth, width, and resolution

In practice: use Transfer Learning

For most tasks, don’t train a CNN from scratch. Start from a model pretrained on a large dataset such as ImageNet (e.g., ResNet, EfficientNet) and fine-tune it on your own data.
