Data Augmentation
What
Create modified copies of training examples to increase dataset size and variety. Reduces overfitting by forcing the model to be invariant to label-preserving transformations.
Image augmentation
```python
from torchvision import transforms

transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(15),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```
Common image augmentations: flip, rotate, crop, scale, color jitter, blur, cutout/mixup.
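Mixup differs from the per-image transforms above: it blends pairs of examples (and their labels) inside a batch. A minimal sketch, assuming a batch of image tensors and integer labels; the function name and signature are illustrative, not a torchvision API:

```python
import torch

def mixup(x, y, alpha=0.2):
    # Blend each example with a randomly permuted partner image.
    # lam is drawn from Beta(alpha, alpha), so it is usually near 0 or 1.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    idx = torch.randperm(x.size(0))
    x_mixed = lam * x + (1 - lam) * x[idx]
    # Return both label sets; the training loss is mixed with the same lam:
    # loss = lam * ce(pred, y_a) + (1 - lam) * ce(pred, y_b)
    return x_mixed, y, y[idx], lam
```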
Text augmentation
- Synonym replacement
- Random insertion/deletion/swap
- Back-translation (translate to another language and back)
- Paraphrasing with LLMs
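The first two text techniques are easy to sketch with the standard library. A minimal, hypothetical implementation of random swap and random deletion (in the style of EDA-type augmentation; function names are my own):

```python
import random

def random_swap(words, n=1):
    # Swap two random word positions, n times.
    words = words[:]
    for _ in range(n):
        i, j = random.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words

def random_deletion(words, p=0.1):
    # Drop each word independently with probability p; never return empty.
    kept = [w for w in words if random.random() > p]
    return kept or [random.choice(words)]
```

Back-translation and LLM paraphrasing produce more natural variants but need an external model; these cheap edits are a reasonable baseline when labeled text is scarce.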
Key rules
- Only augment training data, never validation/test
- Augmentations should be plausible (don’t flip digits upside down)
- The less data you have, the more augmentation helps
- Combined with Transfer Learning, it enables training on very small datasets
Links
- Transfer Learning
- Regularization — augmentation is a form of regularization
- Convolutional Neural Networks