Data Augmentation

What

Create modified copies of training examples to increase the effective size and variety of the dataset. This reduces overfitting by encouraging the model to become invariant to the applied transformations.

Image augmentation

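from torchvision import transforms

# Training-time pipeline: random transforms run on the PIL image, then convert and normalize.
transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),                       # flip left-right with p=0.5
    transforms.RandomRotation(15),                           # rotate by an angle in [-15, 15] degrees
    transforms.ColorJitter(brightness=0.2, contrast=0.2),    # vary brightness and contrast by up to 20%
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),     # crop 80-100% of the area, resize to 224x224
    transforms.ToTensor(),                                   # PIL image -> float tensor in [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],         # ImageNet channel statistics
                         std=[0.229, 0.224, 0.225]),
])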

Common image augmentations: flip, rotate, crop, scale, color jitter, blur, cutout (masking a random patch), and mixup (blending two images and their labels).
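
Mixup is the least self-explanatory item in that list: it blends two training examples and their labels using a weight drawn from a Beta distribution. The sketch below is a dependency-free illustration on flat lists of floats. The function name `mixup_pair`, the `alpha` default, and the one-hot label format are assumptions made for this example, not part of torchvision.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Blend two examples and their one-hot labels (illustrative helper)."""
    # Mixing weight lam ~ Beta(alpha, alpha); a small alpha keeps lam near 0 or 1.
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Two tiny "images" flattened to pixel lists, with one-hot labels for 2 classes.
rng = random.Random(0)
x, y, lam = mixup_pair([0.0, 0.0, 0.0], [1.0, 0.0],
                       [1.0, 1.0, 1.0], [0.0, 1.0], rng=rng)
```

Because the label is blended with the same weight as the input, the model has to be trained with a loss that accepts soft targets.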

Text augmentation

  • Synonym replacement
  • Random insertion/deletion/swap
  • Back-translation (translate to another language and back)
  • Paraphrasing with LLMs
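
The random deletion and swap operations listed above can be sketched with the standard library alone. The helper names `random_deletion` and `random_swap`, the whitespace tokenization, and the default probabilities are assumptions for illustration. Synonym replacement and back-translation need external resources (a thesaurus or a translation model) and are not shown.

```python
import random


def random_deletion(tokens, p=0.1, rng=random):
    """Drop each token with probability p, always keeping at least one."""
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() >= p]
    # Avoid returning an empty sentence.
    return kept if kept else [rng.choice(tokens)]


def random_swap(tokens, n_swaps=1, rng=random):
    """Swap the positions of two randomly chosen tokens, n_swaps times."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


rng = random.Random(42)
tokens = "the quick brown fox jumps over the lazy dog".split()
deleted = random_deletion(tokens, p=0.2, rng=rng)
swapped = random_swap(tokens, n_swaps=2, rng=rng)
```

Keeping `p` and `n_swaps` small matters here: aggressive edits can change the meaning of a sentence and with it the correct label.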

Key rules

  • Only augment training data, never validation/test
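  • Augmentations should be plausible and label-preserving (don’t flip digits upside down: a 6 turned upside down looks like a 9)
  • Augmentation helps most when data is scarce; the smaller the dataset, the larger the benefit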
  • Combined with Transfer Learning, enables training on very small datasets
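
The first rule is easiest to honor by splitting before augmenting, so that no modified copy of a validation example can leak into the training set. The following is a minimal sketch under assumed names: `train_val_split`, `augment_training_set`, and the toy `add_noise` function are hypothetical helpers written for this example, not taken from any library.

```python
import random

_noise_rng = random.Random(1)  # fixed seed so the example is repeatable


def train_val_split(samples, val_fraction=0.2, seed=0):
    """Shuffle once, then hold out a validation slice that is never augmented."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def augment_training_set(train, augment_fn, copies=2):
    """Return the original training pairs plus `copies` augmented variants each."""
    augmented = list(train)
    for features, label in train:
        for _ in range(copies):
            augmented.append((augment_fn(features), label))
    return augmented


def add_noise(features, scale=0.05):
    """Toy label-preserving augmentation: small Gaussian jitter per feature."""
    return [v + _noise_rng.gauss(0.0, scale) for v in features]


data = [([float(i), float(i) * 2.0], i % 2) for i in range(10)]
train, val = train_val_split(data)
train_aug = augment_training_set(train, add_noise, copies=2)
```

In a torchvision pipeline the same idea usually appears as two separate transforms: a randomized one for the training dataset and a deterministic one for validation and test.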