Autoencoders

What

Neural net trained to compress input → small representation (bottleneck) → reconstruct the input. The bottleneck forces the model to learn the most important features.

input → encoder → latent space (bottleneck) → decoder → reconstructed input

Loss = reconstruction error (how different the output is from the input).
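The encode → decode → compare pipeline can be sketched with plain numpy. The weights here are untrained random matrices (purely illustrative, just to show the shapes and the MSE reconstruction loss); a real autoencoder would learn them by gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear autoencoder: 8-dim input -> 2-dim bottleneck -> 8-dim output.
# W_enc / W_dec are hypothetical untrained weights, shown only for shapes.
W_enc = rng.normal(size=(8, 2))
W_dec = rng.normal(size=(2, 8))

x = rng.normal(size=(4, 8))   # batch of 4 inputs
z = x @ W_enc                 # latent codes, shape (4, 2) -- the bottleneck
x_hat = z @ W_dec             # reconstructions, shape (4, 8)

# Reconstruction loss: mean squared error between input and output.
loss = np.mean((x - x_hat) ** 2)
```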

Variants

Vanilla Autoencoder

Basic compression. Bottleneck forces dimensionality reduction.

Variational Autoencoder (VAE)

Encoder outputs the parameters of a probability distribution (typically a Gaussian mean and variance), not a single point. Can generate new data by sampling from the latent distribution and decoding.
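The sampling step uses the reparameterization trick so gradients can flow through it. A minimal numpy sketch, with hypothetical encoder outputs for a 2-dim latent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder outputs for one input: mean and log-variance of a
# 2-dim Gaussian latent, rather than a single point.
mu = np.array([0.5, -1.0])
log_var = np.array([-0.5, 0.2])

# Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
# which keeps the sample differentiable w.r.t. mu and log_var.
eps = rng.standard_normal(2)
z = mu + np.exp(0.5 * log_var) * eps

# KL term of the VAE loss, pushing the latent toward a standard normal.
kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

# Generation: decode a fresh sample drawn from the prior N(0, I).
z_new = rng.standard_normal(2)
```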

Denoising Autoencoder

Add noise to input, train to reconstruct clean version → learns robust features.
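The key detail is that the loss compares the reconstruction of the noisy input against the clean original. A sketch of the training setup (`reconstruct` stands in for decoder(encoder(x)) and is just the identity here; a trained model would learn to undo the noise):

```python
import numpy as np

rng = np.random.default_rng(0)

x_clean = rng.normal(size=(4, 8))                       # original inputs
x_noisy = x_clean + 0.3 * rng.standard_normal((4, 8))   # corrupted copies

def reconstruct(x):
    # Placeholder for decoder(encoder(x)); identity only for illustration.
    return x

# The model sees x_noisy but the target is x_clean.
loss = np.mean((reconstruct(x_noisy) - x_clean) ** 2)
```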

Use cases

  • Dimensionality reduction: alternative to PCA (can capture nonlinear structure)
  • Anomaly detection: train on normal data, anomalies have high reconstruction error
  • Generation: VAEs generate new data (images, molecules)
  • Denoising: remove noise from images/signals
  • Pretraining: learn representations, then fine-tune for a task
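The anomaly-detection use case above reduces to thresholding per-sample reconstruction error. The error values and threshold below are made up for illustration; in practice the threshold is tuned on held-out normal data (e.g. mean + 3 standard deviations of normal-data errors).

```python
import numpy as np

# Hypothetical per-sample reconstruction errors from a model trained only
# on normal data: normal samples reconstruct well, anomalies do not.
errors = np.array([0.02, 0.03, 0.01, 0.85, 0.02])

threshold = 0.5  # assumed value; tune on validation data in practice
is_anomaly = errors > threshold
# is_anomaly -> [False, False, False, True, False]
```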