Loss Functions

What

A function that measures how far the model’s predictions are from the targets. Training = minimizing this function over the model’s parameters.

Regression losses

Loss                         Formula                                     Properties
MSE (Mean Squared Error)     mean((y - ŷ)²)                              Penalizes large errors heavily, sensitive to outliers
MAE (Mean Absolute Error)    mean(|y - ŷ|)                               Robust to outliers, not differentiable where y = ŷ
Huber                        MSE-like for |y - ŷ| ≤ δ, MAE-like beyond   Smooth near zero, robust to outliers; the threshold δ must be chosen
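
A minimal pure-Python sketch of the three regression losses, to make the outlier behaviour concrete. The function names, the default delta=1.0, and the toy data are illustrative assumptions, not taken from any library.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def huber(y_true, y_pred, delta=1.0):
    """Mean Huber loss: 0.5*e^2 if |e| <= delta, else delta*(|e| - 0.5*delta)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        e = abs(y - p)
        total += 0.5 * e * e if e <= delta else delta * (e - 0.5 * delta)
    return total / len(y_true)


# Toy data (assumed for illustration only).
y_true = [1.0, 2.0, 3.0, 4.0]
clean = [1.5, 2.5, 2.5, 3.5]      # every prediction is off by 0.5
outlier = [1.5, 2.5, 2.5, 14.0]   # last prediction is off by 10

for name, loss in [("MSE", mse), ("MAE", mae), ("Huber", huber)]:
    print(name, loss(y_true, clean), loss(y_true, outlier))
```

With this data a single bad prediction multiplies MSE by roughly a hundred (0.25 to about 25.2), while MAE and Huber grow far less (to about 2.9 and 2.5).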

Classification losses

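Loss                         Formula                              When
Binary cross-entropy         -[y·log(ŷ) + (1-y)·log(1-ŷ)]         Binary classification (y ∈ {0, 1}, ŷ = predicted probability)
Categorical cross-entropy    -Σ yᵢ·log(ŷᵢ)                        Multi-class classification (y one-hot, ŷ = class probabilities)
Hinge loss                   max(0, 1 - y·ŷ)                      SVM (y ∈ {-1, +1}, ŷ = raw score)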
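
The same formulas as a rough per-example sketch in plain Python. The function names and the eps clipping constant are assumptions added for illustration; clipping is a common way to avoid log(0), but it is not part of the formulas themselves.

```python
import math


def binary_cross_entropy(y, p, eps=1e-12):
    """y in {0, 1}; p is the predicted probability of class 1."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def categorical_cross_entropy(y_onehot, probs, eps=1e-12):
    """y_onehot is a one-hot target; probs are the predicted class probabilities."""
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_onehot, probs))


def hinge(y, score):
    """y in {-1, +1}; score is the raw (unsquashed) model output."""
    return max(0.0, 1.0 - y * score)


# A confident correct prediction costs little, a confident wrong one costs a lot.
print(binary_cross_entropy(1, 0.9), binary_cross_entropy(1, 0.1))
print(categorical_cross_entropy([0, 1, 0], [0.2, 0.7, 0.1]))
# Hinge is zero once the example is on the correct side with margin >= 1.
print(hinge(+1, 2.0), hinge(-1, 0.5))
```

Note the different label conventions: the cross-entropy losses expect probabilities and 0/1 (or one-hot) targets, whereas hinge loss expects a raw score and labels in {-1, +1}.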

How to choose

  • Regression with outliers → Huber or MAE
  • Standard regression → MSE
  • Classification → cross-entropy (almost always)
  • The loss function encodes your definition of “wrong”
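
One way to see how the loss encodes "wrong": if the model may only output a single constant, MSE is minimized by the mean of the targets and MAE by the median. The sketch below checks this with a brute-force scan. The data and the 0.1 candidate grid are assumptions chosen for illustration.

```python
import statistics


def mse_const(c, ys):
    """MSE when every prediction is the constant c."""
    return sum((y - c) ** 2 for y in ys) / len(ys)


def mae_const(c, ys):
    """MAE when every prediction is the constant c."""
    return sum(abs(y - c) for y in ys) / len(ys)


ys = [1.0, 2.0, 3.0, 4.0, 100.0]   # one outlier

# Candidate constants 0.0, 0.1, ..., 100.0
candidates = [i / 10 for i in range(0, 1001)]
best_mse = min(candidates, key=lambda c: mse_const(c, ys))
best_mae = min(candidates, key=lambda c: mae_const(c, ys))

print("MSE minimizer:", best_mse, "mean:", statistics.mean(ys))
print("MAE minimizer:", best_mae, "median:", statistics.median(ys))
```

The outlier drags the MSE-optimal prediction up to the mean (22.0), while the MAE-optimal prediction stays at the median (3.0), which is why MAE and Huber are the usual picks when outliers are present.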

Connection to MLE

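Minimizing MSE = MLE assuming Gaussian noise with constant variance. Minimizing cross-entropy = MLE for classification, with a Bernoulli likelihood in the binary case and a categorical likelihood in the multi-class case. See Maximum Likelihood Estimation.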
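
A short derivation sketch of both claims. The symbols f_\theta (the model), \sigma (noise standard deviation), and n (number of examples) are notation assumed here, not defined elsewhere in these notes.

```latex
% Assumption: y_i = \hat{y}_i + \varepsilon_i with \hat{y}_i = f_\theta(x_i)
% and i.i.d. noise \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \sigma fixed.
\begin{align*}
-\log L(\theta)
  &= -\sum_{i=1}^{n} \log \left[ \frac{1}{\sqrt{2\pi\sigma^2}}
     \exp\!\left( -\frac{(y_i - \hat{y}_i)^2}{2\sigma^2} \right) \right] \\
  &= \frac{n}{2}\log(2\pi\sigma^2)
     + \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 .
\end{align*}
% The first term does not depend on \theta, so
\[
\arg\max_\theta L(\theta)
  = \arg\min_\theta \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2
  = \arg\min_\theta \mathrm{MSE}.
\]
% Binary classification: y_i \in \{0, 1\}, y_i \sim \mathrm{Bernoulli}(\hat{y}_i).
\[
-\log L(\theta)
  = -\sum_{i=1}^{n} \left[ y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \right],
\]
% which is the binary cross-entropy summed over the examples.
```

In both cases the loss is the negative log-likelihood up to terms and positive factors that do not depend on the parameters, so the minimizer is the same.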