# Evaluation Metrics
## Classification
| Metric | What it measures | When to use |
|---|---|---|
| Accuracy | % correct | Balanced classes only |
| Precision | Of predicted positives, how many are correct | When false positives are costly (spam filter) |
| Recall | Of actual positives, how many were found | When false negatives are costly (disease screening) |
| F1 | Harmonic mean of precision & recall | Imbalanced classes, need balance |
| AUC-ROC | Area under ROC curve | Comparing models, threshold-independent |
### Confusion matrix

|  | Predicted Pos | Predicted Neg |
|---|---|---|
| **Actual Pos** | TP | FN |
| **Actual Neg** | FP | TN |
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 × Precision × Recall / (Precision + Recall)
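These formulas are easy to verify by hand. A minimal sketch, using made-up confusion-matrix counts for illustration:

```python
# Toy confusion-matrix counts (made-up values for illustration)
TP, FP, FN = 8, 2, 4

precision = TP / (TP + FP)  # 8 / 10 = 0.8
recall = TP / (TP + FN)     # 8 / 12 ≈ 0.667
f1 = 2 * precision * recall / (precision + recall)  # ≈ 0.727

print(precision, recall, f1)
```

Note that F1 sits between precision and recall but closer to the smaller of the two, which is why it is preferred over a plain average on imbalanced classes.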
```python
from sklearn.metrics import classification_report, confusion_matrix

# Per-class precision, recall, F1, and support in one table
print(classification_report(y_true, y_pred))
print(confusion_matrix(y_true, y_pred))
```
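Unlike the metrics above, AUC-ROC is computed from probability scores rather than hard label predictions, which is what makes it threshold-independent. A small sketch with toy data:

```python
from sklearn.metrics import roc_auc_score

# Toy data: scores come from e.g. model.predict_proba(X)[:, 1],
# not from model.predict(X)
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

print(roc_auc_score(y_true, y_score))  # 0.75
```

An AUC of 0.75 here means that a randomly chosen positive outranks a randomly chosen negative 75% of the time (3 of the 4 positive/negative pairs are ordered correctly).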
## Regression
| Metric | What it measures | Notes |
|---|---|---|
| MSE | Average squared error | Penalizes large errors |
| RMSE | √MSE | Same units as target |
| MAE | Average absolute error | Robust to outliers |
| R² | Proportion of variance explained | 1.0 = perfect, 0 = predicts mean |
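All four regression metrics are available in scikit-learn (RMSE is just the square root of MSE). A minimal sketch with illustrative values:

```python
import math

from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Toy targets and predictions, just to show the calls
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

mse = mean_squared_error(y_true, y_pred)   # 0.375
rmse = math.sqrt(mse)                      # ≈ 0.612, same units as y
mae = mean_absolute_error(y_true, y_pred)  # 0.5
r2 = r2_score(y_true, y_pred)              # ≈ 0.949

print(mse, rmse, mae, r2)
```

Note how the single error of 1.0 contributes 1.0 to the MSE sum but only 1.0/4 to MAE: squaring is what makes MSE penalize large errors and MAE comparatively robust to outliers.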