Evaluation Metrics

Classification

Metric     What it measures                              When to use
Accuracy   % correct                                     Balanced classes only
Precision  Of predicted positives, how many are correct  When false positives are costly (spam filter)
Recall     Of actual positives, how many were found      When false negatives are costly (disease screening)
F1         Harmonic mean of precision & recall           Imbalanced classes, need balance
AUC-ROC    Area under ROC curve                          Comparing models, threshold-independent (see sketch below)
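
Unlike the other metrics, AUC-ROC is computed from predicted scores or probabilities rather than hard class labels. A minimal sketch with scikit-learn, assuming a fitted binary classifier clf and held-out data X_test, y_test (hypothetical names, not from the original):

from sklearn.metrics import roc_auc_score

# AUC-ROC takes scores, not labels: use the positive-class probability
y_score = clf.predict_proba(X_test)[:, 1]   # clf, X_test assumed to exist
print(roc_auc_score(y_test, y_score))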

Confusion matrix

                Predicted
              Pos    Neg
Actual  Pos   TP     FN
        Neg   FP     TN

Precision = TP / (TP + FP)
Recall    = TP / (TP + FN)
F1        = 2 × Precision × Recall / (Precision + Recall)

from sklearn.metrics import classification_report, confusion_matrix

# Per-class precision, recall, F1, and support counts
print(classification_report(y_true, y_pred))
# Note: sklearn sorts labels ascending, so for labels {0, 1} the output
# is [[TN, FP], [FN, TP]], positive class last, unlike the diagram above
print(confusion_matrix(y_true, y_pred))
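
To make the formulas concrete, here is a hand computation (the counts TP=8, FP=2, FN=4 are purely illustrative):

tp, fp, fn = 8, 2, 4                                 # made-up counts
precision = tp / (tp + fp)                           # 8/10 = 0.80
recall = tp / (tp + fn)                              # 8/12 ≈ 0.67
f1 = 2 * precision * recall / (precision + recall)   # ≈ 0.73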

Regression

Metric  What it measures                  Notes
MSE     Average squared error             Penalizes large errors
RMSE    √MSE                              Same units as target
MAE     Average absolute error            Robust to outliers
R²      Proportion of variance explained  1.0 = perfect, 0 = predicts mean
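
A minimal sketch computing all four with scikit-learn and NumPy, assuming arrays y_true and y_pred of actual and predicted values:

import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

mse = mean_squared_error(y_true, y_pred)    # average squared error
rmse = np.sqrt(mse)                         # back to the target's units
mae = mean_absolute_error(y_true, y_pred)   # robust to outliers
r2 = r2_score(y_true, y_pred)               # R²: 1.0 = perfect, 0 = predicts mean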