# Feature Scaling

## What
Putting features on similar scales so no single feature dominates due to its numeric range.
## Methods
| Method | Formula | Range | When to use |
|---|---|---|---|
| StandardScaler | (x - mean) / std | ~[-3, 3] | Default choice, works with most models |
| MinMaxScaler | (x - min) / (max - min) | [0, 1] | Neural nets, when you need bounded values |
| RobustScaler | (x - median) / IQR | varies | When you have outliers |
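To make the three formulas concrete, here is a minimal numpy sketch (the array and its outlier are made up for illustration). Note how the outlier stretches the standard and min-max results but barely moves the robust ones:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # 100 is an outlier

# StandardScaler: (x - mean) / std — outlier inflates both mean and std
standard = (x - x.mean()) / x.std()

# MinMaxScaler: (x - min) / (max - min) — outlier defines the max,
# squashing the normal values toward 0
minmax = (x - x.min()) / (x.max() - x.min())

# RobustScaler: (x - median) / IQR — median and IQR ignore the outlier
q1, q3 = np.percentile(x, [25, 75])
robust = (x - np.median(x)) / (q3 - q1)
```

Here `minmax` is `[0, 0.0101, 0.0202, 0.0303, 1]`: the four ordinary values are crushed into 3% of the range, while `robust` keeps them evenly spaced at `[-1, -0.5, 0, 0.5]` and pushes only the outlier far out.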
```python
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit on TRAIN only
X_test_scaled = scaler.transform(X_test)        # transform test with train params
```

## Which models need it?
| Needs scaling | Doesn’t need scaling |
|---|---|
| Linear/logistic regression | Decision trees |
| SVM | Random forests |
| KNN | Gradient boosted trees (XGBoost, LightGBM) |
| Neural networks | Naive Bayes |
| PCA | |
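The right column makes sense once you notice that trees only ever compare a feature against a threshold, so splits depend on the *ordering* of values, not their magnitude. A tiny numpy sketch of that invariance (the array is made up for illustration):

```python
import numpy as np

x = np.array([3.0, 250.0, 7.0, 10.0])

# Min-max scaling is a monotonic transformation ...
x_scaled = (x - x.min()) / (x.max() - x.min())

# ... so the ordering of values, and hence every possible
# threshold split a tree could make, is unchanged
order_raw = np.argsort(x)
order_scaled = np.argsort(x_scaled)
```

Distance-based models (KNN, SVM) and gradient-trained models (linear/logistic regression, neural nets) are a different story: distances and gradient magnitudes change with scale, so an unscaled wide-range feature dominates.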
## Critical rule
Fit the scaler on training data only, then reuse those same parameters to transform the test data. Fitting on all the data lets test-set statistics leak into training: data leakage.
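To see that the rule actually matters, here is a numpy sketch (random data for illustration) comparing the correct procedure with the leaky one; the two scaled test sets come out different because the leaky version's mean and std were contaminated by the test split:

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(50, 10, size=200)
X_test = rng.normal(55, 12, size=50)   # test distribution drifts a bit

# Correct: parameters come from the training split only
mean, std = X_train.mean(), X_train.std()
X_test_correct = (X_test - mean) / std

# Leaky: parameters computed on train + test combined
X_all = np.concatenate([X_train, X_test])
X_test_leaky = (X_test - X_all.mean()) / X_all.std()
```

The difference is exactly what cross-validation scores would silently absorb. In sklearn, putting the scaler inside a `Pipeline` enforces this automatically, since `fit` is only ever called on the training fold.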