Polynomial Regression
What
Linear regression with polynomial features. Fit curves instead of lines.
ŷ = w₁x + w₂x² + w₃x³ + b
Still “linear” in the weights — just nonlinear in the features.
```python
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

model = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
model.fit(X_train, y_train)
```

When to use
- Residual plots from linear regression show a clear curve, meaning the relationship isn't linear (see the sketch after this list)
- Low-dimensional data (1-3 features). With many features, polynomial expansion creates too many terms
- You want an interpretable model — coefficients still have meaning (unlike tree-based models)
- Quick baseline before trying more complex nonlinear methods
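A minimal way to run that residual check. The quadratic data here is made up for illustration; swap in your own `X` and `y`:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# toy data with a quadratic relationship (illustrative only)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X.ravel() ** 2 + rng.normal(0, 0.5, size=200)

lin = LinearRegression().fit(X, y)
residuals = y - lin.predict(X)

# a curved band around zero (rather than random scatter) suggests
# the linear model is missing structure; try polynomial features
plt.scatter(X, residuals, s=10)
plt.axhline(0, color="red")
plt.xlabel("x")
plt.ylabel("residual")
plt.show()
```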
Interaction terms
PolynomialFeatures also creates interaction terms like x₁ * x₂. This captures how features combine, not just individual nonlinearity.
```python
# interaction_only=True skips pure powers (x², x³), keeps only cross terms
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree=2, interaction_only=True)
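
# quick illustrative check (not in the original note): expand a toy row
import numpy as np
print(poly.fit_transform(np.array([[2.0, 3.0]])))  # [[1. 2. 3. 6.]]
print(poly.get_feature_names_out())                # ['1' 'x0' 'x1' 'x0 x1']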
# [x1, x2] → [1, x1, x2, x1*x2]
```

Watch out
- High degree → overfitting (fits training noise)
- Features explode: degree 3 with 10 features → 286 features
- Use Regularization (Ridge/Lasso) with polynomial features (see the sketch below)
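A sketch of that combination, reusing `X_train`/`y_train` from the first snippet. `include_bias=False` drops the constant column since Ridge fits its own intercept, and `alpha=1.0` is just a starting point to tune:

```python
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# scale after expansion so the penalty treats x, x², x³ comparably
model = make_pipeline(
    PolynomialFeatures(degree=3, include_bias=False),
    StandardScaler(),
    Ridge(alpha=1.0),  # tune alpha with cross-validation
)
model.fit(X_train, y_train)
```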
Polynomial regression vs decision trees
| Aspect | Polynomial regression | Decision trees |
|---|---|---|
| Nonlinearity | Smooth curves | Step functions |
| Extrapolation | Wild swings outside training range | Flat (predicts last seen value) |
| Interpretability | Coefficients have meaning | Visual tree splits |
| Feature scaling | Needed (features have different magnitudes) | Not needed |
| Best for | Smooth, continuous relationships | Complex interactions, categorical data |
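The extrapolation row is easy to demonstrate. A minimal sketch with made-up data, querying a degree-9 fit outside its training range:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# toy data: noisy sine on [0, 1]
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(30, 1))
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.1, size=30)

model = make_pipeline(PolynomialFeatures(degree=9), LinearRegression())
model.fit(X, y)

print(model.predict([[0.5]]))  # inside the training range: sensible
print(model.predict([[2.0]]))  # outside: typically swings to a huge value
```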
Alternatives
- Splines: piecewise polynomials joined at knots. Smoother than high-degree polynomials, less prone to oscillation at boundaries (sketch after this list)
- Kernel methods: project data into high-dimensional space implicitly (see Support Vector Machines). No explicit feature expansion
- GAMs (Generalized Additive Models): fit a smooth function per feature, then add them up. Interpretable and flexible
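For the spline option, scikit-learn ships `SplineTransformer`, which drops into the same pipeline pattern. `n_knots=5` and `alpha=1.0` below are illustrative choices, not recommendations:

```python
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer

# cubic B-spline basis per feature, then a regularized linear fit
model = make_pipeline(
    SplineTransformer(degree=3, n_knots=5),
    Ridge(alpha=1.0),
)
model.fit(X_train, y_train)
```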
Links
- Linear Regression
- Regularization
- Bias-Variance Tradeoff
- Decision Trees — alternative for nonlinear patterns