Linear Regression
What
Predict a number by fitting a line (or hyperplane) through the data.
ŷ = w₁x₁ + w₂x₂ + ... + wₙxₙ + b
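The prediction is just a dot product plus a bias. A minimal sketch with made-up weights (the numbers here are purely illustrative):

```python
import numpy as np

# Hypothetical weights and bias for a 3-feature model
w = np.array([2.0, -1.0, 0.5])
b = 4.0

x = np.array([1.0, 3.0, 2.0])  # one sample with features x1..x3
y_hat = np.dot(w, x) + b       # w1*x1 + w2*x2 + w3*x3 + b
print(y_hat)                   # 2*1 + (-1)*3 + 0.5*2 + 4 = 4.0
```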
Why start here
It’s the simplest model and introduces every core ML concept: loss functions, gradient descent, overfitting, regularization. Understand linear regression deeply and everything else follows.
Training
Minimize mean squared error (MSE): find the weights w and bias b that minimize mean((y - ŷ)²).
Two approaches:
- Normal equation: w = (XᵀX)⁻¹Xᵀy — closed-form, exact, but expensive for large data
- Gradient descent: iteratively update weights — scales to any size
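Both approaches can be sketched in a few lines of NumPy. This is a toy example with made-up true weights [3, -2] and bias 1, comparing gradient descent against the closed-form least-squares solution:

```python
import numpy as np

# Toy data with known weights — purely illustrative
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([3.0, -2.0]) + 1.0 + rng.normal(scale=0.1, size=200)

# Gradient descent on MSE
w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(500):
    err = X @ w + b - y               # ŷ - y
    w -= lr * 2 * X.T @ err / len(y)  # ∂MSE/∂w
    b -= lr * 2 * err.mean()          # ∂MSE/∂b

# Normal equation on the same data (bias folded in as a column of ones)
Xb = np.column_stack([X, np.ones(len(X))])
w_exact = np.linalg.lstsq(Xb, y, rcond=None)[0]
print(w, b)        # both methods recover ≈ [3, -2] and b ≈ 1
print(w_exact)
```

Note `lstsq` is preferred over literally inverting XᵀX, which is numerically fragile.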
from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
# Coefficients tell you feature importance (if features are scaled)
print(model.coef_)
print(model.intercept_)
Assumptions
- Linear relationship between features and target
- Features are not highly correlated with each other (no multicollinearity)
- Errors are normally distributed with constant variance
When assumptions are violated: try Polynomial Regression, tree-based models, or neural nets.
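For the linearity violation specifically, polynomial features are the smallest possible fix. A sketch on toy quadratic data (data and degree are assumptions for illustration):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Toy quadratic data — a plain line underfits this badly
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(100, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.1, size=100)

# Expand features to [x, x²], then fit an ordinary linear model on them
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
print(model.score(X, y))  # R² close to 1 on this data
```

It is still linear regression — linear in the expanded features, not in x.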
Regularized variants
- Ridge (L2): shrinks coefficients, handles multicollinearity
- Lasso (L1): shrinks + eliminates features (automatic feature selection)
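The contrast shows up clearly on collinear data. A sketch with two nearly duplicate features plus one irrelevant one (the data and alpha values are assumptions, not tuned):

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

# Feature 1 ≈ feature 0 (collinear); feature 2 is pure noise
rng = np.random.default_rng(2)
x = rng.normal(size=200)
X = np.column_stack([x, x + rng.normal(scale=0.01, size=200),
                     rng.normal(size=200)])
y = 2 * x + rng.normal(scale=0.1, size=200)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)
print(ridge.coef_)  # weight spread across the correlated pair
print(lasso.coef_)  # some coefficients driven exactly to zero
```

Ridge splits the weight between the duplicates; Lasso tends to keep one and zero out the rest, which is the "automatic feature selection" behavior.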
Links
- Loss Functions — MSE
- Gradient Descent — how it’s trained
- Regularization — Ridge, Lasso
- Logistic Regression — classification version
- Polynomial Regression