Gradient Boosting

What

Build trees sequentially: each new tree corrects the errors of the previous ensemble. Powerful, often wins competitions.

How it works

  1. Start with a simple prediction (e.g., mean)
  2. Compute residuals (errors; for squared loss, these are the negative gradients)
  3. Train a small tree to predict the residuals
  4. Add that tree’s predictions (scaled by learning rate) to the ensemble
  5. Repeat

Each tree fixes what the previous ones got wrong → powerful additive model.
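The loop above can be sketched from scratch for regression with squared loss (where residuals are the negative gradients). This is a minimal illustration, not a production implementation: the weak learner is sklearn's DecisionTreeRegressor, and the function names and toy sine data are made up for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gbm(X, y, n_trees=50, learning_rate=0.1, max_depth=2):
    """Gradient boosting for regression with squared loss (sketch)."""
    base = y.mean()                      # step 1: start from the mean
    pred = np.full(len(y), base)
    trees = []
    for _ in range(n_trees):
        residuals = y - pred             # step 2: what the ensemble still gets wrong
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)           # step 3: small tree trained on residuals
        pred += learning_rate * tree.predict(X)  # step 4: shrunken update
        trees.append(tree)               # step 5: repeat
    return base, trees

def predict_gbm(base, trees, X, learning_rate=0.1):
    pred = np.full(len(X), base)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred

# Toy data: the ensemble should fit a sine curve far better than the mean alone.
X = np.linspace(0, 1, 200).reshape(-1, 1)
y = np.sin(2 * np.pi * X.ravel())
base, trees = fit_gbm(X, y)
```

Each iteration shrinks the remaining error, which is why lowering the learning rate generally requires more trees.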

Libraries

Library                   Notes
XGBoost                   The classic; fast, handles missing values
LightGBM                  Faster than XGBoost, good for large data
CatBoost                  Best for categorical features, no encoding needed
sklearn GradientBoosting  Slower, but consistent API
Example with XGBoost (assumes X_train and y_train are already defined):

from xgboost import XGBClassifier

model = XGBClassifier(
    n_estimators=200,    # number of boosting rounds (trees)
    max_depth=6,         # depth of each tree
    learning_rate=0.1,   # shrinkage applied to each tree's contribution
    subsample=0.8,       # fraction of rows sampled per tree
)
model.fit(X_train, y_train)

Key hyperparameters

  • n_estimators: number of trees (more = better, up to a point)
  • learning_rate: how much each tree contributes (lower = needs more trees)
  • max_depth: depth of each tree (3-8 usually)
  • subsample: fraction of data per tree (like bagging, reduces overfitting)
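The learning_rate / n_estimators trade-off can be seen directly: a small learning rate with many trees typically matches or beats a large learning rate with few trees. A hedged sketch using sklearn's GradientBoostingClassifier; the synthetic dataset and the two settings compared are illustrative choices, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Illustrative synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores = {}
# High learning rate / few trees vs low learning rate / many trees
for lr, n in [(0.5, 20), (0.05, 200)]:
    model = GradientBoostingClassifier(
        n_estimators=n, learning_rate=lr, max_depth=3,
        subsample=0.8, random_state=0,
    )
    model.fit(X_tr, y_tr)
    scores[(lr, n)] = model.score(X_te, y_te)
print(scores)
```

In practice the number of trees is usually chosen with early stopping on a validation set rather than fixed up front.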

Gradient Boosting vs Random Forest

             Random Forest             Gradient Boosting
Trees        Independent, parallel     Sequential, corrective
Overfitting  Hard to overfit           Can overfit if not tuned
Tuning       Works well with defaults  Needs careful tuning
Speed        Fast to train             Slower (sequential)
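The "works well with defaults" row is easy to check empirically: fit both sklearn models with default settings on the same split. The synthetic dataset here is illustrative; exact scores will vary by data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

# Illustrative synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

# Both models with default hyperparameters
rf_acc = RandomForestClassifier(random_state=1).fit(X_tr, y_tr).score(X_te, y_te)
gb_acc = GradientBoostingClassifier(random_state=1).fit(X_tr, y_tr).score(X_te, y_te)
print(f"Random Forest: {rf_acc:.3f}  Gradient Boosting: {gb_acc:.3f}")
```

Random forest's defaults are usually competitive out of the box, while gradient boosting tends to pull ahead only once learning_rate, depth, and tree count are tuned.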