Gradient Boosting
Gradient Boosting is a powerful ensemble technique that builds models sequentially, each new model correcting the errors of its predecessors. Concretely, each weak learner is fit to the negative gradient of the loss with respect to the current ensemble's predictions; for squared-error loss, this is simply the residuals.
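The fit-to-the-residuals loop can be sketched directly. This is a minimal illustration for squared-error loss, assuming scikit-learn's DecisionTreeRegressor as the weak learner; the function names `boost` and `boost_predict` are just for this example:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boost(X, y, n_rounds=50, lr=0.1):
    """Minimal gradient boosting for squared-error loss:
    each tree is fit to the current residuals, and its
    shrunken predictions are added to the ensemble."""
    pred = np.full(len(y), y.mean())  # start from the constant mean prediction
    trees = []
    for _ in range(n_rounds):
        residuals = y - pred          # negative gradient of squared error
        tree = DecisionTreeRegressor(max_depth=3).fit(X, residuals)
        pred += lr * tree.predict(X)  # shrink each tree's contribution
        trees.append(tree)
    return y.mean(), trees

def boost_predict(base, trees, X, lr=0.1):
    """Sum the base prediction and each tree's shrunken contribution."""
    pred = np.full(len(X), base)
    for tree in trees:
        pred += lr * tree.predict(X)
    return pred
```

Library implementations add many refinements (line search for leaf values, arbitrary differentiable losses, subsampling), but the additive fit-to-residuals structure is the same.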
When to use Gradient Boosting
- Tabular data with a mix of numeric and categorical features
- When high predictive accuracy is required
- Regression or classification problems
Key Hyper‑parameters
- n_estimators – number of boosting rounds
- learning_rate – shrinkage factor (0 < lr ≤ 1)
- max_depth – depth of each weak learner (tree)
- subsample – fraction of samples used for each tree (helps reduce over‑fitting)
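These hyper-parameters interact (a lower learning rate usually needs more estimators, for example), so one common approach is to tune them jointly with cross-validation. A small sketch using scikit-learn's GridSearchCV; the grid values here are illustrative, not recommendations for every dataset:

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

X, y = load_diabetes(return_X_y=True)

# Illustrative grid over the hyper-parameters listed above.
grid = {
    "n_estimators": [100, 200],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
    "subsample": [0.8, 1.0],
}
search = GridSearchCV(
    GradientBoostingRegressor(random_state=42),
    grid,
    cv=3,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)
print(search.best_params_)
```

For larger grids or datasets, RandomizedSearchCV explores the space more cheaply than an exhaustive grid.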
Python Example
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

# load_boston was removed in scikit-learn 1.2; load_diabetes is a
# bundled tabular regression dataset that works as a drop-in here.
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

gbr = GradientBoostingRegressor(
    n_estimators=200,    # number of boosting rounds
    learning_rate=0.1,   # shrinkage applied to each tree's contribution
    max_depth=3,         # depth of each weak learner
    subsample=0.9,       # fraction of rows sampled per tree
    random_state=42,
)
gbr.fit(X_train, y_train)
pred = gbr.predict(X_test)

# mean_squared_error's squared=False option was removed in recent
# scikit-learn versions, so take the square root explicitly.
print("RMSE:", mean_squared_error(y_test, pred) ** 0.5)
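Rather than guessing n_estimators up front, one option is to train with a generous number of rounds and use scikit-learn's staged_predict, which yields the ensemble's predictions after each boosting round, to find where held-out error bottoms out. A sketch, reusing the same diabetes dataset for illustration:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

gbr = GradientBoostingRegressor(
    n_estimators=500, learning_rate=0.1, max_depth=3, random_state=42
)
gbr.fit(X_train, y_train)

# staged_predict yields predictions after each boosting round,
# letting us locate the round with the lowest held-out error.
test_mse = [mean_squared_error(y_test, p) for p in gbr.staged_predict(X_test)]
best_round = int(np.argmin(test_mse)) + 1
print("best number of rounds:", best_round)
```

GradientBoostingRegressor also supports early stopping directly via its n_iter_no_change and validation_fraction parameters, which stops training once an internal validation score stops improving.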