Gradient Boosting Explained

What is Gradient Boosting?

Gradient Boosting is an ensemble technique that builds models sequentially, with each new model trained to correct the errors of its predecessors. Because each model is fit to the negative gradient of the loss function, adding it to the ensemble amounts to a step of gradient descent in function space: every iteration moves the combined prediction in the direction that most decreases the loss.
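
As a concrete example (the standard squared-error case, not anything specific to this post): with

    L(y_i, F(x_i)) = ½·(y_i - F(x_i))²,

the negative gradient with respect to the current prediction is

    -∂L/∂F(x_i) = y_i - F(x_i),

i.e. exactly the residual, so "fitting the negative gradient" reduces to fitting each new model to what the current ensemble still gets wrong.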

How It Works

The core idea is simple:

  1. Start with an initial prediction (often the mean for regression or log‑odds for classification).
  2. Compute the pseudo-residuals (the negative gradient of the loss with respect to the current predictions).
  3. Fit a weak learner (usually a shallow decision tree) to these residuals.
  4. Update the ensemble by adding the new learner, scaled by a learning rate.
  5. Repeat for a fixed number of iterations or until convergence.

Algorithm Pseudocode

F_0(x) = argmin_γ Σ_i L(y_i, γ)
for m = 1 to M:
    r_i = -[∂L(y_i, F(x_i)) / ∂F(x_i)] at F = F_{m-1}      # pseudo-residuals
    h_m = FitTree({(x_i, r_i)})                            # weak learner fit to the pseudo-residuals
    γ_m = argmin_γ Σ_i L(y_i, F_{m-1}(x_i) + γ·h_m(x_i))   # line search for the step size
    F_m(x) = F_{m-1}(x) + η·γ_m·h_m(x)                     # η: learning rate (shrinkage)
return F_M(x)
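
To ground the pseudocode, here is a minimal from-scratch sketch in Python for the squared-error case. It is illustrative rather than a reference implementation: the function name fit_gradient_boosting and the hyperparameter values are made up for this example, and scikit-learn's DecisionTreeRegressor stands in as the weak learner. For squared error the tree's leaf means already minimize the loss on the residuals, so the explicit line search for γ_m can be dropped.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gradient_boosting(X, y, n_estimators=100, learning_rate=0.1, max_depth=2):
    # Step 1: the initial prediction minimizing squared error is the mean.
    f0 = y.mean()
    pred = np.full(len(y), f0)
    trees = []
    for _ in range(n_estimators):
        # Step 2: pseudo-residuals; for squared error this is just y - F(x).
        residuals = y - pred
        # Step 3: fit a shallow tree to the pseudo-residuals.
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)
        # Step 4: add the new learner, scaled by the learning rate.
        pred += learning_rate * tree.predict(X)
        trees.append(tree)
    return f0, trees

def predict(X, f0, trees, learning_rate=0.1):
    pred = np.full(X.shape[0], f0)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred

Shrinking learning_rate while raising n_estimators usually generalizes better at the cost of training time, which is why the two are almost always tuned together.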

Example & Visualization

Below is a simple demonstration of how the training loss decreases as boosting iterations are added.
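
No figure survives here in plain text, so as a stand-in here is a small sketch using scikit-learn's GradientBoostingRegressor; the synthetic data from make_regression and all hyperparameter values are illustrative. staged_predict yields the ensemble's predictions after each boosting iteration, so printing the training MSE at each stage traces the curve the plot would show.

from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

# Illustrative synthetic regression problem.
X, y = make_regression(n_samples=500, noise=10.0, random_state=0)

model = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, max_depth=2)
model.fit(X, y)

# staged_predict returns predictions after each boosting iteration,
# so the printed MSE should fall as trees are added.
for m, pred in enumerate(model.staged_predict(X), start=1):
    if m % 20 == 0:
        print(f"iteration {m:3d}  train MSE = {mean_squared_error(y, pred):.1f}")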

Practical Tips