Handling Missing Values – Imputation
```python
from sklearn.impute import SimpleImputer
import numpy as np

X = np.array([[1, 2], [np.nan, 3], [7, np.nan]])

# Mean imputation
mean_imp = SimpleImputer(strategy='mean')
X_mean = mean_imp.fit_transform(X)

# Median imputation
median_imp = SimpleImputer(strategy='median')
X_median = median_imp.fit_transform(X)

print("Mean Imputed:\n", X_mean)
print("Median Imputed:\n", X_median)
```
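Imputation replaces missing values (`np.nan`) with substitute values that are either computed from the observed data or supplied by the user. `SimpleImputer` supports the strategies `mean`, `median`, `most_frequent`, and `constant`, and applies them column by column. The choice depends on the data distribution and on how sensitive the model is to distorted values: the median is more robust than the mean for skewed columns or columns with outliers, `most_frequent` also works for categorical features, and `constant` fills every gap with a fixed `fill_value`. In the toy array above each column has only two observed values, so the mean and median coincide (4 and 2.5) and both imputers return the same result.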
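The remaining strategies can be sketched in the same way. The example below is a minimal illustration, not a recommended pipeline: the array `X_cat`, the new row `X_new`, and the choice of `fill_value=0` are assumptions made up for the demonstration. It also shows the fitted `statistics_` attribute, which holds the per-column fill values, and reuses a fitted imputer on unseen data.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Same toy matrix as in the example above: one missing value per column.
X = np.array([[1, 2], [np.nan, 3], [7, np.nan]])

# After fitting, statistics_ stores the fill value learned for each column.
mean_imp = SimpleImputer(strategy="mean").fit(X)
print("Learned column means:", mean_imp.statistics_)

# constant: every missing entry is replaced by fill_value (0 is an arbitrary choice here).
const_imp = SimpleImputer(strategy="constant", fill_value=0)
X_const = const_imp.fit_transform(X)
print("Constant Imputed:\n", X_const)

# most_frequent: hypothetical data in which each column has a clear mode.
X_cat = np.array([[1, 2], [1, np.nan], [np.nan, 2], [7, 3]])
mode_imp = SimpleImputer(strategy="most_frequent")
X_mode = mode_imp.fit_transform(X_cat)
print("Most Frequent Imputed:\n", X_mode)

# A fitted imputer can fill new rows using the statistics learned from X.
X_new = np.array([[np.nan, np.nan]])  # hypothetical unseen row
X_new_filled = mean_imp.transform(X_new)
print("New row filled with training means:\n", X_new_filled)
```

Fitting on one dataset and calling `transform` on another is the usual way to keep fill values learned from training data from being recomputed on test data.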