What is Supervised Learning?
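Supervised learning algorithms learn a mapping from input features to an output (a class label or a numeric target) from a labeled dataset, that is, a set of examples for which the correct output is already known. They are widely used for classification (predicting a discrete category) and regression (predicting a continuous value).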
Key Concepts
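- Training set: Data used to fit the model's parameters.
- Test set: Held-out data the model never sees during training, used to estimate performance on new inputs.
- Overfitting: The model captures noise in the training data instead of the underlying pattern, so it scores well on the training set but poorly on unseen data.
- Cross‑validation: Technique to assess generalization by splitting the data into k folds, training on k-1 of them, validating on the remaining fold, and averaging the scores across folds.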
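To make the overfitting idea concrete, here is a minimal sketch that compares training and test accuracy. The synthetic dataset, the choice of a decision tree, and all parameter values (`flip_y=0.2`, `max_depth=3`, the random seeds) are assumptions chosen for illustration, not part of the example that follows.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, deliberately noisy data: flip_y randomly reassigns the label
# of about 20% of the samples (illustrative values only).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5,
    flip_y=0.2, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# An unconstrained tree keeps splitting until every training sample is
# classified correctly, so it memorizes the label noise as well.
deep_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
deep_train_acc = deep_tree.score(X_train, y_train)
deep_test_acc = deep_tree.score(X_test, y_test)

# Limiting the depth restricts how much noise the tree can memorize.
shallow_tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
shallow_train_acc = shallow_tree.score(X_train, y_train)
shallow_test_acc = shallow_tree.score(X_test, y_test)

print(f"deep tree:    train={deep_train_acc:.3f}  test={deep_test_acc:.3f}")
print(f"shallow tree: train={shallow_train_acc:.3f}  test={shallow_test_acc:.3f}")
```

A large gap between training and test accuracy, as expected for the unconstrained tree here, is the usual symptom of overfitting. The constrained tree typically shows a smaller gap, though the exact numbers depend on the data and the seed.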
Example: Iris Classification
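We will train a logistic regression classifier to predict the species of an iris flower from four measurements (sepal length, sepal width, petal length, and petal width), using the Iris dataset bundled with scikit-learn.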
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
# Load data
iris = datasets.load_iris()
X, y = iris.data, iris.target
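# Split: hold out 30% for testing; stratify keeps class proportions equal in both sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)
# Scale: fit the scaler on the training set only, then reuse its statistics
# on the test set so no test information leaks into training
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)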
# Train
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)
# Predict
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
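A single train/test split gives only one accuracy estimate, and that estimate can shift with the random seed. As a hedged sketch of the cross-validation idea from Key Concepts, the same scaler and classifier can be wrapped in a pipeline and scored over five stratified folds. The fold count, the shuffling, and the seed are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Bundling the scaler with the classifier means the scaler is refit on the
# training portion of every fold, so the validation fold never influences it.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))

# Five stratified folds; shuffling matters because the Iris samples are
# ordered by class in the raw dataset.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# For classifiers, cross_val_score uses accuracy by default.
scores = cross_val_score(clf, X, y, cv=cv)

print("Fold accuracies:", scores.round(3))
print(f"Mean accuracy: {scores.mean():.3f} (std {scores.std():.3f})")
```

Reporting the mean together with the spread across folds gives a better sense of how stable the model's performance is than a single number does.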