# Classification
Classification is a core task of supervised learning where the goal is to assign discrete labels to input observations. It is used in spam detection, image recognition, medical diagnosis, and many other domains.
## Types of Classification
- Binary Classification: Two classes (e.g., spam vs. not spam).
- Multiclass Classification: More than two classes (e.g., digit recognition 0–9).
- Multilabel Classification: Each instance can belong to multiple classes simultaneously (e.g., tagging an image with several objects).
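The difference between these settings shows up in the shape of the label array. A minimal sketch using scikit-learn (the dataset sizes and the one-vs-rest wrapper are illustrative choices, not prescribed by the text):

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Multilabel data: each row of y is a binary indicator vector,
# so a single sample can carry several labels at once.
X, y = make_multilabel_classification(n_samples=200, n_features=10,
                                      n_classes=3, random_state=0)
print(y[:3])  # rows such as [1 0 1]: two labels on one sample

# One-vs-rest trains one binary classifier per label, which turns
# any binary classifier into a multilabel one.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X, y)
print(clf.predict(X[:3]).shape)  # one column per label
```

In the binary and multiclass settings, by contrast, `y` is a single column of class indices.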
## Popular Algorithms
| Algorithm | Key characteristics |
|---|---|
| Logistic Regression | Linear model; works well for linearly separable data. |
| K‑Nearest Neighbors (K‑NN) | Instance‑based, non‑parametric. |
| Support Vector Machines (SVM) | Maximizes margin, can use kernels. |
| Decision Trees & Random Forests | Tree‑based, handles non‑linear relationships. |
| Naïve Bayes | Probabilistic, assumes feature independence. |
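All of the algorithms above share scikit-learn's `fit`/`predict` interface, so they can be compared on the same dataset with a few lines. A rough sketch (the dataset parameters and cross-validation setup are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "SVM (RBF kernel)": SVC(),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Naive Bayes": GaussianNB(),
}

# 5-fold cross-validated accuracy for each model.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name:22s} {scores.mean():.3f}")
```

Which algorithm wins depends on the data; a quick comparison like this is usually a better guide than the table alone.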
## Python Example – Logistic Regression
```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

# Generate a synthetic two-feature binary classification dataset.
X, y = make_classification(n_samples=1000, n_features=2,
                           n_informative=2, n_redundant=0,
                           random_state=42)

# Hold out 25% of the data for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Fit the model and report per-class precision, recall, and F1.
model = LogisticRegression()
model.fit(X_train, y_train)
pred = model.predict(X_test)
print(classification_report(y_test, pred))
```
## Interactive Demo – Decision Boundary
Drag points on the canvas to create two classes and see a simple logistic regressor update the decision boundary in real time.
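The interactive canvas cannot run on a static page, but the same idea can be reproduced offline: fit a logistic regressor, evaluate it on a grid, and draw the boundary. A sketch assuming matplotlib is installed (the grid resolution and output filename are arbitrary choices):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)
model = LogisticRegression().fit(X, y)

# Predict on a dense grid; the contour between the two predicted
# regions is the decision boundary.
xx, yy = np.meshgrid(np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 200),
                     np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 200))
zz = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

plt.contourf(xx, yy, zz, alpha=0.3)
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")
plt.savefig("decision_boundary.png")
```

Re-running with different `random_state` values mimics dragging points: the fitted line shifts as the data does.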