Multi-class classification is a supervised learning task where the goal is to assign each input sample to one of three or more classes. Scikit-learn provides several strategies for handling multi-class problems, including One-vs-Rest (OvR), One-vs-One (OvO), and natively multi-class classifiers such as RandomForestClassifier or LogisticRegression.
Key Characteristics
- Handles more than two class labels
- Supports One-vs-Rest (OvR) and One-vs-One (OvO) strategies
- Can use native classifiers or meta-estimators
- Works with both linear and nonlinear models
Basic Rules
- Choose OvR when efficiency matters: it trains only one classifier per class, so it scales well to large datasets
- OvO trains each classifier on just two classes at a time, which can suit models with complex class boundaries or high per-sample training cost, such as SVMs
- Inspect the confusion matrix to understand per-class performance
- Use a stratified train-test split to preserve the class distribution (see the sketch after this list)
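As a minimal sketch of the last rule, assuming scikit-learn's built-in iris dataset as a stand-in three-class problem, stratify=y keeps each class's share the same in both splits:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# stratify=y preserves per-class proportions in both the train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)
print(np.bincount(y_train), np.bincount(y_test))  # class counts stay proportional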
Syntax Table
SL NO | Technique | Syntax Example | Description |
---|---|---|---|
1 | One-vs-Rest | OneVsRestClassifier(LogisticRegression()) | Trains one classifier per class vs. all others |
2 | One-vs-One | OneVsOneClassifier(SVC()) | Trains one classifier per class pair |
3 | Native Support | RandomForestClassifier() | Natively supports multi-class |
4 | Fit Model | model.fit(X_train, y_train) | Trains the chosen classifier |
5 | Predict Classes | model.predict(X_test) | Returns predicted class labels |
Syntax Explanation
1. One-vs-Rest (OvR)
What is it?
A strategy that fits one classifier per class, where each classifier distinguishes a class from all others.
Syntax:
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import LogisticRegression
model = OneVsRestClassifier(LogisticRegression())
Explanation:
- Works well with linear models such as logistic regression
- Efficient on large datasets, since only one classifier is trained per class
- Each classifier outputs a confidence score for its class; the class with the highest score wins
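A minimal runnable sketch, assuming scikit-learn's built-in iris dataset as a stand-in three-class problem:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

# One binary logistic-regression classifier is fitted per class
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000))
ovr.fit(X_train, y_train)
print(len(ovr.estimators_))  # 3 -- one estimator for each of the 3 classes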
2. One-vs-One (OvO)
What is it?
A strategy that fits one classifier per class pair.
Syntax:
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import SVC
model = OneVsOneClassifier(SVC())
Explanation:
- Builds N(N-1)/2 classifiers for N classes
- Each pairwise classifier votes; the class with the most votes wins
- Effective when class boundaries are complex
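A minimal runnable sketch under the same iris-dataset assumption as above:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

# For 3 classes, 3 * (3 - 1) / 2 = 3 pairwise classifiers are fitted
ovo = OneVsOneClassifier(SVC())
ovo.fit(X_train, y_train)
print(len(ovo.estimators_))  # 3 pairwise SVC estimators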
3. Native Multi-Class Classifier
What is it?
Classifiers like Random Forest and Logistic Regression inherently support multi-class classification.
Syntax:
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier()
Explanation:
- No need to wrap with OvR or OvO
- Handles non-linear decision boundaries well; class imbalance can be mitigated via the class_weight parameter
- Straightforward integration
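A minimal sketch, again assuming the iris dataset:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# The forest handles all three classes directly -- no OvR/OvO wrapper needed
model = RandomForestClassifier(random_state=42)
model.fit(X, y)
print(model.classes_)  # [0 1 2] -- all classes learned natively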
4. Fit Model
What is it?
Trains the selected model on the labeled dataset.
Syntax:
model.fit(X_train, y_train)
Explanation:
- Accepts feature matrix and label vector
- Learns the decision boundaries between classes
- Can be combined with grid search or pipelines
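As a sketch of the last point, assuming the iris dataset and an illustrative parameter grid:

from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# fit() works the same whether the estimator is bare or inside a pipeline
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
grid = GridSearchCV(pipe, {"logisticregression__C": [0.1, 1.0, 10.0]}, cv=5)
grid.fit(X, y)  # refits the whole pipeline for every candidate C
print(grid.best_params_)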
5. Predict Classes
What is it?
Predicts the class labels for unseen test data.
Syntax:
predictions = model.predict(X_test)
Explanation:
- Produces an array of predicted class labels
- Useful for accuracy, confusion matrix, or F1 score evaluations
- Can be used in real-time prediction systems
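A minimal end-to-end sketch, assuming the iris dataset, that feeds the predictions into two common evaluations:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
predictions = model.predict(X_test)  # one predicted label per test row

print(accuracy_score(y_test, predictions))
print(confusion_matrix(y_test, predictions))  # rows = true class, columns = predicted class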
Real-Life Project: Handwritten Digit Recognition
Project Overview
Use multi-class classification to identify handwritten digits (0–9) from scikit-learn's built-in digits dataset (load_digits), a small MNIST-style collection of 8×8 grayscale images.
Code Example
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
# Load data
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target, test_size=0.3, stratify=digits.target, random_state=42)
# Train model with native multi-class support
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
# Predict and evaluate
y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred))
Expected Output
- Per-class precision, recall, and F1-scores
- Overall accuracy of the multi-class classifier
Common Mistakes to Avoid
- ❌ Ignoring the label distribution when splitting data (use a stratified train/test split)
- ❌ Using binary-only classifiers without wrapping them in OvR or OvO
- ❌ Not evaluating the model with class-specific metrics such as per-class recall or a confusion matrix