Visualization with Yellowbrick and Scikit-learn

Yellowbrick is a visualization library that wraps Scikit-learn estimators to produce diagnostic and interpretability plots for machine learning models. Its visualizers follow the Scikit-learn estimator API, so they fit into an existing modeling workflow with little extra code.

Key Characteristics

  • Built on top of Matplotlib and Scikit-learn
  • Provides visualizers for classification, regression, and clustering
  • Easy integration with Scikit-learn Pipelines
  • Interpretable Matplotlib plots such as ROC curves, classification reports, and residual plots

Basic Rules

  • Install via pip install yellowbrick
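  • Use .fit() and .score() as with Scikit-learn estimators, then call .show() to render the plot
  • Cross-validation results can be plotted with dedicated visualizers such as CVScores in yellowbrick.model_selection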
  • Compatible with both Pipeline and standalone models

Syntax Table

SL NO | Task                | Syntax Example                             | Description
1     | Install Yellowbrick | pip install yellowbrick                    | Installs the library
2     | Import Visualizer   | from yellowbrick.classifier import ROCAUC  | Imports a specific model visualizer
3     | Create Visualizer   | viz = ROCAUC(model)                        | Initializes the visualizer
4     | Fit Visualizer      | viz.fit(X_train, y_train)                  | Trains the model and prepares for visualization
5     | Display Plot        | viz.show()                                 | Displays the visual output

Syntax Explanation

1. Install Yellowbrick

What is it?
Installs the Yellowbrick library via pip.

Syntax:

pip install yellowbrick

Explanation:

  • Required once to download the library from PyPI.
  • Must be installed before importing any Yellowbrick visualizers.
  • Can be installed in Jupyter with %pip install yellowbrick (or !pip install yellowbrick); the %pip form installs into the environment of the running kernel.

2. Import Visualizer

What is it?
Loads a specific visualization tool from Yellowbrick.

Syntax:

from yellowbrick.classifier import ROCAUC

Explanation:

  • Imports the ROC AUC visualizer, which supports both binary and multiclass classification models.
  • Yellowbrick offers various modules like classifier, regressor, and cluster.
  • You can import multiple visualizers together for comprehensive analysis.

3. Create Visualizer

What is it?
Initializes the visualizer object with a Scikit-learn model.

Syntax:

viz = ROCAUC(LogisticRegression())

Explanation:

  • Binds the estimator with the visualizer class.
  • Accepts any Scikit-learn estimator compatible with the visualizer (e.g., classifiers for ROC AUC).
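  • Boolean parameters micro, macro, and per_class control which curves are drawn (averaged curves, one curve per class, or both).
  • Further options include classes (display labels for the legend), binary (a single curve for true binary problems), and ax (an existing Matplotlib axes to draw on).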

4. Fit Visualizer

What is it?
Fits the wrapped model to the training data so that it can be scored and visualized.

Syntax:

viz.fit(X_train, y_train)

Explanation:

  • Calls the underlying estimator’s .fit() method.
  • Prepares the model for scoring and visualization.
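  • Does not compute the curve itself; for score visualizers such as ROCAUC, the metrics are computed when .score() is called on evaluation data.
  • Should be followed by .score() and then .show(); skipping .score() leaves a score visualizer with nothing to draw.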

5. Display Plot

What is it?
Renders the visualization on screen.

Syntax:

viz.show()

Explanation:

  • Finalizes the figure (title, legend, axis labels) and then calls matplotlib.pyplot.show().
  • Renders plots in Jupyter, Python scripts, or standalone Python apps.
  • If in a Jupyter notebook, use %matplotlib inline for inline rendering.
  • Can also save the figure to a file with viz.show(outpath='plot.png'); the older viz.poof() name is deprecated as of Yellowbrick 1.0.

Real-Life Project: ROC Curve Visualization

Project Overview

Visualize the ROC curve of a logistic regression classifier trained on a binary classification dataset (the breast cancer dataset bundled with Scikit-learn).

Code Example

from yellowbrick.classifier import ROCAUC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_breast_cancer

# Load data
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Visualizer
viz = ROCAUC(LogisticRegression(max_iter=1000))
viz.fit(X_train, y_train)
viz.score(X_test, y_test)
viz.show()

Expected Output

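  • ROC curves with the AUC value of each curve shown in the legend
  • With the default settings, typically one curve per class plus micro- and macro-average curves, along with a diagonal reference line representing a random classifier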
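
The numbers behind the plot can be cross-checked with Scikit-learn alone. The sketch below computes the ROC curve and AUC for the positive class on the same split; it is an independent calculation for comparison, not Yellowbrick's internal code.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Probability of the positive class (label 1) for each test sample.
positive_scores = model.predict_proba(X_test)[:, 1]

# False positive rate and true positive rate at each threshold.
fpr, tpr, thresholds = roc_curve(y_test, positive_scores)
curve_auc = auc(fpr, tpr)
direct_auc = roc_auc_score(y_test, positive_scores)

print(f"AUC from curve: {curve_auc:.4f}")
print(f"AUC from roc_auc_score: {direct_auc:.4f}")
```

The two AUC values should agree, since both integrate the same curve; the per-class value for label 1 in the Yellowbrick legend should be close to them.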

Common Mistakes to Avoid

  • ❌ Not calling .fit() before .score() or .show(), or skipping .score() on a score visualizer
  • ❌ Using models incompatible with visualizer type
  • ❌ Forgetting to install Yellowbrick before import
