Feature Importance Analysis in Scikit-learn

Feature importance analysis helps identify which input features have the most influence on a model’s predictions. This is crucial for interpretability, feature selection, and improving model performance. Scikit-learn offers multiple ways to compute feature importance, depending on the model type.

Key Characteristics

  • Provides insight into model behavior
  • Useful for feature selection and dimensionality reduction
  • Available through tree-based models, linear model coefficients, and permutation methods
  • Can be visualized for better interpretability

Basic Rules

  • Use the model-specific feature_importances_ attribute for tree-based models
  • Use coef_ for linear models, after scaling the features so that coefficient magnitudes are comparable
  • Apply permutation_importance() for model-agnostic insights, ideally on held-out data

Syntax Table

SL NO | Technique | Syntax Example | Description
1 | Tree-based Importance | model.feature_importances_ | Returns importance scores for each feature
2 | Linear Model Coefficients | model.coef_ | Coefficients representing feature weights
3 | Permutation Importance | permutation_importance(model, X, y) | Model-agnostic importance scores
4 | Visualizing Importance | plt.barh(range(len(importances)), importances) | Plots the importance scores
5 | Sorting Importances | np.argsort(importances)[::-1] | Ranks features from most to least important

Syntax Explanation

1. Tree-based Feature Importance

What is it?
Extracts feature importance directly from tree-based models like RandomForest or GradientBoosting.

Syntax:

model.feature_importances_

Explanation:

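  • Returns an array with one non-negative importance score per feature; the scores are normalized to sum to 1.
  • Measures the mean decrease in impurity (MDI) contributed by each feature, averaged over the trees and computed from the training data.
  • Can overstate high-cardinality features, so cross-check with permutation importance when features differ widely in their number of unique values.
  • Works with RandomForestClassifier, GradientBoostingClassifier, etc.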
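
The following is a minimal sketch of reading impurity-based importances from a fitted forest. The synthetic dataset, its sizes, and the random seeds are assumptions chosen only for illustration. It also computes the spread of each score across the individual trees, which gives a rough sense of how stable a ranking is.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (illustrative): 6 features, 3 of them informative.
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=3, n_redundant=0, random_state=0
)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# One score per feature, normalized to sum to 1.
importances = forest.feature_importances_

# Spread of each feature's importance across the individual trees.
tree_std = np.std(
    [tree.feature_importances_ for tree in forest.estimators_], axis=0
)

for i, (score, spread) in enumerate(zip(importances, tree_std)):
    print(f"Feature {i}: {score:.3f} (std across trees: {spread:.3f})")
print("Sum of importances:", importances.sum())
```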

2. Linear Model Coefficients

What is it?
Uses the absolute magnitude of coefficients as a proxy for feature importance.

Syntax:

model.coef_

Explanation:

  • Must scale features before interpretation (e.g., using StandardScaler).
  • Positive/negative values indicate direction of influence.
  • Suitable for LogisticRegression, Ridge, Lasso, etc.
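
As a rough sketch of the scaling step above, the example below fits a LogisticRegression inside a pipeline with StandardScaler and ranks features by absolute coefficient. The dataset and seeds are illustrative assumptions. The step name "logisticregression" is the name make_pipeline assigns automatically from the class name.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (illustrative): 6 features, 3 of them informative.
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=3, n_redundant=0, random_state=0
)

# Scaling inside the pipeline means the coefficients refer to standardized features.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipe.fit(X, y)

# For binary classification coef_ has shape (1, n_features); take the single row.
coefs = pipe.named_steps["logisticregression"].coef_[0]

# Rank by magnitude (largest first); the sign still shows the direction of influence.
ranking = np.argsort(np.abs(coefs))[::-1]
for i in ranking:
    print(f"Feature {i}: coefficient = {coefs[i]:+.3f}")
```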

3. Permutation Importance

What is it?
Measures the decrease in a fitted model's score when the values of a single feature are randomly shuffled.

Syntax:

from sklearn.inspection import permutation_importance
results = permutation_importance(model, X_test, y_test)

Explanation:

  • Model-agnostic; works with any fitted estimator.
  • Requires a fitted model and evaluation data, preferably a held-out set.
  • Results include importances_mean and importances_std, plus the raw per-repeat scores in importances.
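
A fuller sketch of the call above is shown below, assuming a Random Forest and a synthetic dataset chosen purely for illustration. The n_repeats, random_state, and scoring arguments are optional parameters of permutation_importance; the values used here are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data (illustrative): 6 features, 3 of them informative.
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=3, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature 10 times on the held-out set and record the drop in accuracy.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0, scoring="accuracy"
)

# Report features from most to least important, with the spread over repeats.
for i in result.importances_mean.argsort()[::-1]:
    mean = result.importances_mean[i]
    std = result.importances_std[i]
    print(f"Feature {i}: {mean:.3f} +/- {std:.3f}")
```

A feature whose mean drop is small relative to its standard deviation may not be distinguishable from noise, so the two values are best read together.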

4. Visualizing Importance

What is it?
Plots feature importances using a horizontal bar chart.

Syntax:

import matplotlib.pyplot as plt
plt.barh(range(len(importances)), importances)

Explanation:

  • Provides a clear view of feature rankings.
  • Combine with argsort to order features.
  • Useful in presentations and model explainability.

5. Sorting Importances

What is it?
Ranks feature indices based on importance.

Syntax:

import numpy as np
sorted_idx = np.argsort(importances)[::-1]

Explanation:

  • Helps list top-N important features.
  • Can be used to reorder plots or reduce feature space.
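
To make the top-N idea concrete, here is a small sketch. The importance scores and feature names are hypothetical values made up for this example, not output from a real model.

```python
import numpy as np

# Hypothetical scores and names, for illustration only.
importances = np.array([0.05, 0.30, 0.10, 0.40, 0.15])
feature_names = np.array(
    ["age", "tenure", "plan", "monthly_charges", "support_calls"]
)

# argsort sorts ascending, so reverse it to get the most important first.
sorted_idx = np.argsort(importances)[::-1]

top_n = 3
top_idx = sorted_idx[:top_n]

for name, score in zip(feature_names[top_idx], importances[top_idx]):
    print(f"{name}: {score:.2f}")
# monthly_charges: 0.40
# tenure: 0.30
# support_calls: 0.15
```

The same index array can select columns from a feature matrix, for example X[:, top_idx], to keep only the top-ranked features.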

Real-Life Project: Customer Churn Prediction

Project Overview

Identify key drivers of customer churn using feature importance from a Random Forest model.

Code Example

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
import matplotlib.pyplot as plt
import numpy as np

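# Generate synthetic data (a stand-in for real churn records)
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, random_state=42)
# Fix random_state so the split, and therefore the chart, is reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train model
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)

# Get feature importances
importances = model.feature_importances_
# Ascending order: barh draws the first bar at the bottom,
# so the most important feature ends up at the top of the chart
sorted_idx = np.argsort(importances)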

# Plot
plt.barh(range(len(importances)), importances[sorted_idx])
plt.yticks(range(len(importances)), [f"Feature {i}" for i in sorted_idx])
plt.xlabel("Importance")
plt.title("Feature Importance")
plt.show()

Expected Output

  • Horizontal bar chart with the most important feature at the top and the least important at the bottom
  • Insight into which features affect churn decisions most

Common Mistakes to Avoid

  • ❌ Interpreting unscaled coefficients from linear models
  • ❌ Assuming that a feature's correlation with the target equals its importance to the model
  • ❌ Ignoring the variance of permutation importances (importances_std) on small datasets
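
The first mistake can be illustrated with a small sketch. The data below is synthetic and the feature names (distance_km, rooms, price) are hypothetical. The same model is fitted twice, once with distance in kilometers and once in meters, to show that a raw coefficient reflects the unit of its feature rather than its importance.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Hypothetical features and target, for illustration only.
distance_km = rng.uniform(1, 10, size=200)
rooms = rng.integers(1, 6, size=200).astype(float)
price = 5.0 * distance_km + 3.0 * rooms + rng.normal(0, 0.5, size=200)

X_km = np.column_stack([distance_km, rooms])
X_m = np.column_stack([distance_km * 1000.0, rooms])  # same feature, in meters

coef_km = LinearRegression().fit(X_km, price).coef_
coef_m = LinearRegression().fit(X_m, price).coef_

# The distance coefficient shrinks by a factor of 1000 when the unit changes,
# even though the feature carries exactly the same information.
print("Coefficients with distance in km:", coef_km)
print("Coefficients with distance in m: ", coef_m)
```

Standardizing the features first (for example with StandardScaler) removes this dependence on units, which is why scaling comes before interpreting coefficient magnitudes.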
