Feature importance analysis helps identify which input features have the most influence on a model’s predictions. This is crucial for interpretability, feature selection, and improving model performance. Scikit-learn offers multiple ways to compute feature importance, depending on the model type.
Key Characteristics
- Provides insight into model behavior
- Useful for feature selection and dimensionality reduction
- Supported by tree-based models, linear models, and permutation methods
- Can be visualized for better interpretability
Basic Rules
- Use model-specific `.feature_importances_` for tree-based models
- Use `.coef_` for linear models (after scaling)
- Apply `permutation_importance()` for model-agnostic insights
- Normalize or scale data for linear models to get accurate importances
Syntax Table
SL NO | Technique | Syntax Example | Description
---|---|---|---
1 | Tree-based Importance | `model.feature_importances_` | Returns importance scores for each feature
2 | Linear Model Coefficients | `model.coef_` | Coefficients representing feature weights
3 | Permutation Importance | `permutation_importance(model, X, y)` | Model-agnostic importance scores
4 | Visualizing Importance | `plt.barh(range(len(importances)), importances)` | Plots the importance scores
5 | Sorting Importances | `np.argsort(importances)[::-1]` | Ranks features from most to least important
Syntax Explanation
1. Tree-based Feature Importance
What is it?
Extracts feature importance directly from tree-based models like RandomForest or GradientBoosting.
Syntax:
model.feature_importances_
Explanation:
- Returns an array of importance scores (summing to 1).
- Measures the mean decrease in impurity contributed by each feature across all trees.
- Works with `RandomForestClassifier`, `GradientBoostingClassifier`, and other tree-based estimators.
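A minimal sketch of reading these scores from a fitted forest. The synthetic dataset and parameter values below are assumptions made purely for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Illustrative synthetic data (not the churn dataset used later)
X, y = make_classification(n_samples=500, n_features=6, n_informative=3, random_state=0)

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X, y)

# One score per feature; the scores sum to 1
print(forest.feature_importances_)
print(forest.feature_importances_.sum())  # ~1.0
```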
2. Linear Model Coefficients
What is it?
Uses the absolute magnitude of coefficients as a proxy for feature importance.
Syntax:
model.coef_
Explanation:
- Must scale features before interpretation (e.g., using `StandardScaler`).
- Positive/negative values indicate the direction of influence.
- Suitable for `LogisticRegression`, `Ridge`, `Lasso`, etc.
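A minimal sketch of inspecting coefficients after scaling; the pipeline, dataset, and parameters here are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=6, n_informative=3, random_state=0)

# Scale first so coefficient magnitudes are comparable across features
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipe.fit(X, y)

coefs = pipe.named_steps["logisticregression"].coef_[0]
# Rank by absolute size; the sign still tells you the direction of influence
ranking = np.argsort(np.abs(coefs))[::-1]
print(ranking)
print(coefs[ranking])
```

Keeping the scaler inside the pipeline ties the transformation to the model, so the coefficients are always computed on standardized inputs.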
3. Permutation Importance
What is it?
Measures the drop in model performance when a feature's values are randomly shuffled.
Syntax:
from sklearn.inspection import permutation_importance
results = permutation_importance(model, X_test, y_test)
Explanation:
- Model-agnostic; works with any estimator.
- Requires a fitted model and evaluation data.
- Results include `importances_mean` and `importances_std`.
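A minimal sketch, assuming a `RandomForestClassifier` and synthetic data chosen only to keep the example self-contained:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=6, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature n_repeats times and record the drop in the model's score
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
print(result.importances_mean)
print(result.importances_std)
```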
4. Visualizing Importance
What is it?
Plots feature importances using a horizontal bar chart.
Syntax:
import matplotlib.pyplot as plt
plt.barh(range(len(importances)), importances)
Explanation:
- Provides a clear view of feature rankings.
- Combine with `argsort` to order features.
- Useful in presentations and model explainability.
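A small plotting sketch; the importance values and feature names below are made up for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical importance scores (e.g., taken from feature_importances_)
importances = np.array([0.05, 0.30, 0.10, 0.40, 0.15])
feature_names = [f"Feature {i}" for i in range(len(importances))]

order = np.argsort(importances)  # ascending, so the most important feature lands on top
plt.barh(range(len(importances)), importances[order])
plt.yticks(range(len(importances)), [feature_names[i] for i in order])
plt.xlabel("Importance")
plt.tight_layout()
plt.show()
```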
5. Sorting Importances
What is it?
Ranks feature indices based on importance.
Syntax:
import numpy as np
sorted_idx = np.argsort(importances)[::-1]
Explanation:
- Helps list top-N important features.
- Can be used to reorder plots or reduce feature space.
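A quick sketch of picking the top-N features, using made-up scores:

```python
import numpy as np

# Hypothetical importance scores, for illustration only
importances = np.array([0.02, 0.35, 0.08, 0.40, 0.15])

sorted_idx = np.argsort(importances)[::-1]  # indices from most to least important
top_n = 3
print(sorted_idx[:top_n])                   # indices of the 3 strongest features
print(importances[sorted_idx[:top_n]])

# The same indices can subset the data, e.g. X[:, sorted_idx[:top_n]]
```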
Real-Life Project: Customer Churn Prediction
Project Overview
Identify key drivers of customer churn using feature importance from a Random Forest model.
Code Example
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
import matplotlib.pyplot as plt
import numpy as np
# Generate synthetic data
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Train model
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)
# Get feature importances
importances = model.feature_importances_
sorted_idx = np.argsort(importances)
# Plot
plt.barh(range(len(importances)), importances[sorted_idx])
plt.yticks(range(len(importances)), [f"Feature {i}" for i in sorted_idx])
plt.xlabel("Importance")
plt.title("Feature Importance")
plt.show()
Expected Output
- Bar chart showing most to least important features
- Insight into which features affect churn decisions most
Common Mistakes to Avoid
- ❌ Interpreting unscaled coefficients from linear models
- ❌ Assuming correlation = importance
- ❌ Ignoring permutation variance in small datasets
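For the last point, a hedged sketch of checking whether permutation importances are stable relative to their variance; the two-standard-deviation cutoff is a common rule of thumb, and the small synthetic dataset is an assumption for illustration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# A deliberately small dataset, where permutation variance matters most
X, y = make_classification(n_samples=200, n_features=8, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
result = permutation_importance(model, X_test, y_test, n_repeats=30, random_state=0)

# Keep only features whose mean score drop clearly exceeds its spread across shuffles
stable = result.importances_mean - 2 * result.importances_std > 0
print(np.flatnonzero(stable))  # indices of features with a reliably positive effect
```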