shap

📁 eyadsibai/ltk 📅 Jan 28, 2026
Total installs: 0
Weekly installs: 7
Install command
npx skills add https://github.com/eyadsibai/ltk --skill shap

Agent install distribution

gemini-cli 6
antigravity 5
claude-code 5
github-copilot 5
opencode 4

Skill documentation

SHAP Model Explainability

Explain ML predictions using Shapley values – feature importance and attribution.

When to Use

  • Explain why a model made specific predictions
  • Calculate feature importance with attribution
  • Debug model behavior and validate predictions
  • Create interpretability plots (waterfall, beeswarm, bar)
  • Analyze model fairness and bias

Quick Start

import shap
import xgboost as xgb

# Train model (X_train, y_train, X_test are your own train/test splits)
model = xgb.XGBClassifier().fit(X_train, y_train)

# Create explainer
explainer = shap.TreeExplainer(model)

# Compute SHAP values
shap_values = explainer(X_test)

# Visualize
shap.plots.beeswarm(shap_values)
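
For a copy-paste run, here is a self-contained variant of the snippet above. It is a minimal sketch, assuming scikit-learn and xgboost are installed; the breast-cancer dataset is only placeholder data.

import shap
import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Placeholder data (swap in your own X_train / X_test / y_train)
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = xgb.XGBClassifier().fit(X_train, y_train)

explainer = shap.TreeExplainer(model)
shap_values = explainer(X_test)      # Explanation object

shap.plots.beeswarm(shap_values)     # global view of feature effects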

Choose Explainer

# Tree-based models (XGBoost, LightGBM, RF) - FAST
explainer = shap.TreeExplainer(model)

# Deep learning (TensorFlow, PyTorch)
explainer = shap.DeepExplainer(model, background_data)

# Linear models
explainer = shap.LinearExplainer(model, X_train)

# Any model (slower but universal)
explainer = shap.KernelExplainer(model.predict, X_train[:100])

# Auto-select best explainer
explainer = shap.Explainer(model)
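
If you pass a bare prediction function instead of a fitted model object, the auto API needs a masker that supplies background data. A minimal sketch, assuming the model and X_train from the quick start:

# Auto API with an explicit background masker
masker = shap.maskers.Independent(X_train, max_samples=100)  # background distribution
explainer = shap.Explainer(model.predict_proba, masker)
shap_values = explainer(X_test)  # per-class attributions for classifiers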

Compute SHAP Values

# Compute for test set
shap_values = explainer(X_test)

# Access components
shap_values.values      # SHAP values (feature attributions)
shap_values.base_values # Expected model output (baseline)
shap_values.data        # Original feature values
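
SHAP values are additive: per sample, the attributions plus the baseline reconstruct the model output in the explainer's units (raw log-odds for a default TreeExplainer on a binary XGBoost classifier). A quick sanity check, assuming that setup:

import numpy as np

# Attributions + baseline should match the model's raw (margin) output
reconstructed = shap_values.values.sum(axis=1) + shap_values.base_values
raw_output = model.predict(X_test, output_margin=True)  # log-odds for XGBClassifier
assert np.allclose(reconstructed, raw_output, atol=1e-4)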

Visualizations

Global Feature Importance

# Beeswarm - shows distribution and importance
shap.plots.beeswarm(shap_values)

# Bar - clean summary
shap.plots.bar(shap_values)
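
Both plots accept max_display, and show=False keeps the matplotlib figure open so you can save it. A small sketch:

import matplotlib.pyplot as plt

# Show only the top 10 features and write the figure to disk
shap.plots.bar(shap_values, max_display=10, show=False)
plt.tight_layout()
plt.savefig("shap_importance.png", dpi=150)
plt.close()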

Individual Predictions

# Waterfall - breakdown of single prediction
shap.plots.waterfall(shap_values[0])

# Force - additive visualization
shap.plots.force(shap_values[0])
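
Indexing the Explanation selects which prediction to explain. One common pattern (a sketch, assuming the classifier above) is to explain the sample the model is most confident about. Note that shap.plots.force renders interactive JavaScript, so call shap.initjs() once in a notebook before using it.

import numpy as np

# Explain the sample with the highest predicted probability for class 1
idx = int(np.argmax(model.predict_proba(X_test)[:, 1]))
shap.plots.waterfall(shap_values[idx])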

Feature Relationships

# Scatter - feature vs SHAP value
shap.plots.scatter(shap_values[:, "feature_name"])

# With interaction coloring
shap.plots.scatter(shap_values[:, "Age"], color=shap_values[:, "Income"])

Heatmap (Multiple Samples)

shap.plots.heatmap(shap_values[:100])
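
The instance_order argument re-orders the rows, e.g. by each sample's total SHAP value (a sketch):

# Order rows by the sum of SHAP values instead of the default ordering
shap.plots.heatmap(shap_values[:100], instance_order=shap_values[:100].sum(1))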

Common Patterns

Complete Analysis

import shap

# 1. Create explainer and compute
explainer = shap.TreeExplainer(model)
shap_values = explainer(X_test)

# 2. Global importance
shap.plots.beeswarm(shap_values)

# 3. Top feature relationships
shap.plots.scatter(shap_values[:, "top_feature"])

# 4. Individual explanation
shap.plots.waterfall(shap_values[0])

Compare Groups

# Compare feature importance across groups
group_a = X_test['category'] == 'A'
group_b = X_test['category'] == 'B'

shap.plots.bar({
    "Group A": shap_values[group_a],
    "Group B": shap_values[group_b]
})
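
Alternatively, let SHAP split the samples into cohorts automatically (a sketch using the cohorts helper, which partitions samples with a shallow decision tree):

# Auto-split into 2 cohorts and compare mean |SHAP| per feature
shap.plots.bar(shap_values.cohorts(2).abs.mean(0))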

Debug Errors

import numpy as np

# Find misclassified samples
errors = model.predict(X_test) != y_test
error_idx = np.where(errors)[0]

# Explain why they failed
for idx in error_idx[:5]:
    shap.plots.waterfall(shap_values[idx])
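
To see whether errors are driven by different features overall, you can reuse the dict form from the group comparison above (a sketch; the mask is converted to a plain numpy array for indexing):

import numpy as np

# Compare mean |SHAP| for misclassified vs. correctly classified samples
err_mask = np.asarray(errors)
shap.plots.bar({
    "Misclassified": shap_values[err_mask],
    "Correct": shap_values[~err_mask],
})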

Interpret Values

  • Positive SHAP → Feature pushes prediction higher
  • Negative SHAP → Feature pushes prediction lower
  • Magnitude → Strength of impact
  • Sum of SHAP values = Prediction - Baseline

Example:
  Baseline:   0.30
  Age:       +0.15
  Income:    +0.10
  Education: -0.05
  Prediction: 0.30 + 0.15 + 0.10 - 0.05 = 0.50

Best Practices

  1. Use TreeExplainer for tree models (fast, exact)
  2. Use 100-1000 background samples for KernelExplainer
  3. Start global (beeswarm) then go local (waterfall)
  4. Check model output type (probability vs log-odds); see the sketch after this list
  5. Validate with domain knowledge
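
For tree models, attributions can also be computed directly in probability space by giving TreeExplainer background data and model_output="probability". A minimal sketch (interventional perturbation is required for this mode):

# SHAP values in probability space (sum to predicted probability - baseline probability)
background = shap.sample(X_train, 100)  # small background sample
explainer_prob = shap.TreeExplainer(
    model,
    data=background,
    feature_perturbation="interventional",
    model_output="probability",
)
shap_values_prob = explainer_prob(X_test)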

vs Alternatives

Tool                 Best For
SHAP                 Theoretically grounded; works with all model types
LIME                 Quick local explanations
Feature Importance   Simple tree-based importance

Resources