LIME helps explain specific predictions made by machine learning models by focusing on individual data points. Instead of providing a global explanation, LIME offers insights into why a particular decision or prediction was made. This makes it extremely valuable for debugging and improving model transparency.
```python
import lime
import lime.lime_tabular

# Sample data import and model training
# Here, 'model' is a pre-trained model object
data, labels = ...  # Your dataset and labels

explainer = lime.lime_tabular.LimeTabularExplainer(
    training_data=data,
    feature_names=['feature1', 'feature2', ..., 'featureN'],
    class_names=['class1', 'class2'],
    mode='classification'
)

# Explain a single instance from the dataset
sample_instance = data[0]
explained_instance = explainer.explain_instance(
    sample_instance,
    model.predict_proba,
    num_features=5
)

# Display the explanation
explained_instance.show_in_notebook()
```
LIME Explained
LIME (Local Interpretable Model-agnostic Explanations) lets you understand complex models by creating variations of a data point and observing how the model’s predictions change. A simple, interpretable model, often linear regression, is then fit to approximate the complex model’s behavior in the neighborhood of that data point.
```python
from lime import lime_tabular
import numpy as np

# Assume `model` is your complex model and `data` is your dataset (2D: samples x features)
data = np.array(data)

explainer = lime_tabular.LimeTabularExplainer(
    training_data=data,
    mode='classification'
)

# Explanation for a single instance
explanation = explainer.explain_instance(data[0], model.predict_proba)

# Display the top features that contribute to the prediction
explanation.show_in_notebook(show_all=False)
```
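To make that mechanism concrete, here is a minimal from-scratch sketch of the same idea (not the lime library’s actual implementation): perturb the instance, weight the perturbations by proximity, and fit a weighted linear surrogate. The black-box function, noise scale, and kernel width below are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge

def local_surrogate(black_box, x, num_samples=5000, kernel_width=0.75):
    """Approximate `black_box` near `x` with a weighted linear model.

    `black_box` maps an (n, d) array to an (n,) array of scores,
    e.g. the predicted probability of the positive class.
    """
    d = len(x)
    # 1. Create variations of the data point by adding Gaussian noise
    perturbations = x + np.random.normal(0.0, 1.0, size=(num_samples, d))
    # 2. Query the complex model on the perturbed points
    predictions = black_box(perturbations)
    # 3. Weight each perturbation by its proximity to the original point
    distances = np.linalg.norm(perturbations - x, axis=1)
    weights = np.exp(-(distances ** 2) / (kernel_width ** 2))
    # 4. Fit an interpretable (linear) model on the weighted neighborhood
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(perturbations, predictions, sample_weight=weights)
    # The coefficients approximate each feature's local influence
    return surrogate.coef_, surrogate.intercept_

# Toy black box: a logistic function of the feature sum
coef, intercept = local_surrogate(
    lambda X: 1.0 / (1.0 + np.exp(-X.sum(axis=1))),
    x=np.array([0.5, -1.0, 2.0])
)
print(coef, intercept)
```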
LIME provides specialized explainers to make model predictions understandable for different data types: LimeTabularExplainer for tabular data, LimeTextExplainer for text, and LimeImageExplainer for images. This lets you generate interpretable explanations tailored to each context.
```python
from lime.lime_tabular import LimeTabularExplainer

# Assume model, data, feature_names, class_names, and data_instance are already defined
explainer = LimeTabularExplainer(
    training_data=data,
    feature_names=feature_names,
    class_names=class_names,
    verbose=True,
    mode='classification'
)

exp = explainer.explain_instance(data_instance, model.predict_proba, num_features=5)
exp.show_in_notebook(show_table=True, show_all=False)
```
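For completeness, the text and image explainers are constructed the same way; a minimal sketch (the class names here are illustrative), with full examples later in this section:

```python
from lime.lime_text import LimeTextExplainer
from lime.lime_image import LimeImageExplainer

# Text: explanations are expressed in terms of the words in a document
text_explainer = LimeTextExplainer(class_names=['negative', 'positive'])

# Images: explanations are expressed in terms of superpixels
image_explainer = LimeImageExplainer()
```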
In LIME, the discretize_continuous parameter controls whether continuous features are binned (into quartiles by default) when generating explanations. This changes how explanations read: with binning, features are reported as range conditions (for example, 'petal length <= 1.40') rather than as weights on raw feature values. Experiment with this parameter to see its effect on your tabular data analysis.
```python
from lime.lime_tabular import LimeTabularExplainer
import numpy as np

# Sample data and model setup
data = np.array([[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2]])
class_names = ['setosa', 'versicolor', 'virginica']
feature_names = ['sepal length', 'sepal width', 'petal length', 'petal width']
model = YourModel()  # placeholder for a trained classifier exposing predict_proba

# Explainer with discretization of continuous features
discretized_explainer = LimeTabularExplainer(
    data,
    feature_names=feature_names,
    class_names=class_names,
    discretize_continuous=True
)

# Explainer without discretization
raw_explainer = LimeTabularExplainer(
    data,
    feature_names=feature_names,
    class_names=class_names,
    discretize_continuous=False
)

# Explanations for the same instance
discretized_explanation = discretized_explainer.explain_instance(data[0], model.predict_proba)
raw_explanation = raw_explainer.explain_instance(data[0], model.predict_proba)
```
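To see the difference in practice, compare the textual feature descriptions of the two explanations: with discretization, features appear as range conditions; without it, as weights on the raw features (exact strings depend on your data and model):

```python
# With discretization: entries look like ('petal length <= 1.40', weight)
print(discretized_explanation.as_list())

# Without discretization: entries look like ('petal length', weight)
print(raw_explanation.as_list())
```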
In LIME, feature importance is gauged by the coefficients of a locally weighted linear model: perturbed samples are weighted by their proximity to the original instance, so nearby samples influence the fit the most. LIME helps interpret complex models by showing which features are most influential in a prediction. The code below shows a basic LIME setup for inspecting feature significance.
```python
from lime.lime_tabular import LimeTabularExplainer
import numpy as np

# Sample data and a dummy prediction function
X_train = np.array([[0, 1, 0], [1, 1, 0], [1, 0, 1]])

def model(x):
    # Dummy classifier: same two-class probabilities for every row LIME passes in
    return np.tile([0.7, 0.3], (len(x), 1))

# Create the LIME explainer
explainer = LimeTabularExplainer(
    training_data=X_train,
    feature_names=['Height', 'Weight', 'Age'],
    class_names=['NotSpam', 'Spam']
)

# Explain a prediction
exp = explainer.explain_instance(data_row=X_train[0], predict_fn=model, labels=[0, 1])
exp.show_in_notebook()
```
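Beyond the notebook view, the local coefficients themselves can be read off the explanation object; a small follow-up using the exp object from above:

```python
# (feature description, local coefficient) pairs for the 'Spam' class,
# sorted by absolute weight - these are the local feature importances
for feature, weight in exp.as_list(label=1):
    print(f"{feature}: {weight:.3f}")
```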
Image LIME
LIME identifies the parts of an image that influence a model’s prediction by creating superpixels, groups of neighboring pixels. These superpixels are then perturbed to assess their effect on the prediction, and each modified region tells the algorithm how important that area is.
```python
import lime
import lime.lime_image
from skimage.segmentation import mark_boundaries

# Assume `image` is the image to explain and `model` is a trained
# classifier whose predict function returns class probabilities
explainer = lime.lime_image.LimeImageExplainer()

explanation = explainer.explain_instance(
    image,
    model.predict,
    top_labels=5,
    hide_color=0,
    num_samples=1000
)

# Overlay the superpixel boundaries on the original image
image_boundaries = mark_boundaries(image, explanation.segments)

# Use matplotlib to display image_boundaries and visualize
# which superpixels impacted the prediction.
```
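To highlight only the superpixels that support a particular class, the explanation object can also return an image and mask for a chosen label; a short sketch for the top predicted label, continuing from the snippet above:

```python
import matplotlib.pyplot as plt

# Keep only the superpixels that speak in favor of the top label
temp, mask = explanation.get_image_and_mask(
    explanation.top_labels[0],
    positive_only=True,
    num_features=5,
    hide_rest=False
)
plt.imshow(mark_boundaries(temp, mask))
plt.axis('off')
plt.show()
```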
For text classification, LIME highlights the words or phrases that impact a prediction. By observing how the prediction changes when certain words are removed, LIME identifies which words contribute positively or negatively to the result.
```python
from lime.lime_text import LimeTextExplainer
from sklearn.pipeline import make_pipeline

# Sample text and a placeholder model
# (`vectorizer` and `classifier` are assumed to be a fitted text
#  vectorizer and classifier, e.g. TfidfVectorizer + LogisticRegression)
text = "The movie was amazing and brilliantly shot."
model = make_pipeline(vectorizer, classifier)

# Initialize the explainer
explainer = LimeTextExplainer(class_names=["Negative", "Positive"])

# Generate an explanation for the sample text
explanation = explainer.explain_instance(text, model.predict_proba)

# Show the words contributing to the prediction
explanation.show_in_notebook(text=text)
```
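The word-level contributions can also be pulled out of the explanation programmatically; positive weights push the prediction toward the second class name ('Positive'), negative weights away from it:

```python
# (word, weight) pairs for the explained prediction
for word, weight in explanation.as_list():
    print(f"{word}: {weight:+.3f}")
```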
Classification Models
LIME provides explanations for both regression and classification models. For classification models, LIME can work in both probability and logit space, which makes it a powerful choice for understanding complex models. The code example below uses LIME with a random forest classifier to generate insights about feature contributions.
```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from lime import lime_tabular

# Sample dataset: X is a DataFrame of features, y the labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and fit the model
rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)
rf_classifier.fit(X_train, y_train)

# Initialize LIME
explainer = lime_tabular.LimeTabularExplainer(
    X_train.values,
    feature_names=X_train.columns,
    class_names=["class_0", "class_1"],
    mode='classification'
)

# Generate an explanation for one test instance
exp = explainer.explain_instance(X_test.iloc[0].values, rf_classifier.predict_proba, num_features=5)
exp.show_in_notebook()
```
LIME approximates each feature’s contribution to a model’s prediction using locally weighted linear models. Unlike SHAP, LIME does not require the contributions to sum exactly to the prediction difference. This makes it well suited to studying local model behavior: rather than an exact decomposition, it offers insight into a model’s local decision boundary.
```python
from lime.lime_tabular import LimeTabularExplainer
import numpy as np

# Dummy prediction function: returns the same class probabilities
# for every row of perturbed data LIME passes in
def model_predict(data):
    return np.tile([0.2, 0.8], (len(data), 1))

# Some example data
feature_names = ['Feature 1', 'Feature 2']
training_data = np.array([[5, 7], [6, 8], [4, 6], [5, 9], [7, 7]])
data_row = training_data[0]

# Initialize the LIME explainer (verbose=True prints the local model's
# intercept and prediction, handy for inspecting the surrogate)
explainer = LimeTabularExplainer(
    training_data,
    feature_names=feature_names,
    verbose=True,
    mode='classification'
)

# Get an explanation for a specific prediction
exp = explainer.explain_instance(data_row, model_predict, num_features=2)

# Display the explanation
exp.show_in_notebook(show_all=False)
```
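Using the objects from the snippet above, the pieces of the fitted local surrogate can be inspected directly; as noted, the weighted terms approximate the black-box prediction locally rather than summing to it exactly:

```python
label = 1  # class whose explanation we inspect

# Intercept and (feature, weight) terms of the fitted local linear model
print("Intercept:", exp.intercept[label])
print("Local weights:", exp.as_list(label=label))

# The black-box probability for comparison; the local terms approximate
# it in the neighborhood but need not sum to it exactly
print("Black-box prediction:", model_predict(data_row.reshape(1, -1))[0, label])
```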