An Introduction to Python Explainers
Once your model is successfully ingested into your TruEra workspace, the Explainer module of the Python SDK provides the tools and methods you can use to analyze and improve your models. In this guide, we'll cover the basics of Explainers and how to use them to ensure deeper trust in your models.
If you're looking for more information after reading this intro, check out the Python SDK Reference documentation.
Model Analysis in TruEra: Why use an explainer?
Once data and models are ingested into TruEra, two tools help you begin understanding and analyzing your model: (a) the TruEra Web App and (b) the Python SDK's Explainer module. The latter adds API support to the capabilities of the Web App and is valuable for:
- Enabling iterative data science workflows to train, ingest, and analyze your model in a single notebook, without having to switch to the Web App.
- Interacting more closely with explanations by retrieving feature influences and other analytics directly from a Python environment.
- Utilizing local compute mode for ingesting and analyzing your model locally, without having to upload it to a remote environment.
First things first
Before continuing with this tutorial, make sure you have ingested a project, model, and some data into the TruEra system. We'll also assume that you have installed and set up our Python SDK. If you haven't, this is a good time to do it.
Using an explainer
To use an explainer, you'll first need to create an explainer object, then set its context.
Create an Explainer Object
Local and remote explainers are used in nearly the same way. At a high level, an explainer is pinned to a given model that generates predictions. Thus, to get an explainer object, first set a model in the workspace context and then call the get_explainer() function as below:
tru.set_project("<MY PROJECT NAME>")
tru.set_model("<MY MODEL NAME>")
explainer = tru.get_explainer()
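In notebooks that switch between models often, the three calls above can be wrapped in a small helper. The following is a minimal sketch, assuming tru is an already-connected workspace object; the helper name explainer_for is hypothetical and not part of the SDK. It uses only the set_project, set_model, and get_explainer calls shown above.

```python
def explainer_for(tru, project_name, model_name):
    """Point the workspace at a project and model, then return an explainer.

    `tru` is assumed to be an existing workspace object exposing the
    set_project, set_model, and get_explainer calls shown above.
    This helper is illustrative only and is not part of the SDK.
    """
    tru.set_project(project_name)  # same order as the snippet above: project first
    tru.set_model(model_name)      # the explainer will be pinned to this model
    return tru.get_explainer()
```

A call such as explainer_for(tru, "<MY PROJECT NAME>", "<MY MODEL NAME>") would then return the same kind of object as the snippet above.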
Set the Explainer Context
The explainer object now allows you to compute analytics for the given model on one or more splits within the model's associated data collection. An explainer typically has a single base split, which is the primary split on which to benchmark a model, and one or more comparison splits, which may be used to benchmark against the base. As an example, consider how we might benchmark ROC-AUC for multiple splits:
explainer.set_base_data_split("<MY BASE SPLIT>")
# get AUC for base split only
base_split_auc = explainer.compute_performance("AUC")
explainer.set_comparison_data_splits(["<MY COMPARISON SPLIT>"])
# get pd.DataFrame of AUCs for base + compare splits
all_aucs = explainer.compute_performance("AUC")
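To make concrete what benchmarking ROC-AUC across splits means, here is a self-contained sketch that computes AUC for a base split and a comparison split in plain Python. It is independent of the SDK and does not reflect how compute_performance is implemented; the split names, labels, and scores are made-up illustrative data.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical splits: (true labels, model scores)
splits = {
    "base": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "comparison": ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.7]),
}

# One AUC value per split, keyed by split name
aucs = {name: roc_auc(labels, scores) for name, (labels, scores) in splits.items()}
for name, auc in aucs.items():
    print(f"{name}: AUC = {auc:.2f}")
```

Placing per-split values side by side like this is the same idea as the table of AUCs returned once comparison splits are set.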
To set the explainer context for an existing explainer object, use the Explainer.set_base_data_split and Explainer.set_comparison_data_splits functions. If you instead wish to create an explainer object with base and comparison splits already set, pass them to tru.get_explainer directly:
explainer = tru.get_explainer(
    base_data_split="<MY BASE SPLIT>",
    comparison_data_splits=["<MY COMPARISON SPLIT>"]
)
The corresponding Explainer.get_base_data_split and Explainer.get_comparison_data_splits functions return the base and comparison data splits that are currently set within the explainer context.
Note
To reset an explainer's context, such as its base or comparison split, you can retrieve a new "cleared" explainer object via tru.get_explainer(). Alternatively, you can pass None into the relevant setter function, e.g. via explainer.set_comparison_data_splits(None).
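The context behavior described in this section (a single base split, optional comparison splits, and clearing by passing None) can be modelled with a small stand-in class, which may be handy for testing notebook code offline. This is a hypothetical mock, not the SDK's Explainer: only the four method names are taken from the text above, and the return values (for example, an empty list after clearing) are assumptions.

```python
from typing import List, Optional, Sequence


class ExplainerContextStub:
    """Hypothetical stand-in that mimics only the split-context setters and getters."""

    def __init__(
        self,
        base_data_split: Optional[str] = None,
        comparison_data_splits: Optional[Sequence[str]] = None,
    ) -> None:
        self._base = base_data_split
        self._comparisons = list(comparison_data_splits or [])

    def set_base_data_split(self, name: Optional[str]) -> None:
        self._base = name

    def set_comparison_data_splits(self, names: Optional[Sequence[str]]) -> None:
        # Passing None clears the comparison splits (assumed here to give an empty list).
        self._comparisons = list(names or [])

    def get_base_data_split(self) -> Optional[str]:
        return self._base

    def get_comparison_data_splits(self) -> List[str]:
        return list(self._comparisons)


stub = ExplainerContextStub(base_data_split="train", comparison_data_splits=["test"])
stub.set_comparison_data_splits(None)  # clear comparisons, keep the base split
print(stub.get_base_data_split(), stub.get_comparison_data_splits())
```

The split names train and test are placeholders; in a real workspace they would be the names of splits in the model's data collection.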
The Explainer in Action
For guided exercises on using an explainer in a model development workflow, please see our sample notebooks on drift/stability and fairness.
Local/Remote Explainers vs. the TruEra Web App
The easiest and quickest way to interact with and analyze your model is via the local compute experience. However, some advanced capabilities may be available only through a remote explainer or in the Web App. See what's available below to determine which option best fits your use case:
Feature | Local Explainer | Remote Explainer | Web App |
---|---|---|---|
Explainer.compute_feature_influences [1] | | | |
Explainer.get_global_feature_importances | | | |
Explainer.get_feature_influences_for_data [2] | | | |
Explainer.plot_isps | | | |
Explainer.plot_pdps | | | |
Explainer.get_partial_dependencies | | | |
Explainer.get_spline_fitter | | | |
Explainer.compute_performance | | | |
Explainer.rank_performance | | | |
Explainer.compute_model_score_instability | | | |
Explainer.compute_fairness | | | |
Analyze by segment | | | |
Create/manage segments | | | |

[1] Used for an existing split.
[2] Used for new, non-split data.