Exploring NLP Explainability

This notebook explores TruEra's Covid Tweets NLP Demo project by showing you how to connect to a remote workspace, synchronize project data, and render widgets for notebook analysis.

In this exercise, you'll set the PyTorch device, connect to TruEra, and then compare two models trained on different amounts of data:

  • A sentiment classifier trained on 10k points
  • A sentiment classifier trained on 40k points

First, set the PyTorch device:

import torch
device = torch.device("cpu")

Next, connect to TruEra by:

  1. Providing the TruEra deployment URI, http://app.truera.net, as the connection string.
  2. Providing your connection token.
  3. Creating your TruEra workspace.

from truera.client.truera_authentication import TokenAuthentication
from truera.client.truera_workspace import TrueraWorkspace

TRUERA_URL = "<TRUERA_URL>"
TOKEN = "<TOKEN>"

auth = TokenAuthentication(TOKEN)
tru = TrueraWorkspace(TRUERA_URL, auth)

# Set up the remote environment
tru.set_project("Covid Tweets")
tru.set_data_collection("Data Collection")
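If you would rather not paste the URL and token into the notebook, one option is to read them from environment variables. The sketch below is only an illustration: the variable names TRUERA_URL and TRUERA_TOKEN and the helper load_connection_settings are assumptions made for this example and are not part of the TruEra SDK.

```python
import os


def load_connection_settings(env=os.environ):
    """Return (url, token) read from environment variables.

    TRUERA_URL and TRUERA_TOKEN are hypothetical variable names chosen
    for this sketch; they are not defined by the TruEra SDK.
    """
    names = ("TRUERA_URL", "TRUERA_TOKEN")
    # Treat unset and empty variables the same way.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["TRUERA_URL"], env["TRUERA_TOKEN"]
```

With both variables set before the Jupyter server starts, `TRUERA_URL, TOKEN = load_connection_settings()` could stand in for the two placeholder assignments, and the rest of the connection code stays unchanged.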

Interactive Widgets

If the widgets below do not load, check that the following dependencies are installed:

  • plotly ≥ 5.5
  • ipywidgets ≥ 7.7
  • notebook ≥ 6.4

Important

Having these dependencies in the kernel's environment alone is not enough. Make sure the Jupyter server's environment also has them installed.

The first time the server is started with these dependencies installed, you may need to close and reopen this notebook for the changes to take effect.
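A quick way to see which versions are present is to query the installed package metadata. The following is a minimal sketch using only the Python standard library; the helper names parse_major_minor and check_dependencies are made up for this example. It reports on whichever environment runs it, so run from a notebook cell it checks only the kernel; to check the Jupyter server's environment, run the same code with that environment's Python interpreter.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum (major, minor) versions taken from the dependency list above.
REQUIRED = {"plotly": (5, 5), "ipywidgets": (7, 7), "notebook": (6, 4)}


def parse_major_minor(version_string):
    """Return (major, minor) as integers from a string such as '7.7.1'."""
    numbers = [int(part) for part in re.findall(r"\d+", version_string)[:2]]
    while len(numbers) < 2:
        numbers.append(0)  # treat a bare '6' as 6.0
    return tuple(numbers)


def check_dependencies(required=REQUIRED):
    """Return {package: status} for the current Python environment."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
            continue
        ok = parse_major_minor(installed) >= minimum
        report[name] = f"{installed} ({'ok' if ok else 'too old'})"
    return report


for package, status in check_dependencies().items():
    print(f"{package}: {status}")
```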

10k Model Analysis

Now load the test split, influences, and predictions generated from the 10k model. The NLP SDK widgets visualize the most important tokens in each record and across the data split as a whole.

tru.set_model("bert-base-uncased_train_10k")
tru.set_data_split("test")
tru.get_explainer().global_token_summary()
tru.get_explainer().record_explanations_attribution_tab()

40k Model Analysis

For comparison, load the test split, influences, and predictions generated from the 40k model.

tru.set_model("bert-base-uncased_train_40k")
tru.set_data_split("test")
tru.get_explainer().global_token_summary()
tru.get_explainer().record_explanations_attribution_tab()
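Because both analyses repeat the same selection steps, they can be wrapped in a small helper when you want to switch between models often. This is only a sketch: select_explainer is a hypothetical name, and the function makes no SDK calls beyond those already used in this notebook (set_model, set_data_split, and get_explainer).

```python
def select_explainer(tru, model_name, split_name="test"):
    """Select a model and data split, then return the workspace explainer.

    `tru` is expected to be the TrueraWorkspace created earlier in this
    notebook. The helper itself is hypothetical, not part of the SDK.
    """
    tru.set_model(model_name)
    tru.set_data_split(split_name)
    return tru.get_explainer()


# Example use in a notebook cell (requires the connected workspace `tru`):
# select_explainer(tru, "bert-base-uncased_train_40k").global_token_summary()
```

Returning the explainer, rather than calling the widget methods inside the function, keeps each widget call as the last expression of a notebook cell, as in the cells above.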