NLP Diagnostics¶

Important note: TruEra's NLP diagnostics support is in limited release.

TruEra Diagnostics for NLP data and models is supported by both the Python SDK and the TruEra Web App.

NLP (short for natural language processing) is machine learning technology, engineered to analyze text and speech data, that teaches computers to understand human language. At its most effective, NLP can work through differences in dialect, slang, and the irregular grammar typical of everyday human conversation.

Well-formed NLP models handle voice and text data at scale from a range of communication channels (emails, text messages, social media newsfeeds, video, audio, and more). They analyze conversational human language to determine its intent or sentiment and respond in real time.

Ingestion¶

NLP data is ingested with the SDK's add_data() method. The experience is similar to that for tabular projects; the principal difference is that an NLPColumnSpec argument is substituted for ColumnSpec.
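To make the ColumnSpec versus NLPColumnSpec distinction concrete, here is a minimal, self-contained sketch. It does not import the TruEra SDK: the two classes below are local stand-ins, and every field name (id_col_name, text_col_name, label_col_name, pre_data_col_names, label_col_names) and both helper functions are assumptions made for illustration. Check the SDK reference for the actual parameters before relying on any of them. The sketch shows the general idea that a tabular spec names many feature columns while an NLP spec names a single text column, and it checks that the data contains the columns the spec refers to.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence

# Stand-ins only: these are NOT the TruEra SDK classes. All field names are
# assumptions chosen to illustrate the tabular-vs-NLP difference.


@dataclass
class ColumnSpec:
    """Hypothetical tabular spec: names several feature columns."""
    id_col_name: str
    pre_data_col_names: List[str] = field(default_factory=list)
    label_col_names: List[str] = field(default_factory=list)


@dataclass
class NLPColumnSpec:
    """Hypothetical NLP spec: names one text column instead of features."""
    id_col_name: str
    text_col_name: str
    label_col_name: Optional[str] = None


def referenced_columns(spec) -> List[str]:
    """List every column name a spec refers to (hypothetical helper)."""
    if isinstance(spec, NLPColumnSpec):
        cols = [spec.id_col_name, spec.text_col_name]
        if spec.label_col_name is not None:
            cols.append(spec.label_col_name)
        return cols
    return [spec.id_col_name, *spec.pre_data_col_names, *spec.label_col_names]


def missing_columns(rows: Sequence[Dict[str, object]], spec) -> List[str]:
    """Return spec columns absent from any row, as a pre-ingestion check."""
    return [c for c in referenced_columns(spec) if any(c not in r for r in rows)]


# A tiny text dataset: one id, one text field, one label per record.
reviews = [
    {"id": "r1", "text": "Great battery life.", "label": 1},
    {"id": "r2", "text": "Stopped working after a week.", "label": 0},
]

nlp_spec = NLPColumnSpec(id_col_name="id", text_col_name="text", label_col_name="label")
tabular_spec = ColumnSpec(
    id_col_name="id",
    pre_data_col_names=["age", "income"],
    label_col_names=["label"],
)

print(missing_columns(reviews, nlp_spec))      # the NLP spec matches the data
print(missing_columns(reviews, tabular_spec))  # the tabular spec does not
```

With the actual SDK, the spec object would be passed to add_data() together with the data, as described above. A check like missing_columns() is just one way to catch a mismatched spec before ingestion.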

The following quickstart notebooks are available for the model framework listed:

These notebooks cover more on explainability:

For local analysis, use the guidance in this notebook to download a remote project:

NLP Remote Project Download to Local

Exploring NLP Diagnostics in the Web App¶

Explore additional diagnostics in the TruEra Web App. The following pages function the same for NLP projects as for tabular projects, apart from the exceptions noted here:

Model Leaderboard
Model Summary
Test Harness
    Exception: Only performance tests can be created in NLP projects.
Performance
    Exception: Limited Performance page experience.
Segments
    Exception: Segments can only be defined on extra data. Influence analysis is not currently available.
Drift
    Exception: Only Model Score Drift is currently available. Influence analysis is not yet available.
Fairness
    Exception: Contribution to bias using influence is not available.
