NLP Diagnostics

Click here for an important note concerning the limited release of TruEra's NLP diagnostics support.

TruEra Diagnostics for NLP data and models is supported by both the Python SDK and the TruEra Web App.

Engineered to analyze text and speech data, NLP (short for natural language processing) is a machine learning technology that teaches computers to understand human language. At its most effective, NLP can work through differences in dialect, slang, and the inconsistent or irregular grammar typical of everyday human conversation.

Well-formed NLP models handle voice and text data at scale from a range of communication channels (emails, text messages, social media newsfeeds, video, audio, and more). They analyze conversational human language to determine its intent or sentiment and respond in real time.

Ingestion

NLP data is ingested with the SDK's add_data() method, and the experience is similar to ingestion for tabular projects. The principal difference is that you pass an NLPColumnSpec argument in place of ColumnSpec.
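To make the substitution concrete, here is a minimal, self-contained sketch. It does not import the TruEra SDK: the ColumnSpec and NLPColumnSpec classes and the add_data() function below are hypothetical stand-ins, and the field and parameter names (id_col_name, text_col_name, label_col_name, pre_data_col_names, data_split_name) are assumptions for illustration only. Consult the SDK reference for the actual signatures.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


# Hypothetical stand-in for a tabular column spec (NOT the TruEra class).
@dataclass
class ColumnSpec:
    id_col_name: str
    pre_data_col_names: List[str]


# Hypothetical stand-in for an NLP column spec (NOT the TruEra class).
# Instead of a list of feature columns, it names a single text column.
@dataclass
class NLPColumnSpec:
    id_col_name: str
    text_col_name: str
    label_col_name: Optional[str] = None


def add_data(
    data: List[Dict[str, object]],
    data_split_name: str,
    column_spec: Union[ColumnSpec, NLPColumnSpec],
) -> Dict[str, object]:
    """Stub standing in for an add_data() call.

    It only checks that the columns named by the spec exist in every row
    and returns a summary; it does not upload anything.
    """
    if isinstance(column_spec, NLPColumnSpec):
        kind = "nlp"
        required = [column_spec.id_col_name, column_spec.text_col_name]
        if column_spec.label_col_name is not None:
            required.append(column_spec.label_col_name)
    else:
        kind = "tabular"
        required = [column_spec.id_col_name, *column_spec.pre_data_col_names]

    missing = [c for c in required if any(c not in row for row in data)]
    if missing:
        raise ValueError(f"columns missing from data: {missing}")

    return {
        "split": data_split_name,
        "kind": kind,
        "rows": len(data),
        "columns": required,
    }


# Tabular project: the spec lists feature columns.
tabular_rows = [{"id": "a1", "age": 41, "income": 52000}]
tabular_summary = add_data(
    tabular_rows,
    data_split_name="train",
    column_spec=ColumnSpec(id_col_name="id", pre_data_col_names=["age", "income"]),
)

# NLP project: same call shape, but the spec names the text column.
nlp_rows = [
    {"id": "r1", "text": "Great battery life", "label": 1},
    {"id": "r2", "text": "Screen cracked in a week", "label": 0},
]
nlp_summary = add_data(
    nlp_rows,
    data_split_name="train",
    column_spec=NLPColumnSpec(
        id_col_name="id", text_col_name="text", label_col_name="label"
    ),
)
```

The only change between the two calls is the spec object, which is the point the paragraph above makes. Everything else about the call shape stays the same in this sketch.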

The following quickstart notebooks are available for the model framework listed:

These notebooks cover more on explainability:

For local analysis, use the guidance in this notebook to download a remote project:

Exploring NLP Diagnostics in the Web App

Explore additional diagnostics in the TruEra Web App, where the following pages function for NLP projects the same as for Tabular projects, save for the exceptions cited here:

  • Model Leaderboard
  • Model Summary
  • Test Harness
    Exception: Only performance tests can be created in NLP projects
  • Performance
    Exception: Limited Performance page experience
  • Segments
    Exception: Segments can only be defined on extra data. Influence analysis is not currently available.
  • Drift
    Exception: Only Model Score Drift is available at the moment. Influence analysis is not yet available.
  • Fairness
    Exception: Contribution to bias using influence is not available
