Model Ingestion

As with data ingestion, TruEra supports two clients for model ingestion/import: the Python SDK and TruEra's CLI. Several paths are available, depending on whether and how your model is already packaged. If you're a new user, you may want to do a quick review of the general project structure supported by TruEra.

Basic ingestion commands and concepts can be explored in Steps 2–5 under Adding a Project.

First things first

Before ingesting a model, you'll first need to ingest a data collection and split structure. See Data Ingestion for guidance.

Using the Python SDK for Model Ingestion

For ingestion of Python-based models, use the SDK's TrueraWorkspace.add_python_model() method if your model isn't packaged. Use TrueraWorkspace.add_packaged_python_model() if it is packaged. See the diagnostics quickstart or Python SDK Reference for instructions.

Using the CLI for Model Ingestion

Use the CLI command tru add model to ingest a model into the TruEra system. For guidance on usage and associated functions, such as attaching a data collection to a particular model or packaging a particular model type, see the CLI reference.

Which is the Best Way?

Recommendations for Python and Java models are available from the respective links. If these solutions are not successful, you can ingest a virtual model, which allows you to upload model predictions and feature influences without access to an executable model.

Post-ingestion Feature Transformations

Using QII for influences, you can wrap any set of model transformations and provide influences with respect to pre-transformed (human-readable) features.

Complex transformations like dimensionality reductions (e.g., PCA) can be packaged as a part of the model object itself. However, there are optimizations that can be enabled for simpler one-to-many feature transformations that map a single pre-transform feature to a unique set of post-transform features. Examples of such feature transformations include normalizations (z-scoring, mean corrections), one/multi-hot encodings, imputations, and beyond.

To enable these optimizations post-ingestion, you have two options (the first is recommended):

  1. Python models – if the transformation from raw human-readable data to model-readable data can be expressed as a function, add this function to the packaged model wrapper as an additional transform function. Details can be found in Custom Data Transformation.

  2. Java models – if the transformation cannot be simply expressed as a function, you can instead capture data before and after the transformation and ingest them as pre-transform data and post-transform data.

In both cases, add a feature mapping from the columns of the pre-transformed data to the columns of the post-transformed data. This can be done via the CLI or with the Python SDK during data collection creation.
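To make the idea of a one-to-many feature mapping concrete, here is a minimal sketch in plain Python. It is not TruEra's API: the helper build_feature_map, the column names, and the naming convention (a post-transform column is either identical to its source column or starts with the source name plus an underscore) are all assumptions made for illustration. How the resulting mapping is passed to the CLI or SDK is covered in the respective references.

```python
def build_feature_map(pre_columns, post_columns, sep="_"):
    """Map each pre-transform column to the post-transform columns derived from it.

    Hypothetical helper. Assumes a post-transform column either equals its
    source column name or starts with '<source><sep>'.
    """
    feature_map = {col: [] for col in pre_columns}
    # Check longer source names first so 'state_code' is preferred over 'state'.
    candidates = sorted(pre_columns, key=len, reverse=True)
    for post in post_columns:
        for pre in candidates:
            if post == pre or post.startswith(pre + sep):
                feature_map[pre].append(post)
                break  # each post-transform column gets exactly one source
        else:
            raise ValueError(f"no pre-transform source found for {post!r}")
    return feature_map


# Illustrative columns: 'age' passes through unchanged, 'income' is z-scored,
# and 'state' is one-hot encoded into three columns.
pre_columns = ["age", "income", "state"]
post_columns = ["age", "income_zscore", "state_CA", "state_NY", "state_TX"]

feature_map = build_feature_map(pre_columns, post_columns)
for pre, posts in feature_map.items():
    print(pre, "->", posts)
```

Because every post-transform column is assigned to exactly one pre-transform column, the result satisfies the "unique set of post-transform features" condition described above. A transformation such as PCA would not fit this shape, since each output column depends on several inputs.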

Post-ingestion Linear and Tree Model Optimizations

TruEra optimizations for tree-based models and scikit-learn sklearn.linear_model models are enabled by ingesting the model object directly using the Python SDK. Alternatively, you can add a get_model function to your packaged model wrapper (see Model Packaging and Execution).
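The sketch below illustrates the general shape of a wrapper that exposes its underlying model through get_model. It is only an illustration under stated assumptions: the class names TinyLinearModel and PackagedModelWrapper, the predict signature, and the stand-in model are hypothetical, and the actual wrapper interface TruEra expects is defined in Model Packaging and Execution.

```python
class TinyLinearModel:
    """Stand-in for a fitted linear model (hypothetical, for illustration only)."""

    def __init__(self, coef, intercept):
        self.coef_ = coef
        self.intercept_ = intercept

    def predict(self, rows):
        # Dot product of coefficients and features, plus the intercept.
        return [
            sum(c * x for c, x in zip(self.coef_, row)) + self.intercept_
            for row in rows
        ]


class PackagedModelWrapper:
    """Hypothetical wrapper; the real interface may differ."""

    def __init__(self, model):
        self._model = model

    def predict(self, rows):
        # Delegate scoring to the wrapped model.
        return self._model.predict(rows)

    def get_model(self):
        # Return the underlying model object so its structure
        # (e.g. coefficients or trees) can be inspected directly.
        return self._model


wrapper = PackagedModelWrapper(TinyLinearModel(coef=[2.0, -1.0], intercept=0.5))
predictions = wrapper.predict([[1.0, 2.0], [3.0, 0.0]])
underlying = wrapper.get_model()
print(predictions, underlying.coef_, underlying.intercept_)
```

The point of the design is that the wrapper does not hide the model: predictions go through the wrapper, while get_model hands back the original object whose internals a model-specific optimization would need.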
