Ingesting Natively Supported Python Models¶
TruEra supports easy and direct upload of model objects for most popular Python modeling frameworks.
Tip
For the models listed below, you can directly ingest the model object via the Python SDK. If your model is not listed, ingest it as a packaged model via either the Python client or the CLI client.
| Natively Supported Python Models |
| --- |
| XGBoost |
| CatBoost |
| LightGBM (specifically `lightgbm.Booster`, `lightgbm.LGBMClassifier`, and `lightgbm.LGBMRegressor` models) |
| scikit-learn (specifically any sklearn model that implements a `predict_proba` function, including `sklearn.pipeline`) |
| PySpark tree (specifically `GBTClassificationModel`, `RandomForestClassificationModel`, `DecisionTreeRegressionModel`, and `RandomForestRegressionModel` models) |
| Most arbitrary Python callables that map a `pandas.DataFrame` of feature values to a DataFrame of model outputs. |
Ingesting Python Model Objects¶
For the compatible models listed above, the best way to ingest a model is to call the Python SDK's `TrueraWorkspace.add_python_model` function and pass the model object directly.
``` Python
tru.set_environment("local") # this is the default option
model_name = <model name>
model = <modelObj> # replace with your model object
tru.add_python_model(model_name, model)
```
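For example, a minimal end-to-end sketch with a scikit-learn classifier might look like the following. The dataset, model choice, and model name here are illustrative, and `tru` is assumed to be an already-connected `TrueraWorkspace`:

``` Python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Illustrative training data; any sklearn model exposing predict_proba works the same way.
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Ingest the trained model object directly into the local environment.
tru.set_environment("local")
tru.add_python_model("rf_breast_cancer", model)
```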
Local compute
When `add_python_model` is called, the model is added to a local TruEra environment by default. In this mode, the project, model, and any associated data live only on the local machine. Generating predictions, influences, and lightweight analyses such as accuracy and feature importance are all supported locally. The local compute experience is generally faster than triggering influence jobs on a remote machine, and it is also a great way to sanity check that the model will run in a remote environment prior to upload. However, it does not support advanced options such as pulling data splits from external sources or running all model types. For a full walkthrough of the local compute experience, refer to the local flow notebook example.
tru.set_environment("remote")
model_name = <model name>
model = <modelObj> # replace with your model object
tru.add_python_model(model_name, model)
Adding Custom Python Models¶
With TruEra, you can conveniently add any model to the platform through the Python SDK simply by defining a prediction function and adding it with `add_python_model()`, with no packaging required. The code for this is straightforward:
``` Python
def predict(X):
    return <your custom model>.predict_function(X)

tru.add_python_model("custom_model_name", predict)
```
If for any reason your model object cannot be directly ingested, you can package it instead, then upload it into the TruEra ecosystem, as discussed next.
Packaging and Ingesting Python Models¶
Be aware that the local compute experience is not supported for models ingested this way, although you can still interact with them remotely via the TruEra Web App and Python SDK remote experience.
``` Python
import pandas as pd

# Define your model object or function
def truera_predict_callable(x: pd.DataFrame):
    return model.predict(x)

# Specify an empty directory in which to package the model
packaged_model_path = "<path-to-empty-folder>"

# Create a skeleton packaged model with some additional pip dependencies
tru.create_packaged_python_model(
    packaged_model_path,
    model_obj=truera_predict_callable,
    additional_pip_dependencies=[
        f"package_1=={package_1.__version__}",
        f"package_2=={package_2.__version__}",
        f"package_3=={package_3.__version__}",
    ],
)

# You can now edit your packaged model as necessary with custom logic!

# Locally verify that the model was packaged correctly
tru.verify_packaged_model(packaged_model_path)

# Ingest the packaged model into TruEra
tru.add_packaged_python_model("model_name", packaged_model_path)
```
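For instance, if the prediction callable above depends on xgboost and pandas, the dependency list might look like the following sketch (the packages shown are purely illustrative, and `packaged_model_path` and `truera_predict_callable` are the same placeholders used above):

``` Python
import pandas as pd
import xgboost

# Pin the packages the wrapped predict function will need at prediction time.
tru.create_packaged_python_model(
    packaged_model_path,
    model_obj=truera_predict_callable,
    additional_pip_dependencies=[
        f"xgboost=={xgboost.__version__}",
        f"pandas=={pd.__version__}",
    ],
)
```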
Package the model in one go as shown above, or step through the process manually:

- Create a skeleton model wrapper via `TrueraWorkspace.create_packaged_python_model()`.
- Edit the generated wrapper file to properly deserialize your model and implement the `predict` function.
- Add any Python package dependencies needed for deserialization and prediction to the generated `conda.yaml` file.
- Add the model to the remote environment via `TrueraWorkspace.add_packaged_python_model()`.