
Ingesting Natively Supported Python Models

TruEra supports easy and direct upload of model objects for most popular Python modeling frameworks.

Tip

For the models listed below, you can ingest the model object directly via the Python SDK. If your model is not listed, ingest a packaged model via either the Python client or the CLI client.

Natively Supported Python Models
XGBoost
CatBoost
LightGBM (specifically lightgbm.Booster, lightgbm.LGBMClassifier, and lightgbm.LGBMRegressor models)
scikit-learn (specifically any sklearn model that implements a predict_proba function, including sklearn.pipeline)
PySpark tree (specifically GBTClassificationModel, RandomForestClassificationModel, DecisionTreeRegressionModel, and RandomForestRegressionModel models)
Most arbitrary Callable Python functions that map a pandas.DataFrame of feature values to a DataFrame of model outputs.
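
Most entries in the list above come down to one of two interfaces: an estimator exposing `predict_proba`, or a plain callable. As a rough pre-flight check before calling the SDK, the sketch below sorts an object into one of those shapes. The helper name `describe_model_shape` and its return labels are hypothetical, and this heuristic is not how TruEra itself decides compatibility; framework-specific objects such as a `lightgbm.Booster` would fall into the "other" bucket here even though they are supported.

```python
from typing import Any


def describe_model_shape(obj: Any) -> str:
    """Roughly classify an object by the interface it exposes.

    Hypothetical helper for illustration only; not part of the TruEra SDK.
    """
    # Estimator-style objects (e.g. scikit-learn classifiers) expose predict_proba.
    if callable(getattr(obj, "predict_proba", None)):
        return "predict_proba"
    # Plain functions, lambdas, and objects defining __call__ are callables.
    if callable(obj):
        return "callable"
    # Anything else (e.g. a framework-specific booster) needs a closer look.
    return "other"


class ToyClassifier:
    """Minimal stand-in for an estimator with a predict_proba method."""

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


print(describe_model_shape(ToyClassifier()))  # predict_proba
print(describe_model_shape(lambda X: X))      # callable
print(describe_model_shape(object()))         # other
```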

Ingesting Python Model Objects

For the compatible models listed above, the best way to ingest a model is to call the Python SDK's TrueraWorkspace.add_python_model function and pass the model object directly.

```python
tru.set_environment("local")  # this is the default option

model_name = "<model name>"  # replace with your model name

model = <modelObj>  # replace with your model object

tru.add_python_model(model_name, model)
```

Local compute

By default, add_python_model adds the model to a local TruEra environment. In this mode, the project, the model, and any associated data live only on the local machine. Generating predictions and influences, along with lightweight analysis such as computing accuracy and feature importance, is supported locally. Local compute is generally faster than triggering influence jobs on a remote machine, and it is a good way to sanity-check that the model will run in a remote environment before you upload it. However, it does not support advanced options such as pulling data splits from external sources, and it cannot run every model type. For a full walkthrough of the local compute experience, refer to the local flow notebook example.

To add the model to a remote environment instead, switch the environment to "remote" before calling add_python_model:

```python
tru.set_environment("remote")

model_name = "<model name>"  # replace with your model name

model = <modelObj>  # replace with your model object

tru.add_python_model(model_name, model)
```

Adding Custom Python Models

You can add any model to TruEra through the Python SDK, without packaging it, by defining a prediction function and passing that function to add_python_model().

The code for this is short:

```python
def predict(X):
    return <your custom model>.predict_function(X)

tru.add_python_model("custom_model_name", predict)
```
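
To make the shape of such a prediction function concrete, here is a minimal sketch of a callable that maps a pandas.DataFrame of features to a DataFrame of outputs. The `ThresholdModel` class, the `income` feature, and the `prediction` column name are all hypothetical stand-ins chosen for illustration; the exact output layout TruEra expects is not specified here, so treat this as an assumption rather than the required format.

```python
import pandas as pd


class ThresholdModel:
    """Hypothetical stand-in for a custom model; not a TruEra class."""

    def __init__(self, column: str, threshold: float):
        self.column = column
        self.threshold = threshold

    def score(self, X: pd.DataFrame) -> pd.Series:
        # 1.0 when the feature exceeds the threshold, else 0.0.
        return (X[self.column] > self.threshold).astype(float)


custom_model = ThresholdModel(column="income", threshold=50_000)


def predict(X: pd.DataFrame) -> pd.DataFrame:
    # Return a DataFrame aligned to the input index, one row per input row.
    return pd.DataFrame({"prediction": custom_model.score(X)}, index=X.index)


features = pd.DataFrame({"income": [30_000, 75_000]}, index=["a", "b"])
outputs = predict(features)
print(outputs)

# With a workspace available, the function would then be registered as in the text:
# tru.add_python_model("custom_model_name", predict)
```

Keeping the input index on the output makes it easy to join predictions back to the rows that produced them.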

If for any reason your model object cannot be directly ingested, you can package it instead, then upload it into the TruEra ecosystem, as discussed next.

Packaging and Ingesting Python Models

Be aware that the local compute experience is not supported for models ingested this way, although you can still interact with them remotely via the TruEra Web App and Python SDK remote experience.

```python
import pandas as pd

# Define your model object or function
def truera_predict_callable(x: pd.DataFrame):
    return model.predict(x)

# Specify empty directory in which to package model
packaged_model_path = "<path-to-empty-folder>"

# Create a skeleton packaged model with some additional dependencies
tru.create_packaged_python_model(
    packaged_model_path,
    model_obj=truera_predict_callable,
    # Adjacent f-strings are concatenated into one comma-separated string
    additional_pip_dependencies=(
        f"package_1=={package_1.__version__},"
        f"package_2=={package_2.__version__},"
        f"package_3=={package_3.__version__}"
    ),
)

# You can now edit your packaged model as necessary with custom logic!

# Locally verify the model was packaged correctly
tru.verify_packaged_model(packaged_model_path)

# Ingest the packaged model into TruEra
tru.add_packaged_python_model("model_name", packaged_model_path)
```

You can package the model in one go, as shown above, or work through the steps individually:

  1. Create a skeleton model wrapper via TrueraWorkspace.create_packaged_python_model().
  2. Edit the generated wrapper file so that it correctly deserializes your model and implements the predict function.
  3. Add any Python package dependencies needed for deserialization and prediction to the generated conda.yaml file.
  4. Add the model to the remote environment via TrueraWorkspace.add_packaged_python_model().
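
When listing dependencies for the packaged model, pinning exact versions helps the remote environment match the one the model was built in. Not every package exposes a `__version__` attribute, so one option is to read installed versions with the standard library's `importlib.metadata.version`. The sketch below builds pip-style `name==version` pins; the helper name `pin_requirements` and its `lookup` parameter are hypothetical, and the demo uses a fixed table of made-up versions rather than the live environment.

```python
from importlib import metadata
from typing import Callable, Iterable, List


def pin_requirements(
    packages: Iterable[str],
    lookup: Callable[[str], str] = metadata.version,
) -> List[str]:
    """Return pip-style 'name==version' pins for the given distribution names.

    Hypothetical helper; `lookup` defaults to importlib.metadata.version and
    can be swapped out, for example in tests.
    """
    return [f"{name}=={lookup(name)}" for name in packages]


# Demonstration with a fixed lookup table instead of the live environment.
fake_versions = {"package_1": "1.4.2", "package_2": "0.9.0"}
pins = pin_requirements(fake_versions, lookup=fake_versions.__getitem__)
print(pins)            # ['package_1==1.4.2', 'package_2==0.9.0']
print(",".join(pins))  # package_1==1.4.2,package_2==0.9.0
```

Called without `lookup`, the helper queries the current environment and raises `importlib.metadata.PackageNotFoundError` for a distribution that is not installed.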
