Ingesting Natively Supported Python Models¶
TruEra supports easy and direct upload of model objects for most popular Python modeling frameworks.
Tip
For the models listed below, you can directly ingest the model object via the Python SDK. If your model is not listed, ingest it as a packaged model via either the Python client or the CLI client.
| Natively Supported Python Models |
| --- |
| XGBoost |
| CatBoost |
| LightGBM (specifically `lightgbm.Booster`, `lightgbm.LGBMClassifier`, and `lightgbm.LGBMRegressor` models) |
| scikit-learn (specifically any sklearn model that implements a `predict_proba` function, including `sklearn.pipeline`) |
| PySpark tree (specifically `GBTClassificationModel`, `RandomForestClassificationModel`, `DecisionTreeRegressionModel`, and `RandomForestRegressionModel` models) |
| Most arbitrary Python callables that map a `pandas.DataFrame` of feature values to a DataFrame of model outputs. |
Ingesting Python Model Objects¶
For the compatible models listed above, the best way to ingest a model is to call the Python SDK's `TrueraWorkspace.add_python_model` function and pass the model object directly.
``` Python
tru.set_environment("local") # this is the default option
model_name = <model name>
model = <modelObj> # replace with your model object
tru.add_python_model(model_name, model)
```
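For example, a minimal end-to-end sketch with a scikit-learn classifier might look like the following. The dataset, model choice, and model name here are illustrative, and `tru` is assumed to be an already-connected `TrueraWorkspace`:

``` Python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Illustrative training data; any sklearn model exposing predict_proba works the same way.
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Ingest the trained model object directly into the local environment.
tru.set_environment("local")
tru.add_python_model("rf_breast_cancer", model)
```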
Local compute
When `add_python_model` is called, the model is added to a local TruEra environment by default. In this mode, the project, model, and any associated data live only on the local machine. Generating predictions, influences, and lightweight analyses such as accuracy and feature importance are all supported locally. The local compute experience is generally faster than triggering influence jobs on a remote machine, and it is also a great way to sanity check that the model will run in a remote environment prior to upload. However, it does not support advanced options such as pulling data splits from external sources or running all model types. For a full walkthrough of the local compute experience, refer to the local flow notebook example.
tru.set_environment("remote")
model_name = <model name>
model = <modelObj> # replace with your model object
tru.add_python_model(model_name, model)
Adding Custom Python Models¶
With TruEra, you can conveniently add any model to the platform through the Python SDK simply by defining a prediction function and adding it with `add_python_model()`, with no packaging required. The code for this is straightforward:
``` Python
def predict(X):
    return <your custom model>.predict_function(X)

tru.add_python_model("custom_model_name", predict)
```
If for any reason your model object cannot be directly ingested, you can package it instead, then upload it into the TruEra ecosystem, as discussed next.
Packaging and Ingesting Python Models¶
Be aware that the local compute experience is not supported for models ingested this way, although you can still interact with them remotely via the TruEra Web App and Python SDK remote experience.
``` Python
import pandas as pd

# Define your model object or function
def truera_predict_callable(x: pd.DataFrame):
    return model.predict(x)

# Specify an empty directory in which to package the model
packaged_model_path = "<path-to-empty-folder>"

# Create a skeleton packaged model with some additional pip dependencies
tru.create_packaged_python_model(
    packaged_model_path,
    model_obj=truera_predict_callable,
    additional_pip_dependencies=[
        f"package_1=={package_1.__version__}",
        f"package_2=={package_2.__version__}",
        f"package_3=={package_3.__version__}",
    ],
)

# You can now edit your packaged model as necessary with custom logic!

# Locally verify that the model was packaged correctly
tru.verify_packaged_model(packaged_model_path)

# Ingest the packaged model into TruEra
tru.add_packaged_python_model("model_name", packaged_model_path)
```
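For instance, if the prediction callable above depends on xgboost and pandas, the dependency list might look like the following sketch (the packages shown are purely illustrative, and `packaged_model_path` and `truera_predict_callable` are the same placeholders used above):

``` Python
import pandas as pd
import xgboost

# Pin the packages the wrapped predict function will need at prediction time.
tru.create_packaged_python_model(
    packaged_model_path,
    model_obj=truera_predict_callable,
    additional_pip_dependencies=[
        f"xgboost=={xgboost.__version__}",
        f"pandas=={pd.__version__}",
    ],
)
```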
Package the model in one go as shown above, or step through the process manually:

- Create a skeleton model wrapper via `TrueraWorkspace.create_packaged_python_model()`.
- Edit the generated wrapper file to properly deserialize your model and implement the `predict` function.
- Add any Python package dependencies needed for deserialization and prediction to the generated `conda.yaml` file.
- Add the model to the remote environment via `TrueraWorkspace.add_packaged_python_model()`.