Batch Ingestion and Prediction Tagging
Ingesting batches of production data for monitoring is similar to diagnostics ingestion, albeit with the additional requirement of including a timestamp column. This timestamp should represent the event's prediction time.
Batch Ingestion
To ingest production data in batch with TruEra's Python SDK, call the add_production_data() method:
tru.add_production_data(
    pd,
    column_spec=ColumnSpec(
        id_col_name="id",
        pre_data_col_names=pre_data_names,
        timestamp_col_name="prediction_time",
        label_col_names=["label"],
    ),
)
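Because the timestamp column is what distinguishes production ingestion from diagnostics ingestion, it can help to confirm that every column named in the column spec is actually present before making the call. The following is a minimal sketch in plain Python, not part of the TruEra SDK: the helper missing_columns and the sample column names are hypothetical, and the list of available names is assumed to come from your own data (for example, the column names of the batch you are about to ingest).

```python
from typing import Iterable, List


def missing_columns(
    available: Iterable[str],
    id_col: str,
    timestamp_col: str,
    pre_data_cols: Iterable[str],
    label_cols: Iterable[str] = (),
) -> List[str]:
    """Return the required column names that are absent from `available`.

    Hypothetical helper for illustration; it is not a TruEra SDK function.
    """
    present = set(available)
    required = [id_col, timestamp_col, *pre_data_cols, *label_cols]
    # Keep the order of `required` so the report is easy to read.
    return [name for name in required if name not in present]


# Hypothetical batch whose timestamp column was forgotten.
batch_columns = ["id", "age", "income", "label"]
missing = missing_columns(
    batch_columns,
    id_col="id",
    timestamp_col="prediction_time",
    pre_data_cols=["age", "income"],
    label_cols=["label"],
)
print(missing)  # ['prediction_time']
```

An empty result means every column referenced by the column spec exists in the batch.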
Prediction Tagging for Monitoring Segmentation
Certain Monitoring dashboard panels must be filtered to show only predictions that carry a given tag. Set up these views during dashboard creation, using tags that meet the following specification:
- A prediction tag can be any string of up to 30 characters.
- The maximum number of tags attached to a given prediction is 12.
- The data type of the input column must be either string or a list of strings. Other data types will be implicitly converted during ingestion.
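The limits above can be checked on your side before ingestion. The sketch below is one possible pre-ingestion check, not TruEra's own validation: the function tag_problems and the constant names are hypothetical, a plain string is assumed to count as a single tag, and values of other types are flagged rather than left to the implicit conversion mentioned above.

```python
from typing import List, Union

MAX_TAG_LENGTH = 30            # per-tag character limit from the specification
MAX_TAGS_PER_PREDICTION = 12   # per-prediction tag limit from the specification


def tag_problems(value: Union[str, List[str]]) -> List[str]:
    """Return human-readable problems; an empty list means the value conforms.

    Hypothetical helper for illustration; it is not a TruEra SDK function.
    """
    if isinstance(value, str):
        # Assumption: a plain string is treated as one tag.
        tags = [value]
    elif isinstance(value, list) and all(isinstance(t, str) for t in value):
        tags = value
    else:
        return ["value must be a string or a list of strings"]

    problems = []
    if len(tags) > MAX_TAGS_PER_PREDICTION:
        problems.append(
            f"{len(tags)} tags exceeds the limit of {MAX_TAGS_PER_PREDICTION}"
        )
    for tag in tags:
        if len(tag) > MAX_TAG_LENGTH:
            problems.append(
                f"tag {tag!r} is longer than {MAX_TAG_LENGTH} characters"
            )
    return problems


print(tag_problems(["mobile", "eu-west"]))  # []
print(tag_problems("x" * 31))               # one problem: tag too long
```

Running a check like this over each value in the tags column surfaces rows that would exceed the limits before they reach the ingestion call.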
Ingest these tags during prediction ingestion by specifying tags_col_name in the add_production_data() call, as shown next:
tru.add_production_data(
    pd,
    data_split_name="my-prod-split",
    column_spec=ColumnSpec(
        id_col_name="id",
        pre_data_col_names=pre_data_names,
        timestamp_col_name="prediction_time",
        label_col_names=["label"],
        tags_col_name="tags",
    ),
)