
Reading Local Files

TruEra supports ingesting records from local CSV and Parquet files.

Data Ingestion Basics

TruEra's file ingestion relies on the user registering a file as a Table data source. Once registered, a table can be ingested with add_data() or add_production_data(), just like a dataframe.

The steps to ingest local files into TruEra are:

  1. Add the data source containing the Table object (see add_data_source()).
  2. Add the data from the table, either to create a data split or to ingest it into a production data stream (see add_data() and add_production_data()).

The example below assumes tru is an already-connected TruEra workspace object.

import pandas as pd

# Save pandas DataFrame as local CSV file
pd.DataFrame({
    "id": [1, 2, 3],
    "feature1": [0, 0, 1],
    "feature2": [1, 1, 0]
}).to_csv("data.csv", index=False)  # index=False avoids writing an extra unnamed index column

# Add data source
data_source = tru.add_data_source("data.csv")

# Create split from data source
from truera.client.ingestion import ColumnSpec
tru.add_data(
    data_source,
    data_split_name="split_1",
    column_spec=ColumnSpec(
        id_col_name="id",
        pre_data_col_names=["feature1", "feature2"]
    )
)
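
Before registering a file, it can help to confirm that it contains the columns named in the column spec. The sketch below is not part of the TruEra SDK: check_csv_for_ingestion is a hypothetical helper built only on the Python standard library. It also assumes that row IDs should be unique and that an unnamed header column is unwanted. Both are assumptions made for illustration, not documented TruEra requirements.

```python
import csv
import os
import tempfile


def check_csv_for_ingestion(path, id_col, feature_cols):
    """Return a list of problems found in the CSV at `path` (empty if none).

    Hypothetical pre-flight helper; it does not call any TruEra API.
    """
    problems = []
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        header = reader.fieldnames or []

        # Every column named in the column spec must exist in the file.
        missing = [c for c in [id_col, *feature_cols] if c not in header]
        if missing:
            problems.append(f"missing columns: {missing}")
            return problems

        # An empty column name usually means a pandas index was written to the file.
        if any(name == "" for name in header):
            problems.append("unnamed column in header (index written to file?)")

        # Assumption: row IDs are expected to be unique.
        seen = set()
        duplicates = set()
        for row in reader:
            value = row[id_col]
            if value in seen:
                duplicates.add(value)
            seen.add(value)
        if duplicates:
            problems.append(f"duplicate ids: {sorted(duplicates)}")
    return problems


# Demo on a throwaway file with the same columns as the example above.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "data.csv")
    with open(demo_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "feature1", "feature2"])
        writer.writerows([[1, 0, 1], [2, 0, 1], [3, 1, 0]])
    demo_problems = check_csv_for_ingestion(
        demo_path, "id", ["feature1", "feature2"]
    )

print(demo_problems)
```

An empty list means the file passed these checks and can be handed to add_data_source() as in the example above. Any message in the list points at something to fix in the file first.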
