Ingesting Production Data
For most real-time monitoring use cases, a data pipeline delivers the data to be processed by your model. Data can be pushed in batch or pulled into the pipeline on a defined schedule.
Batch ingestion collects and transfers data to TruEra in batches, either at scheduled intervals or asynchronously, on demand. This is especially useful when you need to model specific data points on a daily basis or when the ingested data can be assembled in microbatches. Historically, most near real-time monitoring has been done via microbatches.
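To make the microbatch idea concrete, here is a minimal Python sketch of how a stream of production records could be grouped into small, fixed-size batches before being pushed. It is an illustration only and does not use the TruEra SDK: the `microbatches` helper and the `push_batch` function are hypothetical names introduced for this example, and `push_batch` is a stand-in for whatever ingestion call your pipeline actually makes.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def microbatches(records: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        # Take up to batch_size records; the final batch may be smaller.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical stand-in for a real ingestion call; it only records
# what would have been sent so the example stays self-contained.
sent_batches: List[List[dict]] = []


def push_batch(batch: List[dict]) -> None:
    sent_batches.append(batch)


# Five example prediction records, pushed in microbatches of two.
records = [{"id": i, "prediction": i * 0.1} for i in range(5)]
for batch in microbatches(records, batch_size=2):
    push_batch(batch)
```

Because the helper consumes any iterable lazily, the same loop works for a finite daily extract or for a generator that reads from a queue. The batch size is the tuning knob: smaller batches move closer to real time, at the cost of more ingestion calls.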
Data for both types of production data ingestion (batch push or scheduled pull) comes in essentially two flavors:
Structured data or tabular data: data in a database or data warehouse, highly organized so that it can be easily searched, changed, and analyzed.
Unstructured data: typically rich media such as long-form text, audio, and video. It is commonly estimated to account for about 80% of enterprise data and is often difficult to manage, store, and analyze because it has no predefined format or structure, which prevents the information from being organized automatically.
TruEra monitoring dashboards can handle models of any type when monitoring focuses on model output, labels, custom metrics, and segment tags. Tracking data drift and data quality, however, is currently supported for tabular inputs only.
Click a link below to explore your options and determine what's suitable for your particular use case.