Airbyte is an EL(T) platform that reads from a variety of data sources and writes to a centralized data store, taking the pain out of data ingestion. Getting data into stores like data lakes and warehouses has become a common pain point for teams running data warehouses and lakehouses. Many of these lakehouses use open file formats, such as CSV, JSON, and Parquet, that can be processed by a potpourri of tools.
Trino, formerly PrestoSQL, is a massively parallel processing (MPP) distributed SQL query engine for analytics at petabyte scale that helps you run transformations and queries on data lakes. Trino was commonly used for interactive analytics, where it was often chosen for its query speed, but it did not support fault tolerance, a deliberate choice to avoid overhead. Trino’s recent addition of fault-tolerant execution has expanded its options and made it a more suitable engine for running ELT workloads. Airbyte and Trino offer a powerful end-to-end solution for ingesting and transforming data quickly at scale. In this talk, we will chat about the history of Trino, cover ingesting data with Airbyte, and show an example of these two technologies in action.