We all know Airbyte, but how does it relate to Prefect?
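As a taste of how the two can fit together, here is a minimal sketch of a Prefect flow that triggers an Airbyte sync through the Airbyte open-source API. The host URL and connection ID are placeholders, and the endpoint path is an assumption based on the Airbyte OSS API; the session covers the integration in full.

```python
import requests
from prefect import flow, task

AIRBYTE_API = "http://localhost:8000/api/v1"     # assumed local Airbyte OSS deployment
CONNECTION_ID = "<your-airbyte-connection-id>"   # placeholder

@task(retries=2, retry_delay_seconds=30)
def trigger_airbyte_sync(connection_id: str) -> dict:
    """Kick off a sync for one Airbyte connection and return the job payload."""
    resp = requests.post(
        f"{AIRBYTE_API}/connections/sync",
        json={"connectionId": connection_id},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

@flow(name="airbyte-el")
def airbyte_el():
    job = trigger_airbyte_sync(CONNECTION_ID)
    print(f"Started Airbyte job: {job}")

if __name__ == "__main__":
    airbyte_el()
```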
The first principle was to make it ‘analytics engineer first’. We wanted it to be as integrated as possible with existing dbt workflows and to fit into the jobs, configuration, and alerting that are already in place.
In this session, we will cover how we connect the different components of the modern data stack (Airbyte for EL, dbt for T, Airflow for Orchestration, and Superset for viz) to create an end-to-end experience.
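For illustration, one common way to wire these pieces together is an Airflow DAG that triggers the Airbyte sync and then runs dbt. This is a hedged sketch, not the session's exact setup: the Airbyte connection ID, the Airflow connection name, and the dbt project path are placeholders.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.providers.airbyte.operators.airbyte import AirbyteTriggerSyncOperator

with DAG(
    dag_id="modern_data_stack",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # EL: trigger the Airbyte connection that lands raw data in the warehouse
    extract_load = AirbyteTriggerSyncOperator(
        task_id="airbyte_sync",
        airbyte_conn_id="airbyte_default",        # Airflow connection to the Airbyte API
        connection_id="<airbyte-connection-id>",  # placeholder
        asynchronous=False,
    )

    # T: run the dbt project against the freshly loaded raw tables
    transform = BashOperator(
        task_id="dbt_run",
        bash_command="cd /opt/dbt && dbt run --profiles-dir .",
    )

    extract_load >> transform
```

Superset then reads the dbt-built models directly from the warehouse, so no extra orchestration step is strictly required for the viz layer.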
Ensuring data quality when moving data across databases is challenging due to the large volume and high frequency of changes in modern data stores.
Data quality has many symptoms but few causes.
Join this session to learn how to scale your data preparation efforts by bringing impactful insights to the business with safe, reliable self-service analytics.
We tried multiple approaches, each with pros and cons, and arrived at a scalable solution that works well with our external YAML-configured connection list file.
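To make the idea concrete, here is a hedged sketch of what such a YAML connection list might look like and how it could be loaded in Python. The field names (`name`, `source`, `destination`, `schedule`) are illustrative, not the actual schema discussed in the session.

```python
import yaml  # pip install pyyaml

# Hypothetical shape of the external connection list file; field names are illustrative.
CONNECTIONS_YAML = """
connections:
  - name: postgres_to_warehouse
    source: postgres_prod
    destination: snowflake_raw
    schedule: "0 * * * *"
  - name: stripe_to_warehouse
    source: stripe
    destination: snowflake_raw
    schedule: "30 2 * * *"
"""

def load_connections(raw: str) -> list[dict]:
    """Parse the YAML connection list into plain dicts for downstream tooling."""
    return yaml.safe_load(raw)["connections"]

if __name__ == "__main__":
    for conn in load_connections(CONNECTIONS_YAML):
        print(f"{conn['name']}: {conn['source']} -> {conn['destination']} ({conn['schedule']})")
```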
As data and analytics engineers, it’s easy to get caught up in the mechanics of the pipelines we build. How do we move data from one source to another? How do we track changes and dependencies? How do we do it all faster, more reliably, and with more automation?
Data integration is a crucial part of the data value chain; it is also where many anomalies can arise, causing downstream processes to break and trust to be eroded.
We look at how open source projects can synergize for the benefit of their users.
Efficiently applying changes to a pipeline requires running it in parallel with production to test the effect of each change. Most data engineers would agree that the best way to do this is far from a solved problem.
In this talk, we want to chat about the history of Trino, ingesting data with Airbyte, and show an example of these two technologies in action.
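As a preview of the kind of example shown, here is a hedged sketch of querying Airbyte-landed data with the Trino Python client. The catalog, schema, and table names (`hive`, `airbyte_raw`, `orders`) and the `created_at` column are hypothetical placeholders.

```python
import trino  # pip install trino

# Hypothetical catalog/schema for data that an Airbyte sync landed in the lake/warehouse.
conn = trino.dbapi.connect(
    host="localhost",
    port=8080,
    user="analyst",
    catalog="hive",
    schema="airbyte_raw",
)

cur = conn.cursor()
cur.execute(
    """
    SELECT date_trunc('day', created_at) AS day, count(*) AS orders
    FROM orders
    GROUP BY 1
    ORDER BY 1 DESC
    LIMIT 7
    """
)
for day, orders in cur.fetchall():
    print(day, orders)
```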