Understanding Data Lakehouse with Apache Iceberg, Apache Hudi & Delta Lake
There is a lot of buzz around the data lakehouse architecture today, which unifies two mainstream data architectures - data warehouses & data lakes - promising to do more with less. At the same time, all major data warehouse vendors have embraced open table formats, driven by customer demand for the flexibility & openness that an open format promises.
Three projects - Apache Iceberg, Apache Hudi, and Delta Lake - are now at the center of the attention and vendor chess moves in this space. These projects are pivotal in forging an open, adaptable foundation for your data, one that lets enterprises choose the compute best suited to each workload and so avoid the constraints of proprietary storage formats. However, the terms open table format & open data lakehouse are increasingly used interchangeably across these projects, which calls for clarification and a deeper understanding.
In this session, we will do a technical breakdown of the lakehouse architecture (with code) & examine what actually makes it open.
20 Mar, 10:10 am - 10:20 am PST