A Scalable, Open-source Data Platform
Alex Merced

This book is designed to help architects, engineers, and data leaders move beyond surface-level awareness of Apache Iceberg and into a confident, production-ready implementation. While Iceberg is widely discussed, practical guidance on designing an end-to-end lakehouse around it remains limited. This book addresses that gap by breaking down each architectural layer, outlining design choices, and providing both conceptual frameworks and hands-on exercises. Rather than prescribe a one-size-fits-all deployment, the book emphasizes adaptability. You’ll learn how to assess your needs, weigh tradeoffs, and select tools that align with your business and technical goals.
Who should read this book
This book is for data architects, platform engineers, and senior data professionals responsible for modernizing data infrastructure or designing new analytical platforms. You should be familiar with the general concepts of data lakes, warehouses, and processing tools such as Apache Spark or Flink. No prior experience with Apache Iceberg is required, but familiarity with cloud storage, distributed systems, and SQL will help you get the most out of the material.
How this book is organized: A road map
The book is divided into three parts:
Part 1 introduces the Apache Iceberg lakehouse concept, tracing its roots and showing how it improves upon traditional architectures. It concludes with a hands-on walkthrough that sets up a working Iceberg lakehouse on your local machine.
Part 2 walks through the major architectural layers of a lakehouse—storage, ingestion, catalog, federation, and consumption—guiding you through design decisions based on real-world needs.
Part 3 focuses on operations. It covers topics like maintenance, monitoring, and production practices that ensure your lakehouse remains reliable and scalable after launch.
How to use this book
You can read this book sequentially to follow the full architectural journey or skip to the parts most relevant to your current project. Each chapter is written to stand on its own, but cross-references help tie the material together. The hands-on sections provide working examples, while the architectural discussions are equally well suited to planning and design sessions. Whether you’re evaluating Iceberg for the first time or scaling an existing deployment, this book will help you move forward with clarity and purpose.
Table of Contents
Part 1 The value of the Apache Iceberg lakehouse
1 ■ The world of the data lakehouse
2 ■ Apache Iceberg and the lakehouse
3 ■ Hands-on with Apache Iceberg
Part 2 Designing your Iceberg architecture
4 ■ Preparing for your move to Apache Iceberg
5 ■ Selecting the storage layer
6 ■ Architecting the ingestion layer
7 ■ Implementing the catalog layer
8 ■ Designing the federation layer
9 ■ Understanding the consumption layer
Part 3 Operating your Apache Iceberg lakehouse
10 ■ Maintaining an Iceberg lakehouse
11 ■ Operationalizing Apache Iceberg
About the Author
Alex Merced is the Head of Developer Relations at Dremio, where he helps developers understand and adopt modern data lakehouse architectures. He focuses on open table formats, semantic layers, and cloud-native analytics platforms that enable scalable, AI-ready data systems. Alex runs DataLakehouseHub.com, a resource hub for engineers navigating the evolving lakehouse ecosystem.









