Enriching Apache Iceberg Data Lakehouses with an Open Source Catalog
Alex Merced, Andrew Madson, Tomer Shiran

#Apache
#Polaris
#Data
#Lakehouses
#Dremio
📘 Revolutionize your understanding of modern data management with Apache Polaris (incubating), the open source catalog designed for Apache Iceberg, the industry standard for data lakehouses. This comprehensive guide takes you on a journey through the intricacies of Apache Iceberg data lakehouses, highlighting the pivotal role of Iceberg catalogs.
Authors Alex Merced, Andrew Madson, and Tomer Shiran explore Apache Polaris's architecture and features in detail, equipping you with the knowledge needed to leverage its full potential. Data engineers, data architects, data scientists, and data analysts will learn how to seamlessly integrate Apache Polaris with popular data tools like Apache Spark, Snowflake, and Dremio to enhance data management capabilities, optimize workflows, and secure datasets.
🔹 Get a comprehensive introduction to Iceberg data lakehouses
🔹 Understand the role of catalogs in managing and efficiently querying data in Iceberg
🔹 Explore the unique architecture and powerful features of Apache Polaris
🔹 Deploy Apache Polaris locally or as a managed version through Snowflake and Dremio
🔹 Perform basic table operations in Apache Spark, Snowflake, and Dremio
Table of Contents
Part I. Data Lakehouses and Apache Iceberg Fundamentals
Chapter 1. Data Lakehouse and Apache Iceberg
Chapter 2. The Role of Apache Iceberg Catalogs
Part II. Apache Polaris
Chapter 3. The Apache Polaris Security Model
Chapter 4. External Catalogs
Chapter 5. Polaris REST API
Part III. Hands-on with Apache Polaris
Chapter 6. Working with Apache Polaris OSS
Chapter 7. Using Apache Polaris with Apache Spark
Chapter 8. Using Apache Polaris with Snowflake
Chapter 9. Using Apache Polaris with Dremio
Chapter 10. Advanced Polaris Configuration and CLI Management
Chapter 11. Looking to the Future of Apache Polaris
Index
Alex Merced is a senior technical evangelist at Dremio with experience as a developer and instructor. His professional journey includes roles at GenEd Systems, Crossfield Digital, CampusGuard, and General Assembly. He co-authored "Apache Iceberg: The Definitive Guide" published by O'Reilly and has spoken at notable events such as Data Day Texas and Data Council. Alex is passionate about technology, sharing his expertise through blogs, videos, podcasts like Datanation and Web Dev 101, and contributions to the JavaScript and Python communities with libraries like SencilloDB and CoquitoJS.
Andrew Madson is a data leader with 17 years of experience leading technical teams. Currently the Head of Evangelism and Education at Tobiko, the creators of SQLMesh and SQLGlot, Andrew has held senior leadership positions at institutions such as JP Morgan, LPL Financial, MassMutual, and Arizona State University. In addition to leading data teams, Andrew is a professor of data science and analytics at several universities, where he teaches graduate courses in machine learning, statistics, SQL, R, Python, Tableau, and Power BI.
Tomer Shiran is the Founder and Chief Product Officer of Dremio, an open data lakehouse platform that enables companies to run analytics in the cloud without the cost, complexity and lock-in of data warehouses. As the company's founding CEO, Tomer built a world-class organization that has raised over $400M and now serves hundreds of the world's largest enterprises, including 3 of the Fortune 5. Prior to Dremio, Tomer was the 4th employee and VP Product of MapR, a Big Data analytics pioneer. He also held numerous product management and engineering roles at Microsoft and IBM Research, founded several websites that have served millions of users and hundreds of thousands of paying customers, and is a successful author and presenter on a wide range of industry topics. He holds an MS in Computer Engineering from Carnegie Mellon University and a BS in Computer Science from Technion - Israel Institute of Technology.









