Python Polars: The Definitive Guide
Transforming, Analyzing, and Visualizing Data with a Fast and Expressive DataFrame API
Jeroen Janssens and Thijs Nieuwdorp

#Python
#Polars
#Definitive_Guide
#Transforming
#Visualizing_Data
#DataFrame_API
#hvPlot
#plotnine
#CSV
Unlock the power of Polars, a Python package for transforming, analyzing, and visualizing data. In this hands-on guide, Jeroen Janssens and Thijs Nieuwdorp walk you through every feature of Polars, showing you how to use it for real-world tasks like data wrangling, exploratory data analysis, building pipelines, and more.
Whether you're a seasoned data professional or new to data science, you'll quickly master Polars' expressive API and its underlying concepts. You don't need to have experience with pandas, but if you do, this book will help you make a seamless transition. The many practical examples and real-world datasets are available on GitHub, so you can easily follow along.
Who This Book Is For
This book is designed for anyone looking to leverage the power of Polars in Python to transform, analyze, and visualize data more efficiently and effectively. Whether you’re a seasoned data analyst, a data engineer, or even someone new to the world of data science, you’ll find valuable insights and practical examples that can be applied directly to real-world challenges. To illustrate the diverse ways in which Polars can benefit different users, let’s take a look at two key personas: Hanna, a seasoned data analyst, and Kosjo, an experienced data engineer.
Hanna: The Data Analyst
Hanna is a seasoned data analyst. She’s comfortable with Python and has a good grasp of pandas but occasionally struggles with its syntax and feels there must be a more elegant way to perform certain operations. Like many analysts, she regularly tackles exploratory data analysis (EDA) tasks that involve cleaning, transforming, and summarizing large datasets. However, she often finds herself battling with pandas’ sometimes complex and unintuitive syntax, especially when it comes to performing more advanced data manipulations or scaling her work to larger datasets.
For someone like Hanna, this book offers a streamlined, more intuitive alternative to pandas, with the added benefit of being able to handle data at a larger scale without sacrificing speed or readability. Polars provides a more Pythonic and performant way to perform the types of analyses Hanna does daily. By learning Polars, Hanna can simplify her workflow, write more elegant code, and unlock greater performance in her exploratory data analysis tasks.
Kosjo: The Data Engineer
Kosjo is an experienced data engineer, tasked with processing large volumes of data and building pipelines that support complex data workflows. They are highly skilled in Python and work with various technologies to ensure smooth data movement and processing. As part of their role, Kosjo is often responsible for optimizing processes to reduce infrastructure costs, especially when working with big data. This means reducing the time and resources required for heavy transformations without having to manage a distributed computing cluster.
Polars can help Kosjo achieve these goals. It is designed for speed and performance, especially when dealing with large datasets or intensive transformations. Its parallel execution model allows Kosjo to process data faster than with pandas, while its intuitive API keeps development simple. This book will guide Kosjo through leveraging Polars for complex data engineering tasks, enabling them to scale their workflows efficiently without the overhead of distributed systems or complex setup and configuration.
A Broader Audience
In addition to these two personas, this book is also for data scientists, software engineers, and anyone else working with Python who is looking to explore the capabilities of Polars. Whether you’re handling small to medium-sized datasets or need to process terabytes of data, Polars offers a unified, high-performance approach to working with data. If you’re looking for a faster, more elegant way to analyze and manipulate your data without compromising on readability, this book will serve as a valuable resource to enhance your data-handling skills.
In summary, whether you’re looking to improve your day-to-day data analysis or streamline your data engineering workflows, Python Polars: The Definitive Guide is designed to help you unlock the full potential of Polars and solve data challenges with speed and elegance.
Polars has become a rising star in the Python data ecosystem, showing what's possible in a next-generation data frame library. Jeroen and Thijs have written a timely and essential resource to help you take advantage of everything Polars has to offer.
— Wes McKinney, Creator of pandas, Principal Architect, Posit PBC
Jeroen and Thijs have done an excellent job, not only teaching you the ins and outs of Polars but also helping you unlearn habits from other tools like pandas. They really bring out the power of expressions, which are key to using Polars effectively, guiding you toward a more declarative, functional approach to data processing. As you work through this book, I'm sure you'll gain a deep understanding of Polars and discover fresh ways to approach data processing.
—Ritchie Vink, Creator of Polars (excerpt from the Foreword)
Polars has brought a ton of much-needed innovation to the data frame world with its much more streamlined API and efficient implementation. As a result, the capabilities of data analysis in Python are pushed to new heights. We also greatly enjoy Ritchie and team as a part of the Amsterdam data ecosystem.
I greatly respect Jeroen's commitment to teaching data science in an accessible way, whether it be on the command line or elsewhere. His and Thijs' book is a testament to this commitment and I recommend it to the data science community.
— Hannes Mühleisen, Co-Creator of DuckDB
Table of Contents
Part I. Begin
Chapter 1. Introducing Polars
Chapter 2. Getting Started
Chapter 3. Moving from pandas to Polars
Part II. Form
Chapter 4. Data Structures and Data Types
Chapter 5. Eager and Lazy APIs
Chapter 6. Reading and Writing Data
Part III. Express
Chapter 7. Beginning Expressions
Chapter 8. Continuing Expressions
Chapter 9. Combining Expressions
Part IV. Transform
Chapter 10. Selecting and Creating Columns
Chapter 11. Filtering and Sorting Rows
Chapter 12. Working with Textual, Temporal, and Nested Data Types
Chapter 13. Summarizing and Aggregating
Chapter 14. Joining and Concatenating
Chapter 15. Reshaping
Part V. Advance
Chapter 16. Visualizing Data
Chapter 17. Extending Polars
Chapter 18. Polars Internals
About the Author
Jeroen Janssens, PhD, is a Senior Developer Relations Engineer at Posit, PBC. His expertise lies in visualizing data, implementing machine learning models, and building solutions using Python, R, JavaScript, and Bash. He's passionate about open source and sharing knowledge. He's the author of Python Polars: The Definitive Guide (O'Reilly, 2025) and Data Science at the Command Line (O'Reilly, 2021). Jeroen holds a PhD in machine learning from Tilburg University and an MSc in artificial intelligence from Maastricht University. He lives with his wife and two kids in Rotterdam, the Netherlands.