Book title
Scaling Python with Dask

From Data Science to Machine Learning

Holden Karau, Mika Kimmins

Paperback: 226 pages
Publisher: O'Reilly
Edition: 1
Language: English
Year: 2023
ISBN: 978-1098119874
A3266
Print options:
  • Hardcover: 469,000 Toman
  • Softcover: 409,000 Toman
  • Clear plastic (Papco) cover with spiral binding: 419,000 Toman
Text quality: publisher's original
Trim size: B5
Page color: colored text and boxes

#Python

#Dask

#Data_Science

#Machine_Learning

#open_source

#PyData

#GPU

#Harvard

#NASA

Description

Modern systems contain multi-core CPUs and GPUs that have the potential for parallel computing. But many scientific Python tools were not designed to leverage this parallelism. With this short but thorough resource, data scientists and Python programmers will learn how the Dask open source library for parallel computing provides APIs that make it easy to parallelize PyData libraries including NumPy, pandas, and scikit-learn.


Authors Holden Karau and Mika Kimmins show you how to use Dask computations in local systems and then scale to the cloud for heavier workloads. This practical book explains why Dask is popular among industry experts and academics and is used by organizations that include Walmart, Capital One, Harvard Medical School, and NASA.


With this book, you'll learn:

  • What Dask is, where you can use it, and how it compares with other tools
  • How to use Dask for batch data parallel processing
  • Key distributed system concepts for working with Dask
  • Methods for using Dask with higher-level APIs and building blocks
  • How to work with integrated libraries such as scikit-learn, pandas, and PyTorch
  • How to use Dask with GPUs


Table of Contents

Chapter 1. What Is Dask?

Chapter 2. Getting Started with Dask

Chapter 3. How Dask Works: The Basics

Chapter 4. Dask DataFrame

Chapter 5. Dask's Collections

Chapter 6. Advanced Task Scheduling: Futures and Friends

Chapter 7. Adding Changeable/Mutable State with Dask Actors

Chapter 8. How to Evaluate Dask's Components and Libraries

Chapter 9. Migrating Existing Analytic Engineering

Chapter 10. Dask with GPUs and Other Special Resources

Chapter 11. Machine Learning with Dask

Chapter 12. Productionizing Dask: Notebooks, Deployment, Tuning, and Monitoring

Appendix A. Key System Concepts for Dask Users

Appendix B. Scalable DataFrames: A Comparison and Some History

Appendix C. Debugging Dask

Appendix D. Streaming with Streamz and Dask


We wrote this book for data scientists and data engineers familiar with Python and pandas who are looking to handle larger-scale problems than their current tooling allows. Current PySpark users will find that some of this material overlaps with their existing knowledge of PySpark, but we hope they still find it helpful, and not just for getting away from the Java Virtual Machine (JVM).

If you are not familiar with Python, some excellent O’Reilly titles include 'Learning Python' and 'Python for Data Analysis'. If you and your team are more frequent users of JVM languages (such as Java or Scala), while we are a bit biased, we’d encourage you to check out Apache Spark along with 'Learning Spark' and 'High Performance Spark'.

This book is primarily focused on data science and related tasks because, in our opinion, that is where Dask excels the most. If you have a more general problem that Dask does not seem to be quite the right fit for, we would (with a bit of bias again) encourage you to check out 'Scaling Python with Ray' (O’Reilly), which has less of a data science focus.


A Note on Responsibility

As the saying goes, with great power comes great responsibility. Dask and tools like it enable you to process more data and build more complex models. It’s essential not to get carried away with collecting data simply for the sake of it, and to stop to ask yourself if including a new field in your model might have some unintended real-world implications. You don’t have to search very hard to find stories of well-meaning engineers and data scientists accidentally building models or tools that had devastating impacts, such as increased auditing of minorities, gender-based discrimination, or subtler things like biases in word embeddings (a way to represent the meanings of words as vectors). Please use your newfound powers with such potential consequences in mind, for one never wants to end up in a textbook for the wrong reasons.


About the Authors

Holden Karau is a queer transgender Canadian, Apache Spark committer, Apache Software Foundation member, and an active open source contributor. As a software engineer, she's worked on a variety of distributed computing, search, and classification problems at Apple, Google, IBM, Alpine, Databricks, Foursquare, and Amazon. She graduated from the University of Waterloo with a bachelor of mathematics in computer science. Outside of software, she enjoys playing with fire, welding, riding scooters, eating poutine, and dancing.


Mika Kimmins is a data engineer, distributed systems researcher, and ML consultant. She has worked on a variety of NLP, language modeling, reinforcement learning, and large-scale ML pipelining problems as a Siri Data Engineer at Apple, as an academic, and in not-for-profit engineering capacities. She is currently earning an MS in Engineering Science and an MBA from Harvard, and holds a BS in Computer Science and Mathematics from the University of Toronto. As a Korean-Canadian-American trans woman, Mika is active in data-driven advocacy for queer healthcare access, advises undergraduate Computer Science students, and attempts to keep her volunteer EMT courses current. Her hobbies include figure skating, aerial arts, and sewing.

Similar books

  • C and Python Applications (Python): 426,000 Toman
  • Mastering Python Networking (Python): 975,000 Toman
  • Hadoop with Python (Python): 239,000 Toman
  • Deep Reinforcement Learning with Python (Reinforcement Learning): 594,000 Toman
  • Programming Python (Python): 2,431,000 Toman
  • Essentials of Excel VBA, Python, and R: Volume II (Python): 998,000 Toman
  • Violent Python (Python): 536,000 Toman
  • Python for Probability, Statistics, and Machine Learning (Python): 897,000 Toman
  • Algorithmic Short Selling with Python (Python): 575,000 Toman
  • Quick Python 3 (Python): 302,000 Toman