Book title
Scaling Machine Learning with Spark

Distributed ML with MLlib, TensorFlow, and PyTorch

Adi Polak

Paperback: 294 pages
Publisher: O'Reilly
Edition: 1
Language: English
Year: 2023
ISBN: 9781098106829
A2786
Print options:
  • Hardcover: 544,000 Toman
  • Softcover: 484,000 Toman
  • Spiral-bound with clear plastic cover: 494,000 Toman
Text quality: publisher's original
Trim size: B5
Page color: colored text and boxes

#Spark

#Machine_Learning

#MLflow

#TensorFlow

#PyTorch

#MLlib

#deep_learning

Description

Learn how to build end-to-end scalable machine learning solutions with Apache Spark. With this practical guide, author Adi Polak introduces data and ML practitioners to creative solutions that supersede today's traditional methods. You'll learn a more holistic approach that takes you beyond specific requirements and organizational goals, allowing data and ML practitioners to collaborate and understand each other better.


Scaling Machine Learning with Spark examines several technologies for building end-to-end distributed ML workflows based on the Apache Spark ecosystem with Spark MLlib, MLflow, TensorFlow, and PyTorch. If you're a data scientist who works with machine learning, this book shows you when and why to use each technology.


You will:

  • Explore machine learning, including distributed computing concepts and terminology
  • Manage the ML lifecycle with MLflow
  • Ingest data and perform basic preprocessing with Spark
  • Explore feature engineering, and use Spark to extract features
  • Train a model with MLlib and build a pipeline to reproduce it
  • Build a data system to combine the power of Spark with deep learning
  • Get a step-by-step example of working with distributed TensorFlow
  • Use PyTorch to scale machine learning, and explore its internal architecture



Table of Contents

Chapter 1. Distributed Machine Learning Terminology and Concepts

Chapter 2. Introduction to Spark and PySpark

Chapter 3. Managing the Machine Learning Experiment Lifecycle with MLflow

Chapter 4. Data Ingestion, Preprocessing, and Descriptive Statistics

Chapter 5. Feature Engineering

Chapter 6. Training Models with Spark MLlib

Chapter 7. Bridging Spark and Deep Learning Frameworks

Chapter 8. TensorFlow Distributed Machine Learning Approach

Chapter 9. PyTorch Distributed Machine Learning Approach

Chapter 10. Deployment Patterns for Machine Learning Models


From the Preface

Welcome to Scaling Machine Learning with Spark: Distributed ML with MLlib, TensorFlow, and PyTorch. This book aims to guide you in your journey as you learn more about machine learning (ML) systems. Apache Spark is currently the most popular framework for large-scale data processing. It has numerous APIs implemented in Python, Java, and Scala and is used by many powerhouse companies, including Netflix, Microsoft, and Apple. PyTorch and TensorFlow are among the most popular frameworks for machine learning. Combining these tools, which are already in use in many organizations today, allows you to take full advantage of their strengths.


Before we get started, though, perhaps you are wondering why I decided to write this book. Good question. There are two reasons. The first is to support the machine learning ecosystem and community by sharing the knowledge, experience, and expertise I have accumulated over the last decade working as a machine learning algorithm researcher, designing and implementing algorithms to run on large-scale data. I have spent most of my career working as a data infrastructure engineer, building infrastructure for large-scale analytics with all sorts of formatting, types, schemas, etc., and integrating knowledge collected from customers, community members, and colleagues who have shared their experience while brainstorming and developing solutions. Our industry can use such knowledge to propel itself forward at a faster rate, by leveraging the expertise of others. While not all of this book’s content will be applicable to everyone, much of it will open up new approaches for a wide array of practitioners.


This brings me to my second reason for writing this book: I want to provide a holistic approach to building end-to-end scalable machine learning solutions that extends beyond the traditional approach. Today, many solutions are customized to the specific requirements of the organization and specific business goals. This will most likely continue to be the industry norm for many years to come. In this book, I aim to challenge the status quo and inspire more creative solutions while explaining the pros and cons of multiple approaches and tools, enabling you to leverage whichever tools are used in your organization and get the best of all worlds. My overall goal is to make it simpler for data and machine learning practitioners to collaborate and understand each other better.


https://www.amazon.com/Scaling-Machine-Learning-Spark-Distributed/dp/1098106822/ref=sr_1_1?keywords=Scaling+Machine+Learning+with+Spark&qid=1684867142&s=books&sr=1-1#:~:text=Who%20Should%20Read,interesting%20and%20accessible.


About the Author

Adi Polak is an open source technologist who believes in communities and education, and their ability to positively impact the world around us. She is passionate about building a better world through open collaboration and technological innovation. As a seasoned engineer and Vice President of Developer Experience at Treeverse, Adi shapes the future of data and ML technologies for hands-on builders. She serves on multiple program committees and acts as an advisor for conferences like Data & AI Summit by Databricks, Current by Confluent, and Scale by the Bay, among others. Adi previously served as a senior manager for Azure at Microsoft, where she helped build advanced analytics systems and modern data architectures. Adi gained experience in machine learning by conducting research for IBM, Deutsche Telekom, and other Fortune 500 companies.

Similar books (Machine Learning)

  • Genetic Algorithms and Machine Learning for Programmers: 418,000 Toman
  • Practical Fairness: 541,000 Toman
  • The Hundred-Page Machine Learning Book: 336,000 Toman
  • Kubeflow for Machine Learning: 451,000 Toman
  • The StatQuest Illustrated Guide to Machine Learning: 558,000 Toman
  • MLOps Engineering at Scale: 539,000 Toman
  • Machine Learning For Dummies: 768,000 Toman
  • AI-Assisted Programming for Web and Machine Learning: 984,000 Toman
  • Machine Learning in the Oil and Gas Industry: 507,000 Toman
  • Machine Learning Systems: 407,000 Toman
© 2022-2024 Skybook Digital Printing.