Book Title
97 Things Every Data Engineer Should Know

Collective Wisdom from the Experts

Tobias Macey

Paperback: 263 pages
Publisher: O'Reilly
Edition: 1
Language: English
Year: 2021
ISBN: 9781492062417
Print options:
Hardcover: 510,000 tomans
Softcover: 450,000 tomans
Plastic cover with spiral binding: 460,000 tomans
Text quality: Publisher's original
Trim size: B5
Page color: Black and white
Support available on holidays!
Nationwide shipping

#97_Things

#Data

#Engineering

Description

Take advantage of today's sky-high demand for data engineers. With this in-depth book, current and aspiring engineers will learn powerful real-world best practices for managing data big and small. Contributors from notable companies including Twitter, Google, Stitch Fix, Microsoft, Capital One, and LinkedIn share their experiences and lessons learned for overcoming a variety of specific and often nagging challenges.


Edited by Tobias Macey, host of the popular Data Engineering Podcast, this book presents 97 concise and useful tips for cleaning, prepping, wrangling, storing, processing, and ingesting data. Data engineers, data architects, data team managers, data scientists, machine learning engineers, and software engineers will greatly benefit from the wisdom and experience of their peers.


Topics include:

  • The Importance of Data Lineage - Julien Le Dem
  • Data Security for Data Engineers - Katharine Jarmul
  • The Two Types of Data Engineering and Data Engineers - Jesse Anderson
  • Six Dimensions for Picking an Analytical Data Warehouse - Gleb Mezhanskiy
  • The End of ETL as We Know It - Paul Singman
  • Building a Career as a Data Engineer - Vijay Kiran
  • Modern Metadata for the Modern Data Stack - Prukalpa Sankar
  • Your Data Tests Failed! Now What? - Sam Bail


Table of Contents

Chapter 1. A (Book) Case for Eventual Consistency

Chapter 2. A/B and How to Be

Chapter 3. About the Storage Layer

Chapter 4. Analytics as the Secret Glue for Microservice Architectures

Chapter 5. Automate Your Infrastructure

Chapter 6. Automate Your Pipeline Tests

Chapter 7. Be Intentional About the Batching Model in Your Data Pipelines

Chapter 8. Beware of Silver-Bullet Syndrome

Chapter 9. Building a Career as a Data Engineer

Chapter 10. Business Dashboards for Data Pipelines

Chapter 11. Caution: Data Science Projects Can Turn into the Emperor's New Clothes

Chapter 12. Change Data Capture

Chapter 13. Column Names as Contracts

Chapter 14. Consensual, Privacy-Aware Data Collection

Chapter 15. Cultivate Good Working Relationships with Data Consumers

Chapter 16. Data Engineering != Spark

Chapter 17. Data Engineering for Autonomy and Rapid Innovation

Chapter 18. Data Engineering from a Data Scientist's Perspective

Chapter 19. Data Pipeline Design Patterns for Reusability and Extensibility

Chapter 20. Data Quality for Data Engineers

Chapter 21. Data Security for Data Engineers

Chapter 22. Data Validation Is More Than Summary Statistics

Chapter 23. Data Warehouses Are the Past, Present, and Future

Chapter 24. Defining and Managing Messages in Log-Centric Architectures

Chapter 25. Demystify the Source and Illuminate the Data Pipeline

Chapter 26. Develop Communities, Not Just Code

Chapter 27. Effective Data Engineering in the Cloud World

Chapter 28. Embrace the Data Lake Architecture

Chapter 29. Embracing Data Silos

Chapter 30. Engineering Reproducible Data Science Projects

Chapter 31. Five Best Practices for Stable Data Processing

Chapter 32. Focus on Maintainability and Break Up Those ETL Tasks

Chapter 33. Friends Don't Let Friends Do Dual-Writes

Chapter 34. Fundamental Knowledge

Chapter 35. Getting the "Structured" Back into SQL

Chapter 36. Give Data Products a Frontend with Latent Documentation

Chapter 37. How Data Pipelines Evolve

Chapter 38. How to Build Your Data Platform like a Product

Chapter 39. How to Prevent a Data Mutiny

Chapter 40. Know the Value per Byte of Your Data

Chapter 41. Know Your Latencies

Chapter 42. Learn to Use a NoSQL Database, but Not like an RDBMS

Chapter 43. Let the Robots Enforce the Rules

Chapter 44. Listen to Your Users - but Not Too Much

Chapter 45. Low-Cost Sensors and the Quality of Data

Chapter 46. Maintain Your Mechanical Sympathy

Chapter 47. Metadata ≥ Data

Chapter 48. Metadata Services as a Core Component of the Data Platform

Chapter 49. Mind the Gap: Your Data Lake Provides No ACID Guarantees

Chapter 50. Modern Metadata for the Modern Data Stack

Chapter 51. Most Data Problems Are Not Big Data Problems

Chapter 52. Moving from Software Engineering to Data Engineering

Chapter 53. Observability for Data Engineers

Chapter 54. Perfect Is the Enemy of Good

Chapter 55. Pipe Dreams

Chapter 56. Preventing the Data Lake Abyss

Chapter 57. Prioritizing User Experience in Messaging Systems

Chapter 58. Privacy Is Your Problem

Chapter 59. QA and All Its Sexiness

Chapter 60. Seven Things Data Engineers Need to Watch Out for in ML Projects

Chapter 61. Six Dimensions for Picking an Analytical Data Warehouse

Chapter 62. Small Files in a Big Data World

Chapter 63. Streaming Is Different from Batch

Chapter 64. Tardy Data

Chapter 65. Tech Should Take a Back Seat for Data Project Success

Chapter 66. Ten Must-Ask Questions for Data-Engineering Projects

Chapter 67. The Data Pipeline Is Not About Speed

Chapter 68. The Dos and Don'ts of Data Engineering

Chapter 69. The End of ETL as We Know It

Chapter 70. The Haiku Approach to Writing Software

Chapter 71. The Hidden Cost of Data Input/Output

Chapter 72. The Holy War Between Proprietary and Open Source Is a Lie

Chapter 73. The Implications of the CAP Theorem

Chapter 74. The Importance of Data Lineage

Chapter 75. The Many Meanings of Missingness

Chapter 76. The Six Words That Will Destroy Your Career

Chapter 77. The Three Invaluable Benefits of Open Source for Testing Data Quality

Chapter 78. The Three Rs of Data Engineering

Chapter 79. The Two Types of Data Engineering and Data Engineers

Chapter 80. The Yin and Yang of Big Data Scalability

Chapter 81. Threading and Concurrency in Data Processing

Chapter 82. Three Important Distributed Programming Concepts

Chapter 83. Time (Semantics) Won't Wait

Chapter 84. Tools Don't Matter, Patterns and Practices Do

Chapter 85. Total Opportunity Cost of Ownership

Chapter 86. Understanding the Ways Different Data Domains Solve Problems

Chapter 87. What Is a Data Engineer? Clue: We're Data Science Enablers

Chapter 88. What Is a Data Mesh, and How Not to Mesh It Up

Chapter 89. What Is Big Data?

Chapter 90. What to Do When You Don't Get Any Credit

Chapter 91. When Our Data Science Team Didn't Produce Value

Chapter 92. When to Avoid the Naive Approach

Chapter 93. When to Be Cautious About Sharing Data

Chapter 94. When to Talk and When to Listen

Chapter 95. Why Data Science Teams Need Generalists, Not Specialists

Chapter 96. With Great Data Comes Great Responsibility

Chapter 97. Your Data Tests Failed! Now What?


About the Author

Tobias Macey hosts the Data Engineering Podcast and Podcast.__init__, where he discusses the tools, topics, and people that comprise the data engineering and Python communities, respectively. His experience across the domains of infrastructure, software, cloud, and data engineering allows him to ask informed questions and bring useful context to the discussions. The ongoing focus of his career is to help educate people by designing and building platforms that power online learning, consulting with companies and investors to understand the possibilities of emerging technologies, and leading teams of engineers to help them grow professionally.

Similar Books

  • Data Ingestion with Python Cookbook (Python) - 699,000 tomans
  • Aerospike: Up and Running (Data) - 402,000 tomans
  • Apache Airflow Best Practices (Data) - 367,000 tomans
  • Data Visualization with Microsoft Power BI (Data) - 618,000 tomans
  • The Self-Service Data Roadmap (Data) - 476,000 tomans
  • Pro Data Mashup for Power BI (Data) - 707,000 tomans
  • Analyzing Data with Microsoft Power BI and Power Pivot for Excel (Data) - 485,000 tomans
  • Data Centre Essentials (Data) - 427,000 tomans
  • Data-Driven Modeling (Data) - 489,000 tomans
  • Azure Data Engineering Cookbook (Azure) - 989,000 tomans