Human Supervision from Annotation to Data Science
Anthony Sarkis

#AI
#ML
#Machine_Learning
#Data_Science
Your training data has as much to do with the success of your data project as the algorithms themselves because most failures in AI systems relate to training data. But while training data is the foundation for successful AI and machine learning, there are few comprehensive resources to help you ace the process.
In this hands-on guide, author Anthony Sarkis, lead engineer for the Diffgram AI training data software, shows technical professionals, managers, and subject matter experts how to work with and scale training data, while illuminating the human side of supervising machines. Engineering leaders, data engineers, and data science professionals alike will gain a solid understanding of the concepts, tools, and processes they need to succeed with training data.
Table of Contents
Chapter 1. Training Data Introduction
Chapter 2. Getting Up and Running
Chapter 3. Schema
Chapter 4. Data Engineering
Chapter 5. Workflow
Chapter 6. Theories, Concepts, and Maintenance
Chapter 7. AI Transformation and Use Cases
Chapter 8. Automation
Chapter 9. Case Studies and Stories
With this book, you'll learn how to:
More About This Book
Data is all around us—videos, images, text, documents, as well as geospatial, multi-dimensional data, and more. Yet, in its raw form, this data is of little use to supervised machine learning (ML) and artificial intelligence (AI). How do we make use of this data? How do we record our intelligence so it can be reproduced through ML and AI? The answer is the art of training data—the discipline of making raw data useful.
In this book you will learn:
Who Should Read This Book?
This book is a foundational overview of training data. It’s ideally suited to those who are totally new, or just getting started, with training data.
For intermediate practitioners, the later chapters provide unique value and insights that can’t be found anywhere else; in a nutshell, insider knowledge. I will highlight specific areas of interest for subject matter experts, workflow managers, directors of training data, data engineers, and data scientists.
Computer science (CS) knowledge is not required. Knowing CS, machine learning, or data science will make more sections of the book accessible. I strive to make this book maximally accessible to data annotators, including subject matter experts, because they play a key part in training data, including supervising the system.
Anthony Sarkis is the lead engineer on Diffgram Training Data Management software and founder of Diffgram Inc. Prior to that he was a Software Engineer at Skidmore, Owings & Merrill and co-founded DriveCarma.ca.









