Approaches to Responsible AI
Patrick Hall, James Curtis, and Parul Pandey

#Machine_Learning
#AI
#Risk_Management
#XGBoost
The past decade has witnessed the broad adoption of artificial intelligence and machine learning (AI/ML) technologies. However, a lack of oversight in their widespread implementation has resulted in incidents and harmful outcomes that proper risk management could have prevented. Before we can realize AI/ML's true benefits, practitioners must understand how to mitigate its risks.
This book describes approaches to responsible AI—a holistic framework for improving AI/ML technology, business processes, and cultural competencies that builds on best practices in risk management, cybersecurity, data privacy, and applied social science. Authors Patrick Hall, James Curtis, and Parul Pandey created this guide for data scientists who want to improve real-world AI/ML system outcomes for organizations, consumers, and the public.
Table of Contents
Part I. Theories and Practical Applications of AI Risk Management
Chapter 1. Contemporary Machine Learning Risk Management
Chapter 2. Interpretable and Explainable Machine Learning
Chapter 3. Debugging Machine Learning Systems for Safety and Performance
Chapter 4. Managing Bias in Machine Learning
Chapter 5. Security for Machine Learning
Part II. Putting AI Risk Management into Action
Chapter 6. Explainable Boosting Machines and Explaining XGBoost
Chapter 7. Explaining a PyTorch Image Classifier
Chapter 8. Selecting and Debugging XGBoost Models
Chapter 9. Debugging a PyTorch Image Classifier
Chapter 10. Testing and Remediating Bias with XGBoost
Chapter 11. Red-Teaming XGBoost
Part III. Conclusion
Chapter 12. How to Succeed in High-Risk Machine Learning
Today, machine learning (ML) is the most commercially viable subdiscipline of artificial intelligence (AI). ML systems are used to make high-risk decisions in employment, bail, parole, lending, security, and many other high-impact applications throughout the world's economies and governments. In a corporate setting, ML systems are used in all parts of an organization, from consumer-facing products to employee assessments to back-office automation and more. Indeed, the past decade has brought even wider adoption of ML technologies. But it has also proven that ML presents risks to its operators, its consumers, and even the general public.
Like all technologies, ML can fail—whether by unintentional misuse or intentional abuse. As of 2023, there have been thousands of public reports of algorithmic discrimination, data privacy violations, training data security breaches, and other harmful incidents. Such risks must be mitigated before organizations, and the public, can realize the true benefits of this exciting technology. Addressing ML’s risks requires action from practitioners. While nascent standards, to which this book aims to adhere, have begun to take shape, the practice of ML still lacks broadly accepted professional licensing or best practices. That means it’s largely up to individual practitioners to hold themselves accountable for the good and bad outcomes of their technology when it’s deployed into the world. Machine Learning for High-Risk Applications will arm practitioners with a solid understanding of model risk management processes and new ways to use common Python tools for training explainable models and debugging them for reliability, safety, bias management, security, and privacy issues.
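To give a sense of the hands-on style of the book's later chapters, here is a minimal sketch of training a glass-box model in Python with the open source interpret package, one common tool for explainable models. The synthetic data and parameters below are illustrative assumptions of ours, not an excerpt from the book:

```python
from interpret.glassbox import ExplainableBoostingClassifier
from sklearn.datasets import make_classification

# Synthetic stand-in data; a real application would use domain data.
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)

# An explainable boosting machine (EBM) is an additive, glass-box model:
# each feature's learned shape function can be inspected directly.
ebm = ExplainableBoostingClassifier(random_state=0)
ebm.fit(X, y)

# Global explanation object; in a notebook, interpret's show() renders it.
global_explanation = ebm.explain_global()
print(global_explanation.data())  # overall feature importances
```

Because an EBM's predictions decompose feature by feature, the debugging and bias-testing workflows the book describes become far more tractable than they are for opaque models.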
Who Should Read This Book
This is a mostly technical book for early-to-middle career ML engineers and data scientists who want to learn about the responsible use of ML or ML risk management. The code examples are written in Python. That said, this book probably isn’t for every data scientist and engineer out there coding in Python. This book is for you if you want to learn some model governance basics and update your workflow to accommodate basic risk controls. This book is for you if your work needs to comply with certain nondiscrimination, transparency, privacy, or security standards. (Although we can’t assure compliance or provide legal advice!) This book is for you if you want to train explainable models, and learn to edit and debug them. Finally, this book is for you if you’re concerned that your work in ML may be leading to unintended consequences relating to sociological biases, data privacy violations, security vulnerabilities, or other known problems caused by automated decision making writ large—and you want to do something about it.
Of course, this book may be of interest to others. If you're coming to ML from a field like physics, econometrics, or psychometrics, this book can help you learn how to blend newer ML techniques with established domain expertise and notions of validity or causality. This book may give regulators and policy professionals insight into the current state of ML technologies that may be used in efforts to comply with laws, regulations, or standards. Technical risk executives and risk managers may find this book helpful as an updated overview of newer ML approaches suited to high-stakes applications. And expert data scientists and ML engineers may find this book educational too, though they may also find that it challenges many established data science practices.
What Readers Will Learn
Readers of this book will be exposed to traditional model risk management and learn how to blend it with computer security best practices, such as incident response, bug bounties, and red-teaming, to apply battle-tested risk controls to ML workflows and systems. This book will introduce a number of older and newer explainable models and explanation techniques that make ML systems more transparent. Once we've set up a solid foundation of highly transparent models, we'll dig into testing models for safety and reliability. That's a lot easier when we can see how our model works! We'll go way beyond quality measurements in holdout data to explore how to apply well-known diagnostic techniques, like residual analysis, sensitivity analysis, and benchmarking, to new types of ML models. We'll then progress to structuring models for bias management, testing for bias, and remediating bias from organizational and technical perspectives. Finally, we'll discuss security for ML pipelines and APIs.
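To make those diagnostics concrete, here is a minimal sensitivity- and residual-analysis sketch for an XGBoost classifier. The synthetic data and the one-standard-deviation shift of a single feature are illustrative assumptions of ours, not the book's own case study:

```python
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; real applications would use domain data.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = xgb.XGBClassifier(n_estimators=100, max_depth=3, random_state=0)
model.fit(X_train, y_train)

# Sensitivity analysis: perturb one feature and watch the predictions.
baseline = model.predict_proba(X_test)[:, 1]
X_shifted = X_test.copy()
X_shifted[:, 0] += X_test[:, 0].std()  # shift feature 0 by one std. dev.
shifted = model.predict_proba(X_shifted)[:, 1]
print("Mean absolute prediction shift:", np.abs(shifted - baseline).mean())

# Residual analysis: find the rows where the model is most wrong.
residuals = y_test - baseline
worst_rows = np.argsort(-np.abs(residuals))[:10]
print("Rows with largest residuals:", worst_rows)
```

The same pattern, systematically perturbing inputs and examining where errors concentrate, extends to the stress tests and debugging exercises in Part II.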
Readers should also be aware that this first edition focuses on more established ML methods for estimation and decision making. We do not address unsupervised learning, search, recommendation systems, reinforcement learning, or generative AI in great depth.
We do hope to return to these topics in the future, and we acknowledge that they are affecting billions of people today, positively and negatively. We also note that, with a little creativity and elbow grease, many of the techniques, risk mitigants, and risk management frameworks in this book can and should be applied to unsupervised models, search, recommendation systems, and generative AI.
"Machine Learning for High-Risk Applications" is a highly needed book responding to the growing demand for in-depth analysis of predictive models. The book is very practical and gives explicit advice on how to look at different aspects of models, such as model debugging, bias, transparency, and explainability analysis.
— Przemysław Biecek, Ph.D.; Professor, author, GPAI.AI group expert, and leader of the MI2.AI RedTeam
Packed with a winning combination of cutting-edge theory and real-world expertise, this book is a game-changer for anyone grappling with the complexities of AI interpretability, explainability, and security. With expert guidance on managing bias and much more, it's the ultimate guide to mastering the buzzword bonanza of the AI world.
— Mateusz Dymczyk, Software Engineer, Machine Learning at Meta
Saying this book is timely is an understatement. People who do machine learning need a text like this to help them consider all the possible biases and repercussions that arise from the models they create. The best part is that Patrick, James, and Parul do a wonderful job of making this book readable and digestible.
— Aric LaBarr, Ph.D.; Associate Professor of Analytics
This book is a comprehensive review of both social and technical approaches to high-risk AI applications and provides practitioners with useful techniques to bridge their day-to-day work with core concepts in Responsible AI.
— Triveni Gandhi, Ph.D.; Responsible AI Lead, Dataiku
Machine learning applications need to account for fairness, accountability, transparency, and ethics in every industry to be successful. "Machine Learning for High-Risk Applications" lays the foundation for such topics and gives valuable insights that can be applied to various use cases.
— Navdeep Gill, Engineering Manager, H2O.ai
With the ever-growing applications of AI affecting every facet of our lives, it is important to ensure that AI applications, especially the ones that are safety-critical, are developed responsibly. Patrick Hall and team have done a fantastic job in articulating the key aspects and issues in developing safety-critical applications in this book in a pragmatic way.
— Sri Krishnamurthy, CFA, CAP; CEO, QuantUniversity
Machine learning models are very complex in nature, and their development is fraught with pitfalls. Mistakes in this field can cost reputations and millions or even billions of dollars. This book contains must-have knowledge for any machine learning practitioner who wants to design, develop and deploy robust machine learning models that avoid failing like so many other ML endeavors over the past years.
— Szilard Pafka, Ph.D.; Chief Scientist, Epoch
This is an extremely timely book. Practitioners of data science and AI need to seriously consider the real-world impact and consequences of models. The book motivates this and helps them to do so.
— Jorge Silva, Ph.D.; Director of AI/Machine Learning Server, SAS
A refreshingly thoughtful and practical guide to the responsible use of machine learning. This book has the potential to prevent AI incidents and harms before they happen.
— Harsh Singhal, Ph.D.; Senior AI Solution Director, Financial Services, C3.ai
Unlocking the full potential of machine learning and AI goes beyond the mere accuracy of models. With technology advancing at an unprecedented pace and regulations struggling to keep up, this timely and comprehensive guide serves as an indispensable resource for practitioners.
— Ben Steiner, Columbia University
Responsible AI, explained simply.
— Hariom Tatsat, CQF, FRM; VP Barclays Investment Bank
The authors write from a position of both knowledge and experience, providing just the right mix of baseline education in technology and common pitfalls, coverage of regulatory and societal issues, relevant and relatable case studies, and practical guidance throughout.
— Brett Wujek, Ph.D.; Head of AI Product Management, SAS
Patrick Hall is principal scientist at BNH.AI, where he advises Fortune 500 companies and cutting-edge startups on AI risk and conducts research in support of NIST's AI Risk Management Framework. He also serves as visiting faculty in the Department of Decision Sciences at The George Washington University School of Business, teaching classes in data ethics, business analytics, and machine learning.
Before cofounding BNH, Patrick led H2O.ai's efforts in responsible AI, resulting in one of the world's first commercial applications for explainability and bias mitigation in machine learning. He also held global customer-facing roles and R&D research roles at SAS Institute. Patrick studied computational chemistry at the University of Illinois before graduating from the Institute for Advanced Analytics at North Carolina State University.
Patrick has been invited to speak on topics related to explainable AI at the National Academies of Sciences, Engineering, and Medicine, ACM SIGKDD, and the Joint Statistical Meetings. He has contributed written pieces to outlets like McKinsey.com, O'Reilly Radar, and Thomson Reuters Regulatory Intelligence, and his technical work has been profiled in Fortune, Wired, InfoWorld, TechCrunch, and others.
James Curtis is a quantitative researcher at Solea Energy, where he is focused on using statistical forecasting to further the decarbonization of the US power grid. He previously served as a consultant for financial services organizations, insurers, regulators, and health care providers to help build more equitable AI/ML models. James holds an MS in Mathematics from the Colorado School of Mines.
Parul Pandey has a background in electrical engineering and currently works as a principal data scientist at H2O.ai. Prior to this, she was a machine learning engineer at Weights & Biases. She is also a Kaggle Grandmaster in the Notebooks category and was one of LinkedIn's Top Voices in Software Development in 2019. Parul has written multiple articles on data science and software development for various publications, and she mentors, speaks, and delivers workshops on topics related to responsible AI.