Achieving Fair and Secure Data Models
Aileen Nielsen

Fairness is becoming a paramount consideration for data scientists. Mounting evidence indicates that the widespread deployment of machine learning and AI in business and government is reproducing the same biases we're trying to fight in the real world. But what does fairness mean when it comes to code? This practical book covers basic concerns related to data security and privacy to help data and AI professionals use code that's fair and free of bias.
Many realistic best practices are emerging at all steps along the data pipeline today, from data selection and preprocessing to closed model audits. Author Aileen Nielsen guides you through technical, legal, and ethical aspects of making code fair and secure, while highlighting up-to-date academic research and ongoing legal developments related to fairness and algorithms.
Welcome to Practical Fairness. I wrote this book because data scientists and machine learning engineers are increasingly aware of the fairness implications of their work but are not adequately empowered to do anything about their concerns. Academic research on mathematical solutions to fairness concerns is flourishing, and myriad open source options are available thanks to both academic researchers and technology companies sharing resources. However, delving into the topic in a practical and concrete way remains difficult for the beginner, and best practices have not yet emerged in most industries to address even the most basic concerns. This book aims to be an accessible overview for beginners in the field, with actionable advice on fairness.
Goals of This Book
This book will help practicing data scientists and technologists get their feet wet in the world of fairness. The goal is that after reading this book, you can actively pursue fairness in your own work. Fairness doesn’t have a one-size-fits-all solution, but you should be able to:
In this book I use Python examples exclusively and focus on the easiest interfaces available in open source packages for implementing relevant fairness methods. I chose this approach because Python benefits from a large body of open source work in the fairness domain. Unfortunately, this meant ignoring a good deal of work in other languages, including Java, R, and MATLAB. I also favored code bases with organizational sponsors and a broader range of tools over smaller code bases or those maintained by just a few individuals, which meant omitting some very interesting and high-quality work. The main goal of this book is to help you get conceptually organized; the packages highlighted here are just one set of tools, not necessarily the best or only ones. The fairness toolbox continues to grow rapidly, and there is every reason to expect more tools and code bases to develop over time. The selections in this book represent just one snapshot of convenient APIs.
Fairness in machine learning and in the technology sector remains an active struggle, an ongoing social concern, and an interesting engineering problem. We need legal, economic, and social solutions as well as technical ones. Toward that end, 50% of the royalties earned on this book will be donated to the American Civil Liberties Union, an organization that has relentlessly pursued fairness for a hundred years. The ACLU is actively working to secure fundamental rights to privacy and fairness in the era of algorithms. Readers should be aware that the ACLU played no part in the writing of this book, has not reviewed it, and has not endorsed it. Nevertheless, my hope is that this book will reinforce and support the ACLU’s important work in more ways than one.
The other 50% of royalties earned on this book will be donated to Mercy for Animals, an organization dedicated to radically expanding our definitions of fairness to include the well-being of animals and Earth. Mercy for Animals takes a practical approach to addressing devastating cruelty and unfairness practiced in the fundamental industries that bring food to our table. Thus, my hope is that this book’s royalties can contribute to our society’s ongoing development of a fairer and more inclusive outlook.
Aileen Nielsen is a software engineer who has analyzed data in a variety of settings from a physics laboratory to a political campaign to a healthcare startup. She also has a law degree and splits her time between a deep learning startup and research as a Fellow in Law and Technology at ETH Zurich. She has given talks around the world on fairness issues in data and modeling.









