
#Chaos
#Chaos_Engineering
#open_source
#software_developers
#SRE
#DevOps
Most companies work hard to avoid costly failures, but in complex systems a better approach is to embrace and learn from them. Through chaos engineering, you can proactively hunt for evidence of system weaknesses before they trigger a crisis. This practical book shows software developers and system administrators how to plan and run successful chaos engineering experiments.
System weaknesses go beyond your infrastructure, platforms, and applications to include policies, practices, playbooks, and people. Author Russ Miles explains why, when, and how to test systems, processes, and team responses using simulated failures on Game Days. You’ll also learn how to work toward continuous chaos through automation with features you can share across your team and organization.
Audience
This book is for people who are in some way responsible for their code in production. That could mean developers, operations staff, DevOps engineers, and others. When I say they “are in some way responsible,” I mean that they take responsibility for the availability, stability, and overall robustness of their system as it runs, and may even be part of the group assembled when there is a system outage.
Perhaps you’re a site reliability engineer (SRE) looking to improve the stability of the systems you are responsible for, or you’re working on a team practicing DevOps where everyone owns their code in production.
Whatever your level of responsibility, if you care about how your code runs in production and about the bigger picture of how well production is running for your organization, this book aims to help you meet those challenges.
About the Author
Russ Miles has worked as a chaos engineer at various companies, both startups and enterprises, for the past 3 years. He is part of the Chaos Collective, an expert group founded by Casey Rosenthal that runs 1-day workshops for companies looking to learn about chaos engineering and begin establishing their own in-house chaos engineering capability. Russ has taught technical topics and offered consultancy worldwide for the past 15 years. His current courses include a popular public 3-day course on chaos engineering, most recently run in London. He also speaks internationally. He founded and continues to build a community around the free and open source Chaos Toolkit and Hub projects.









