Colby T. Ford

#Genomics
#Azure
#Cloud
#Microsoft
#bioinformatics
This practical guide bridges the gap between general cloud computing architecture in Microsoft Azure and scientific computing for bioinformatics and genomics. You'll get a solid understanding of the architecture patterns and services that are offered in Azure and how they might be used in your bioinformatics practice. You'll get code examples that you can reuse for your specific needs. And you'll get plenty of concrete examples to illustrate how a given service is used in a bioinformatics context.
You'll also get valuable advice on how to:
And more
Genomics in the Azure Cloud is a book that attempts to bridge the gap between general cloud computing architecture in Microsoft Azure and scientific computing for bioinformatics and genomics. This is meant to be a practical guide to architecting performant solutions in Azure for a variety of bioinformatics use cases.
Who Should Read This Book
This book is written for people with experience in bioinformatics and genomics but not necessarily in cloud computing. It can also be valuable for cloud engineers who want a better understanding of how cloud architecture should work for bioinformatics.
In many university programs, bioinformatics students are exposed to technical concepts for analyzing genetic data. They may also be taught how to use on-premises high-performance computing (HPC) environments (clusters) to run larger workloads. These skills will be valuable when transitioning to the cloud. So don’t worry, the skills you learned in grad school aren’t for naught.
Many online resources for cloud-enabled bioinformatics focus only on human genomics and, more specifically, on how to use the Broad Institute’s set of tools (such as GATK, Picard, and Cromwell) in the cloud. I’d like to generalize this book a bit more since, in my experience, people use bits and pieces of software from all over the place, combining them into a patchwork of tools that fits their specific needs. Also, as an infectious disease guy, I have used lots of different tools that are never mentioned by the human-focused online resources. All that to say, this book will not focus on any specific species or suite of tools. I want this book to be useful to scientists whether they work on humans or viruses or elephants or plants or whatever…
Genomics in the Azure Cloud will teach readers how to use enterprise platform services to scale their bioinformatics workloads with ease. The book starts by covering how to organize and query the vast amounts of genomics data an organization may have, by building a genomics data lake and an accompanying data warehouse.
The code examples in this book should serve as a reference point, or as a practical recommendation of an approach that has worked for me in the past. Think of each code snippet as part of a recipe that you can edit to fit your specific needs.
After reading this book, you should have a solid understanding of the services offered in Azure and how they might be used in your bioinformatics practice. You’ll be familiar with some standard architecture patterns, including how to store and organize data and how to analyze it at scale. You’ll also understand how to cloudify your organization’s existing bioinformatics pipelines by converting the workflows to run on compute services in Azure.
Some technical things that may be helpful to know:
Dr. Colby T. Ford is a professional AI cloud architect, data scientist, and computational biologist who uses machine learning and distributed computing to solve problems in the fields of infectious diseases and human genomics. For the last 8+ years, he has been consulting for companies across industries, leading the conversation for digital transformation using artificial intelligence and cloud computing. He currently serves as the Principal of Life Sciences at BlueGranite, a top-tier Microsoft partner, and focuses on building cloud-based bioinformatics solutions in the Azure cloud. In academia, his research includes the use of large-scale machine learning architecture in the study of infectious disease genomics and rare human diseases. In addition to his consulting and academic career, Dr. Ford is a co-founder of a digital health startup that focuses on the use of wearable devices to help study neurological disorders.
Given his interdisciplinary educational background and parallel experience in industry and academia, Dr. Ford brings a distinctive viewpoint to genomics research problems, blending cutting-edge technologies previously used only in industry with methods previously seen only in academia.









