Philip Kiely

#Inference
#Engineering
#AI
#Hardware
#Software
Inference is the most valuable category in AI, but inference engineering is still in its infancy. Inference engineers work across the stack, from CUDA to Kubernetes, in pursuit of faster, less expensive, and more reliable serving of generative AI models in production.
Table of Contents
Chapter 0: Inference
Chapter 1: Prerequisites
Chapter 2: Models
Chapter 3: Hardware
Chapter 4: Software
Chapter 5: Techniques
Chapter 6: Modalities
Chapter 7: Production
Appendix A: Inference Glossary
Appendix B: Recommended Reading
About the Author
Philip Kiely leads Developer Relations at Baseten. Prior to joining Baseten in 2022, he worked across software engineering and technical writing for a variety of startups.