A Holistic Approach to LLMs
Suhas Pai

Large language models (LLMs) have proven themselves to be powerful tools for solving a wide range of tasks, and enterprises have taken note. But transitioning from demos and prototypes to full-fledged applications can be difficult. This book helps close that gap, providing the tools, techniques, and playbooks that practitioners need to build useful products that incorporate the power of language models.
Experienced ML researcher Suhas Pai offers practical advice on harnessing LLMs for your use cases and dealing with commonly observed failure modes. You’ll take a comprehensive deep dive into the ingredients that make up a language model, explore various techniques for customizing these models, such as fine-tuning, learn about application paradigms like RAG (retrieval-augmented generation) and agents, and more.
Who This Book Is For
This book is intended for a broad audience, including software engineers transitioning to AI application development, machine learning practitioners and scientists, and product managers. Much of the content in this book is born of my own experiments with LLMs, so even if you are an experienced scientist, I expect you will find value in it. Similarly, even if you have very limited exposure to the world of AI, I expect you will still find the book useful for understanding the fundamentals of this technology.
The only prerequisites for this book are knowledge of Python coding and an understanding of basic machine learning and deep learning principles. Where required, I provide links to external resources that you can use to sharpen or develop your prerequisites.
How This Book Is Structured
The book is divided into three parts, with a total of 13 chapters. The first part deals with understanding the ingredients of a language model. I strongly feel that even though you may never train a language model from scratch yourself, knowing what goes into making one is crucial. The second part discusses various ways to harness language models, be it by directly prompting the model or by fine-tuning it in various ways. It also addresses limitations such as hallucinations and reasoning constraints, along with methods to mitigate these issues. Finally, the third part of the book deals with application paradigms like retrieval-augmented generation (RAG) and agents, positioning LLMs within the broader context of an entire software system.
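To give a flavor of the RAG paradigm covered in Part III, here is a minimal, illustrative sketch in Python. It is not the book's implementation: the word-overlap retriever and the generate() stub below are hypothetical stand-ins for an embedding-based retriever and a real LLM API call.

# Minimal sketch of the RAG pattern: retrieve relevant context,
# augment the prompt with it, then generate an answer.

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    # Naive word-overlap ranking; a stand-in for an embedding-based
    # retriever backed by a vector store (an assumption, not the book's method).
    query_terms = set(query.lower().split())
    return sorted(
        documents,
        key=lambda doc: len(query_terms & set(doc.lower().split())),
        reverse=True,
    )[:k]

def generate(prompt: str) -> str:
    # Stub standing in for an LLM call; a real application would send
    # the prompt to a hosted or local model here.
    return f"[LLM response to a prompt of {len(prompt)} characters]"

def rag_answer(query: str, documents: list[str]) -> str:
    # Ground the prompt in retrieved context before generating,
    # which helps mitigate hallucinations.
    context = "\n".join(retrieve(query, documents))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return generate(prompt)

docs = [
    "LLMs can hallucinate facts not present in their training data.",
    "Retrieval-augmented generation grounds answers in retrieved documents.",
]
print(rag_answer("How does RAG reduce hallucinations?", docs))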
What This Book Is Not About
To keep the book at a reasonable length, certain topics were deemed out of scope. I have taken care to avoid topics that I am not confident will stand the test of time. This field is very fast-moving, so writing a book that maintains its relevance over time is extremely challenging.
This book focuses only on English-language LLMs and, for the most part, leaves out discussion of multilingual models. I also disagree with the notion of mushing all the non-English languages of the world under the “multilingual” banner. Every language has its own nuances and deserves its own book.
This book also doesn’t cover multimodal models. New models are increasingly multimodal, i.e., a single model supports multiple modalities like text, image, video, speech, etc. However, text remains the most important modality and is the binding substrate in these models. Thus, reading this book will still help you prepare for the multimodal future.
This book does not focus on theory or go too deep into the math. There are plenty of other books that cover that ground, and I have generously linked to them where needed. This book contains very few equations and instead focuses on building intuitions.
This book contains only a rudimentary introduction to reasoning models, the latest LLM paradigm. At the time of the book’s writing, reasoning models are still in their infancy, and the jury is still out on which techniques will prove to be most effective.
Table of Contents
Part I. LLM Ingredients
Chapter 1. Introduction
Chapter 2. Pre-Training Data
Chapter 3. Vocabulary and Tokenization
Chapter 4. Architectures and Learning Objectives
Part II. Utilizing LLMs
Chapter 5. Adapting LLMs to Your Use Case
Chapter 6. Fine-Tuning
Chapter 7. Advanced Fine-Tuning Techniques
Chapter 8. Alignment Training and Reasoning
Chapter 9. Inference Optimization
Part III. LLM Application Paradigms
Chapter 10. Interfacing LLMs with External Tools
Chapter 11. Representation Learning and Embeddings
Chapter 12. Retrieval-Augmented Generation
Chapter 13. Design Patterns and System Architecture
Suhas Pai is an experienced machine learning researcher who has worked in the tech industry for over a decade. Since 2020, he has been the co-founder, CTO, and ML Research Lead at Hudson Labs, a Y Combinator-backed AI and fintech startup. At Hudson Labs, Suhas invented several novel techniques in the areas of domain-adapted LLMs, text ranking, and representation learning that fully power the core features of Hudson Labs’ products. He has contributed to the development of several open-source LLMs, including serving as co-lead of the Privacy working group at BigScience as part of the BLOOM LLM project.
Suhas is active in the ML community and has served as Chair of the TMLS (Toronto Machine Learning Summit) conference since 2021. He is also a frequent speaker at AI conferences worldwide and hosts regular seminars discussing the latest research in the field of NLP.