Information Science and Statistics
Christopher M. Bishop

This book is the first in the field of pattern recognition to place the Bayesian viewpoint at its center. It presents approximate inference algorithms that provide fast approximate answers in situations where exact answers are not feasible. It also uses graphical models to describe probability distributions, whereas other sources have generally not applied these models to machine learning.
The book does not assume that the reader has any background in pattern recognition or machine learning concepts. Only familiarity with multivariate calculus and basic linear algebra is required. Some familiarity with probability is helpful though not essential, because the book includes a self-contained and comprehensive introduction to basic probability theory.
Pattern recognition has its roots in engineering, whereas machine learning grew out of computer science. Nevertheless, the two can be viewed as facets of a single field that has undergone substantial development over the past decade. In particular, Bayesian methods have grown from a narrow specialist topic into the mainstream, and graphical models have emerged as a general framework for describing and applying probabilistic models.
The practical applicability of Bayesian methods has also increased greatly through the development of a range of approximate inference algorithms such as variational Bayes and expectation propagation. Likewise, new models based on kernels have had a significant impact on algorithms and practical applications.
While covering these recent developments, the book provides a comprehensive introduction to pattern recognition and machine learning. Its audience is advanced undergraduates, first-year PhD students, researchers, and practitioners. As noted, no background in either field is assumed; only familiarity with multivariate calculus and linear algebra is required, and some familiarity with probability will be helpful.
Because of the breadth of the book's topics, it is impossible to provide a complete list of references, and no attempt has been made to provide accurate historical attribution of ideas. Instead, the aim is to cite references that offer more detail than is possible in this book and that provide entry points into the extensive literature of the field. For this reason, the references are often recent textbooks and review articles rather than original sources.
This is the first textbook on pattern recognition to present the Bayesian viewpoint. The book presents approximate inference algorithms that permit fast approximate answers in situations where exact answers are not feasible. It uses graphical models to describe probability distributions, whereas no other book applies graphical models to machine learning. No previous knowledge of pattern recognition or machine learning concepts is assumed. Familiarity with multivariate calculus and basic linear algebra is required, and some experience in the use of probabilities would be helpful though not essential as the book includes a self-contained introduction to basic probability theory.
Pattern recognition has its origins in engineering, whereas machine learning grew out of computer science. However, these activities can be viewed as two facets of the same field, and together they have undergone substantial development over the past ten years. In particular, Bayesian methods have grown from a specialist niche to become mainstream, while graphical models have emerged as a general framework for describing and applying probabilistic models. Also, the practical applicability of Bayesian methods has been greatly enhanced through the development of a range of approximate inference algorithms such as variational Bayes and expectation propagation. Similarly, new models based on kernels have had significant impact on both algorithms and applications.
This new textbook reflects these recent developments while providing a comprehensive introduction to the fields of pattern recognition and machine learning. It is aimed at advanced undergraduates or first-year PhD students, as well as researchers and practitioners, and assumes no previous knowledge of pattern recognition or machine learning concepts. Knowledge of multivariate calculus and basic linear algebra is required, and some familiarity with probabilities would be helpful though not essential as the book includes a self-contained introduction to basic probability theory. Because this book has broad scope, it is impossible to provide a complete list of references, and in particular no attempt has been made to provide accurate historical attribution of ideas. Instead, the aim has been to give references that offer greater detail than is possible here and that hopefully provide entry points into what, in some cases, is a very extensive literature. For this reason, the references are often to more recent textbooks and review articles rather than to original sources.
Table of Contents
1. Introduction
2. Probability Distributions
3. Linear Models for Regression
4. Linear Models for Classification
5. Neural Networks
6. Kernel Methods
7. Sparse Kernel Machines
8. Graphical Models
9. Mixture Models and EM
10. Approximate Inference
11. Sampling Methods
12. Continuous Latent Variables
13. Sequential Data
14. Combining Models
Appendix A. Data Sets
Appendix B. Probability Distributions
Appendix C. Properties of Matrices
Appendix D. Calculus of Variations
Appendix E. Lagrange Multipliers
From the reviews:
"This beautifully produced book is intended for advanced undergraduates, PhD students, and researchers and practitioners, primarily in the machine learning or allied areas...A strong feature is the use of geometric illustration and intuition...This is an impressive and interesting book that might form the basis of several advanced statistics courses. It would be a good choice for a reading group." John Maindonald for the Journal of Statistical Software
"In this book, aimed at senior undergraduates or beginning graduate students, Bishop provides an authoritative presentation of many of the statistical techniques that have come to be considered part of ‘pattern recognition’ or ‘machine learning’. … This book will serve as an excellent reference. … With its coherent viewpoint, accurate and extensive coverage, and generally good explanations, Bishop’s book is a useful introduction … and a valuable reference for the principal techniques used in these fields." (Radford M. Neal, Technometrics, Vol. 49 (3), August, 2007)
"This book appears in the Information Science and Statistics Series commissioned by the publishers. … The book appears to have been designed for course teaching, but obviously contains material that readers interested in self-study can use. It is certainly structured for easy use. … For course teachers there is ample backing which includes some 400 exercises. … it does contain important material which can be easily followed without the reader being confined to a pre-determined course of study." (W. R. Howard, Kybernetes, Vol. 36 (2), 2007)
"Bishop (Microsoft Research, UK) has prepared a marvelous book that provides a comprehensive, 700-page introduction to the fields of pattern recognition and machine learning. Aimed at advanced undergraduates and first-year graduate students, as well as researchers and practitioners, the book assumes knowledge of multivariate calculus and linear algebra … . Summing Up: Highly recommended. Upper-division undergraduates through professionals." (C. Tappert, CHOICE, Vol. 44 (9), May, 2007)
"The book is structured into 14 main parts and 5 appendices. … The book is aimed at PhD students, researchers and practitioners. It is well-suited for courses on machine learning, statistics, computer science, signal processing, computer vision, data mining, and bio-informatics. Extensive support is provided for course instructors, including more than 400 exercises, lecture slides and a great deal of additional material available at the book’s web site … ." (Ingmar Randvee, Zentralblatt MATH, Vol. 1107 (9), 2007)
"This new textbook by C. M. Bishop is a brilliant extension of his former book ‘Neural Networks for Pattern Recognition’. It is written for graduate students or scientists doing interdisciplinary work in related fields. … In summary, this textbook is an excellent introduction to classical pattern recognition and machine learning (in the sense of parameter estimation). A large number of very instructive illustrations adds to this value." (H. G. Feichtinger, Monatshefte für Mathematik, Vol. 151 (3), 2007)
"Author aims this text at advanced undergraduates, beginning graduate students, and researchers new to machine learning and pattern recognition. … Pattern Recognition and Machine Learning provides excellent intuitive descriptions and appropriate-level technical details on modern pattern recognition and machine learning. It can be used to teach a course or for self-study, as well as for a reference. … I strongly recommend it for the intended audience and note that Neal (2007) also has given this text a strong review to complement its strong sales record." (Thomas Burr, Journal of the American Statistical Association, Vol. 103 (482), June, 2008)
"This accessible monograph seeks to provide a comprehensive introduction to the fields of pattern recognition and machine learning. It presents a unified treatment of well-known statistical pattern recognition techniques. … The book can be used by advanced undergraduates and graduate students … . The illustrative examples and exercises proposed at the end of each chapter are welcome … . The book, which provides several new views, developments and results, is appropriate for both researchers and students who work in machine learning … ." (L. State, ACM Computing Reviews, October, 2008)
"Chris Bishop’s … technical exposition that is at once lucid and mathematically rigorous. … In more than 700 pages of clear, copiously illustrated text, he develops a common statistical framework that encompasses … machine learning. … it is a textbook, with a wide range of exercises, instructions to tutors on where to go for full solutions, and the color illustrations that have become obligatory in undergraduate texts. … its clarity and comprehensiveness will make it a favorite desktop companion for practicing data analysts." (H. Van Dyke Parunak, ACM Computing Reviews, Vol. 49 (3), March, 2008)
The dramatic growth in practical applications for machine learning over the last ten years has been accompanied by many important developments in the underlying algorithms and techniques. For example, Bayesian methods have grown from a specialist niche to become mainstream, while graphical models have emerged as a general framework for describing and applying probabilistic techniques. The practical applicability of Bayesian methods has been greatly enhanced by the development of a range of approximate inference algorithms such as variational Bayes and expectation propagation, while new models based on kernels have had a significant impact on both algorithms and applications.
This completely new textbook reflects these recent developments while providing a comprehensive introduction to the fields of pattern recognition and machine learning. It is aimed at advanced undergraduates or first-year PhD students, as well as researchers and practitioners. No previous knowledge of pattern recognition or machine learning concepts is assumed. Familiarity with multivariate calculus and basic linear algebra is required, and some experience in the use of probabilities would be helpful though not essential as the book includes a self-contained introduction to basic probability theory.
The book is suitable for courses on machine learning, statistics, computer science, signal processing, computer vision, data mining, and bioinformatics. Extensive support is provided for course instructors, including more than 400 exercises, graded according to difficulty. Example solutions for a subset of the exercises are available from the book web site, while solutions for the remainder can be obtained by instructors from the publisher. The book is supported by a great deal of additional material, and the reader is encouraged to visit the book web site for the latest information.
Christopher M. Bishop is Deputy Director of Microsoft Research Cambridge, and holds a Chair in Computer Science at the University of Edinburgh. He is a Fellow of Darwin College Cambridge, a Fellow of the Royal Academy of Engineering, and a Fellow of the Royal Society of Edinburgh. His previous textbook "Neural Networks for Pattern Recognition" has been widely adopted.
Chris Bishop is a Technical Fellow at Microsoft and is the Director of Microsoft Research AI4Science. He is a Fellow of Darwin College, Cambridge, a Fellow of the Royal Academy of Engineering, a Fellow of the Royal Society of Edinburgh, and a Fellow of the Royal Society of London. He is a keen advocate of public engagement in science, and in 2008 he delivered the prestigious Royal Institution Christmas Lectures, established in 1825 by Michael Faraday, and broadcast on prime-time national television. Chris was a founding member of the UK AI Council and was also appointed to the Prime Minister’s Council for Science and Technology.