Build structured NLP solutions with custom components and models powered by spacy-llm
Déborah Mesquita, Duygu Altinok

#spaCy
#NLP
#Python
#LLM
#spaCy-LLM
#FastAPI
#DVC
#NER
Discover advanced techniques for working with spaCy, including designing custom pipelines, integrating with large language models (LLMs), and training models, to build natural language processing (NLP) solutions effectively and efficiently
Mastering spaCy, Second Edition is your comprehensive guide to building sophisticated natural language processing applications with the spaCy ecosystem. This revised edition covers the latest advances in NLP and includes new chapters on large language models (LLMs) with spaCy-LLM, integration with transformer models, and end-to-end pipeline management with Weasel.
In this edition, you learn how to use spaCy-LLM to enhance NLP tasks with large language models, manage pipelines with Weasel, and combine spaCy with libraries such as Streamlit, FastAPI, and DVC. From training custom NER pipelines to sentiment analysis of Reddit posts, the reader also becomes familiar with advanced topics such as text classification and coreference resolution.
The book starts with foundational NLP concepts such as tokenization, named entity recognition (NER), and dependency parsing, and gradually moves on to advanced topics such as building custom components, training specialized models for specific domains, and implementing scalable NLP pipelines.
By the end of the book, drawing on practical examples, clear explanations, and hands-on tips, you will be able to build robust NLP pipelines and integrate them with web applications, so that you can build complete, production-ready solutions.
Discover how to master advanced spaCy techniques, including custom pipelines, LLM integration, and model training, to build NLP solutions efficiently and effectively
Mastering spaCy, Second Edition is your comprehensive guide to building sophisticated NLP applications using the spaCy ecosystem. This revised edition embraces the latest advancements in NLP, featuring new chapters on Large Language Models with spaCy-LLM, transformers integration, and end-to-end workflow management with Weasel.
With this new edition, you’ll learn to enhance NLP tasks using LLMs with spaCy-LLM, manage end-to-end workflows using Weasel, and integrate spaCy with third-party libraries like Streamlit, FastAPI, and DVC. From training custom named entity recognition (NER) pipelines to categorizing emotions in Reddit posts, you’ll explore advanced topics like text classification and coreference resolution. This book takes you on a journey through spaCy’s capabilities, starting with the fundamentals of NLP, such as tokenization, named entity recognition, and dependency parsing. As you progress, you’ll delve into advanced topics like creating custom components, training domain-specific models, and building scalable NLP workflows.
By the end of the book, through practical examples, clear explanations, and tips and tricks, you will be able to build robust NLP pipelines and integrate them with web applications to create end-to-end solutions.
This book is tailored for NLP engineers, machine learning developers, and LLM engineers looking to build production-grade language processing solutions. While primarily targeting professionals working with language models and NLP pipelines, it's also valuable for software engineers transitioning into NLP development. Basic Python programming knowledge and familiarity with NLP concepts are recommended to leverage spaCy's latest capabilities.
Déborah is a data science consultant and writer. With a BSc in Computer Science from UFPE, one of Brazil's top computer science programs, she brings a diversified skill set refined through hands-on experience with various technologies. Déborah has thrived in different data science projects, including roles such as lead data scientist and technical contributor for respected publications. Her ability to translate complex concepts into simple language, coupled with her quick learning and broad vision, makes her an effective educator. Actively engaged in community initiatives, she works to ensure equitable access to knowledge, reflecting her belief that technology is not a panacea, but a powerful tool for societal improvement when used for that purpose.
Duygu Altinok is a senior NLP engineer with 12 years of experience in almost all areas of NLP including search engine technology, speech recognition, text analytics, and conversational AI. She authored several publications in the NLP area at conferences such as LREC and CLNLP. She also enjoys working on open-source projects and is a contributor to the spaCy library. Duygu earned her undergraduate degree in Computer Engineering from METU, Ankara in 2010 and later earned her Master's degree in Mathematics from Bilkent University, Ankara in 2012. She is currently a senior engineer at German Autolabs with a focus on conversational AI for voice assistants. Originally from Istanbul, Duygu currently resides in Berlin, DE with her cute dog Adele.









