Praise for Natural Language Processing with Transformers
Pretrained transformer language models have taken the NLP world by storm, while libraries such as Transformers have made them much easier to use. Who better to teach you how to leverage the latest breakthroughs in NLP than the creators of said library? Natural Language Processing with Transformers is a tour de force, reflecting the deep subject matter expertise of its authors in both engineering and research. It is the rare book that offers both substantial breadth and depth of insight and deftly mixes research advances with real-world applications in an accessible way. The book gives informed coverage of the most important methods and applications in current NLP, from multilingual to efficient models and from question answering to text generation. Each chapter provides a nuanced overview grounded in rich code examples that highlights best practices as well as practical considerations and enables you to put research-focused models to impactful real-world use. Whether you're new to NLP or a veteran, this book will improve your understanding and fast-track your development and deployment of state-of-the-art models.
Sebastian Ruder, Google DeepMind
Transformers have changed how we do NLP, and Hugging Face has pioneered how we use transformers in product and research. Lewis Tunstall, Leandro von Werra, and Thomas Wolf from Hugging Face have written a timely volume providing a convenient and hands-on introduction to this critical topic. The book offers a solid conceptual grounding of transformer mechanics, a tour of the transformer menagerie, applications of transformers, and practical issues in training and bringing transformers to production. Having read chapters in this book, with the depth of its content and lucid presentation, I am confident that this will be the number one resource for anyone interested in learning transformers, particularly for natural language processing.
Delip Rao, Author of Natural Language Processing and Deep Learning with PyTorch
Complexity made simple. This is a rare and precious book about NLP, transformers, and the growing ecosystem around them, Hugging Face. Whether these are still buzzwords to you or you already have a solid grasp of it all, the authors will navigate you with humor, scientific rigor, and plenty of code examples into the deepest secrets of the coolest technology around. From off-the-shelf pretrained to from-scratch custom models, and from performance to missing labels issues, the authors address practically every real-life struggle of an ML engineer and provide state-of-the-art solutions, making this book destined to dictate the standards in the field for years to come.
Luca Perrozzi, PhD, Data Science and Machine Learning Associate Manager at Accenture
Natural Language Processing with Transformers
by Lewis Tunstall, Leandro von Werra, and Thomas Wolf
Copyright © 2022 Lewis Tunstall, Leandro von Werra, and Thomas Wolf. All rights reserved.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.
- Acquisitions Editor: Rebecca Novack
- Development Editor: Melissa Potter
- Production Editor: Katherine Tozer
- Copyeditor: Rachel Head
- Proofreader: Kim Cofer
- Indexer: Potomac Indexing, LLC
- Interior Designer: David Futato
- Cover Designer: Karen Montgomery
- Illustrator: Christa Lanz
- February 2022: First Edition
Revision History for the First Edition
- 2022-01-26: First Release
See http://oreilly.com/catalog/errata.csp?isbn=9781098103248 for release details.
The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. Natural Language Processing with Transformers, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.
The views expressed in this work are those of the authors and do not represent the publisher's views. While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
978-1-098-10324-8
[LSI]
Foreword
A miracle is taking place as you read these lines: the squiggles on this page are transforming into words and concepts and emotions as they navigate their way through your cortex. My thoughts from November 2021 have now successfully invaded your brain. If they manage to catch your attention and survive long enough in this harsh and highly competitive environment, they may have a chance to reproduce again as you share these thoughts with others. Thanks to language, thoughts have become airborne and highly contagious brain germs, and no vaccine is coming.
Luckily, most brain germs are harmless, and a few are wonderfully useful. In fact, humanity's brain germs constitute two of our most precious treasures: knowledge and culture. Much as we can't digest properly without healthy gut bacteria, we cannot think properly without healthy brain germs. Most of your thoughts are not actually yours: they arose and grew and evolved in many other brains before they infected you. So if we want to build intelligent machines, we will need to find a way to infect them too.
The good news is that another miracle has been unfolding over the last few years: several breakthroughs in deep learning have given birth to powerful language models. Since you are reading this book, you have probably seen some astonishing demos of these language models, such as GPT-3, which given a short prompt such as "a frog meets a crocodile" can write a whole story. Although it's not quite Shakespeare yet, it's sometimes hard to believe that these texts were written by an artificial neural network. In fact, GitHub's Copilot system is helping me write these lines: you'll never know how much I really wrote.
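To make this concrete, here is a minimal sketch of prompt-based story generation using the Transformers library's pipeline API. GPT-3 itself is only available through OpenAI's API, so this sketch uses the freely available GPT-2 model as a stand-in, and the generation parameters are illustrative assumptions rather than a recommended setup:

```python
from transformers import pipeline

# Load a text-generation pipeline; GPT-2 stands in for GPT-3 here
generator = pipeline("text-generation", model="gpt2")

# Continue the story from a short prompt (sampling makes each run different)
outputs = generator("A frog meets a crocodile", max_length=60, do_sample=True)
print(outputs[0]["generated_text"])
```

Running this will produce a short, often whimsical continuation of the prompt; larger models produce far more coherent stories, which is precisely the leap the foreword describes.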
The revolution goes far beyond text generation. It encompasses the whole realm of natural language processing (NLP), from text classification to summarization, translation, question answering, chatbots, natural language understanding (NLU), and more. Wherever there's language, speech or text, there's an application for NLP. You can already ask your phone for tomorrow's weather, or chat with a virtual help desk assistant to troubleshoot a problem, or get meaningful results from search engines that seem to truly understand your query. But the technology is so new that the best is probably yet to come.
Like most advances in science, this recent revolution in NLP rests upon the hard work of hundreds of unsung heroes. But three key ingredients of its success do stand out:
The transformer is a neural network architecture proposed in 2017 in a groundbreaking paper called "Attention Is All You Need," published by a team of Google researchers. In just a few years it swept across the field, crushing previous architectures that were typically based on recurrent neural networks (RNNs). The Transformer architecture is excellent at capturing patterns in long sequences of data and dealing with huge datasets, so much so that its use is now extending well beyond NLP, for example to image processing tasks.