Transfer Learning for Natural Language Processing
PAUL AZUNRE
Manning
Shelter Island
For more information on this and other Manning titles go to
www.manning.com
Copyright
For online information and ordering of these and other Manning books, please visit www.manning.com. The publisher offers discounts on these books when ordered in quantity.
For more information, please contact
Special Sales Department
Manning Publications Co.
20 Baldwin Road
PO Box 761
Shelter Island, NY 11964
Email: orders@manning.com
© 2021 by Manning Publications Co. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.
Recognizing the importance of preserving what has been written, it is Manning's policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.
| Manning Publications Co. 20 Baldwin Road PO Box 761 Shelter Island, NY 11964 |
Development editor: | Susan Ethridge |
Technical development editor: | Al Krinker |
Review editor: | Aleksandar Dragosavljević |
Production editor: | Keri Hales |
Copy editor: | Pamela Hunt |
Proofreader: | Melody Dolab |
Technical proofreader: | Ariel Gamiño |
Typesetter: | Dennis Dalinnik |
Cover designer: | Marija Tudor |
ISBN: 9781617297267
dedication
This book is dedicated to my wife, Diana, son, Khaya, and puppy, Lana, who shared the journey of writing it with me.
front matter
preface
Over the past couple of years, it has become increasingly difficult to ignore the breakneck speed at which the field of natural language processing (NLP) has been progressing. Over this period, you have likely been bombarded with news articles about trending NLP models such as ELMo, BERT, and more recently GPT-3. The excitement around this technology is warranted, because these models have enabled NLP applications we couldn't have imagined being practical just three years prior, such as writing production code from a mere description of it, or the automatic generation of believable poetry and blog posts.
A large driver behind this advance has been the focus on increasingly sophisticated transfer learning techniques for NLP models. Transfer learning is an increasingly popular and exciting paradigm in NLP because it enables you to adapt or transfer the knowledge acquired from one scenario to a different scenario, such as a different language or task. It is a big step forward for the democratization of NLP and, more widely, artificial intelligence (AI), allowing knowledge to be reused in new settings at a fraction of the previously required resources.
As a citizen of the West African nation of Ghana, where many budding entrepreneurs and inventors do not have access to vast computing resources and where so many fundamental NLP problems remain to be solved, this topic is particularly personal to me. This paradigm empowers engineers in such settings to build potentially life-saving NLP technologies, which would simply not be possible otherwise.
I first encountered these ideas in 2017, while working on open source automatic machine learning technologies within the US Defense Advanced Research Projects Agency (DARPA) ecosystem. We used transfer learning to reduce the requirement for labeled data by training NLP systems on simulated data first and then transferring the model to a small set of real labeled data. The breakthrough model ELMo emerged shortly after and inspired me to learn more about the topic and explore how I could leverage these ideas further in my software projects.
Naturally, I discovered that a comprehensive practical introduction to the topic did not exist, due to the sheer novelty of these ideas and the speed at which the field is moving. When an opportunity to write a practical introduction to the topic presented itself in 2019, I didn't think twice. You are holding in your hands the product of approximately two years of effort toward this purpose. This book will quickly bring you up to speed on key recent NLP models in the space and provide executable code you will be able to modify and reuse directly in your own projects. Although it would be impossible to cover every single architecture and use case, we strategically cover architectures and examples that we believe will arm you with fundamental skills for further exploration and staying up-to-date in this burgeoning field on your own.
You made a good decision when you decided to learn more about this topic. Opportunities for novel theories, algorithmic methodologies, and breakthrough applications abound. I look forward to hearing about the transformational positive impact you make on the society around you with it.
acknowledgments
I am grateful to members of the NLP Ghana open source community, where I have had the privilege to learn more about this important topic. The feedback from members of the group and users of our tools has underscored for me how transformational this technology truly is. This has inspired and motivated me to push this book across the finish line.
I would like to thank my Manning development editor, Susan Ethridge, for the uncountable hours spent reading the manuscript, providing feedback, and guiding me through the many challenges. I am thankful for all the time and effort my technical development editor, Al Krinker, put in to help me improve the technical dimension of my writing.
I am grateful to all members of the editorial board, the marketing professionals, and other members of the production team who worked hard to make this book a reality. In no particular order, these include Rebecca Rinehart, Bert Bates, Nicole Butterfield, Rejhana Markanovic, Aleksandar Dragosavljević, Melissa Ice, Branko Latincic, Christopher Kaufmann, Candace Gillhoolley, Becky Whitney, Pamela Hunt, and Radmila Ercegovac.
The technical peer reviewers provided invaluable feedback at several junctures during this project, and the book would not be nearly as good without them. I am very grateful for their input. These include Andres Sacco, Angelo Simone Scotto, Ariel Gamiño, Austin Poor, Clifford Thurber, Diego Casella, Jaume López, Manuel R. Ciosici, Marc-Anthony Taylor, Mathijs Affourtit, Matthew Sarmiento, Michael Wall, Nikos Kanakaris, Ninoslav Cerkez, Or Golan, Rani Sharim, Sayak Paul, Sebastián Palma, Sergio Govoni, Todd Cook, and Vamsi Sistla. I am thankful to the technical proofreader, Ariel Gamiño, for catching many typos and other errors during the proofreading process. I am grateful for all the excellent comments from book forum participants that further helped improve the book.