inside front cover
Quick reference guide for this book
Understand a model's uncertainty | Recruit annotators |
Softmax Base/Temperature (3.1.2, A.1-A.2) | In-house experts (7.2) |
Least Confidence (3.2.1) | Outsourced workers (7.3) |
Margin of Confidence (3.2.2) | Crowdsourced workers (7.4) |
Ratio of Confidence (3.2.3) | End users (7.5.1) |
Entropy (3.2.4) | Volunteers (7.5.2) |
Ensembles of Models (3.4.1) | People playing games (7.5.3) |
Query by Committee & Dropouts (3.4.2) | Manage annotation quality |
Aleatoric & Epistemic Uncertainty (3.4.3) | Ground truth data (8.1.1) |
Active Transfer Learning for Uncertainty (5.2) | Expected accuracy/agreement & adjusting for random chance (8.1.2, A.3.3) |
Identify gaps in a model's knowledge | Dataset reliability with Krippendorff's alpha (8.2.3) |
Model-based Outliers (4.2, 4.6.1) | Individual annotator agreement (8.2.5) |
Cluster-based Sampling (4.3, 4.6.2) | Per-label & per-demographic agreement (8.2.6) |
Representative Sampling (4.4, 4.6.3) | Extending accuracy with agreement for real-world diversity (8.2.7) |
Real-world Diversity (4.5, 4.6.4) | Aggregating annotations (8.3.1-3, 9.2.1-2) |
Active Transfer Learning for Representative Sampling (5.3) | Eliciting annotator-reported confidences (8.3.4) |
Create a complete active learning strategy | Calculating annotation uncertainty (8.3.5) |
Combining Uncertainty Sampling and Diversity Sampling (5.1.1-6) | Quality control by expert review (8.4) |
Expected Error Reduction (5.1.8) | Multistep workflows and adjudication/review tasks (8.5) |
Active Transfer Learning for Adaptive Sampling (5.4) | Creating models to predict whether a single annotation is correct (9.2.3), in agreement (9.2.4), or from a bot (9.2.5) |
Active Learning for already-labeled data (6.6.1) |
Data-filtering with rules (9.5.1) |
Training data search (9.5.2) |
Implement active learning with different machine learning architectures | Trust model predictions as labels (9.3.1) |
Logistic Regression & MaxEnt (3.3.1) | Use a model prediction as an annotation (9.3.2) |
Support Vector Machines (3.3.2) | Cross-validating to find mislabeled data (9.3.3) |
Bayesian Models (3.3.3) |
Decision Trees & Random Forests (3.3.4) |
Diversity Sampling (4.6.1-4) |
Human-in-the-Loop Machine Learning
Active learning and annotation for human-centered AI
Robert (Munro) Monarch
Foreword by Christopher D. Manning
Manning
Shelter Island
For more information on this and other Manning titles go to
www.manning.com
Copyright
For online information and ordering of these and other Manning books, please visit www.manning.com. The publisher offers discounts on these books when ordered in quantity.
For more information, please contact
Special Sales Department
Manning Publications Co.
20 Baldwin Road
PO Box 761
Shelter Island, NY 11964
Email: orders@manning.com
© 2021 by Manning Publications Co. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.
Recognizing the importance of preserving what has been written, it is Manning's policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.
Development editor: | Susan Ethridge |
Technical development editor: | Frances Buontempo |
Review editor: | Ivan Martinović |
Production editor: | Deirdre S. Hiam |
Copy editor: | Keir Simpson |
Proofreader: | Keri Hales |
Technical proofreader: | Al Krinker |
Typesetter: | Gordan Salinović |
Cover designer: | Marija Tudor |
ISBN: 9781617296741
front matter
foreword
With machine learning now deployed widely in many industry sectors, artificial intelligence systems are in daily contact with human systems and human beings. Most people have noticed some of the user-facing consequences. Machine learning can either improve people's lives, such as with the speech recognition and natural language understanding of a helpful voice assistant, or it can annoy or even actively harm humans, with examples ranging from annoyingly lingering product recommendations to résumé review systems that are systematically biased against women or under-represented ethnic groups. Rather than thinking about artificial intelligence operating in isolation, the pressing need this century is for the exploration of human-centered artificial intelligence: that is, building AI technology that effectively cooperates and collaborates with people, and augments their abilities.
This book focuses not on end users but on how people and machine learning come together in the production and running of machine learning systems. It is an open secret of machine learning practitioners in industry that obtaining the right data with the right annotations is many times more valuable than adopting a more advanced machine learning algorithm. The production, selection, and annotation of data is a very human endeavor. Hand-labeling data can be expensive and unreliable, and this book spends much time on this problem. One direction is to reduce the amount of data that needs to be labeled while still allowing the training of high-quality systems through active learning approaches. Another direction is to exploit machine learning and human-computer interaction techniques to improve the speed and accuracy of human annotation. Things do not stop there: most large, deployed systems also involve various kinds of human review and updating. Again, the machine learning can either be designed to leverage the work of people, or it can be something that humans need to fight against.