
Wee Hyong Tok - Practical Weak Supervision: Doing More with Less Data


Most data scientists and engineers today rely on quality labeled data to train machine learning models. But building a training set manually is time-consuming and expensive, leaving many companies with unfinished ML projects. There's a more practical approach. In this book, Wee Hyong Tok, Amit Bahree, and Senja Filipi show you how to create products using weakly supervised learning models.

You'll learn how to build natural language processing and computer vision projects using weakly labeled datasets from Snorkel, a spin-off from the Stanford AI Lab. Because so many companies have pursued ML projects that never go beyond their labs, this book also provides a guide on how to ship the deep learning models you build.

  • Get up to speed on the field of weak supervision, including ways to use it as part of the data science process
  • Use Snorkel AI for weak supervision and data programming
  • Get code examples for using Snorkel to label text and image datasets
  • Use a weakly labeled dataset for text and image classification
  • Learn practical considerations for using Snorkel with large datasets and using Spark clusters to scale labeling


Practical Weak Supervision

by Wee Hyong Tok , Amit Bahree , and Senja Filipi

Copyright 2022 Wee Hyong Tok, Amit Bahree, and Senja Filipi. All rights reserved.

Printed in the United States of America.

Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

  • Acquisitions Editor: Rebecca Novack
  • Development Editor: Jeff Bleiel
  • Production Editor: Kristen Brown
  • Copyeditor: nSight, Inc.
  • Proofreader: Piper Editorial Consulting, LLC
  • Indexer: Ellen Troutman-Zaig
  • Interior Designer: David Futato
  • Cover Designer: Karen Montgomery
  • Illustrator: Kate Dullea
  • October 2021: First Edition
Revision History for the First Edition
  • 2021-09-30: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781492077060 for release details.

The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. Practical Weak Supervision, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.

The views expressed in this work are those of the authors, and do not represent the publisher's views. While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

978-1-492-07706-0

[LSI]

Foreword by Xuedong Huang

In specific industry scenarios, AI systems can be brittle: they often require heavy customization, with lots of additional data, to build machine learning models that solve for those scenarios. However, with diverse data points, and the ability to combine these disparate data types, there are new opportunities to pretrain machine learning models that are foundational to the downstream workloads. These models often require much less supervision for customization, allowing for greater speed and agility at lower cost.

Transfer learning is a winning approach when combined with weak supervision, and the foundational model can be strengthened with amazing gains. For example, with very large pretrained speech, NLP, and computer vision models, weak supervision from big data can often produce competent and sufficient quality, allowing one to further compensate for limited data in the downstream task.

Finally, when building AI systems, one key challenge is to understand and act on user engagement signals. These signals are dynamic and weak by their nature. Combining weak supervision and reinforcement learning enables AI systems to learn which actions can solve for which tasks. The result is a high-quality dataset and an optimized model.

Over the last 30 years, I have had the privilege of working with many world-class researchers and engineers in creating what the world now sees as Microsoft's Azure AI services. Amit Bahree, Senja Filipi, and Wee Hyong Tok are some of my amazing colleagues who have dealt with practical AI challenges in serving our customers. In this book, they show techniques for weak supervision that will benefit anyone involved in creating production AI systems.

I hope you enjoy this book as much as I have. Amit, Senja, and Wee Hyong show us a practical approach to help address many of the AI challenges that we face in the industry.

Xuedong Huang

Technical Fellow and Azure AI CTO, Microsoft

Bellevue, WA

September 2021

Foreword by Alex Ratner

The real-world impact of artificial intelligence (AI) has grown substantially in recent years, largely due to the advent of deep learning models. These models are more powerful and push-button than ever before, learning their own powerful, distributed representations directly from raw data with minimal to no manual feature engineering across diverse data and task types. They are also increasingly commoditized and accessible in the open source.

However, deep learning models are also more data hungry than ever, requiring massive, carefully labeled training datasets to function. In a world where the latest and greatest model architectures are downloadable in seconds and the powerful hardware needed to train them is a click away in the cloud, access to high-quality labeled training data has become a major differentiator across both industry and academia. More succinctly, we have left the age of model-centric AI and are entering the era of data-centric AI.

Unfortunately, labeling data at the scale and quality required to train (or supervise) useful AI models tends to be both expensive and time-consuming because it requires manual human input over huge numbers of examples. Person-years of data labeling per model is not uncommon, and when model requirements change (say, to classify medical images as normal, abnormal, or emergent rather than just normal or abnormal), data must often be relabeled from scratch. When organizations are deploying tens, hundreds, or even thousands of ML models that must be constantly iterated upon and retrained to keep up with ever-changing real-world data distributions, hand-labeling simply becomes untenable even for the world's largest organizations.

For the new data-centric AI reality to become practical and productionized, the next generation of AI systems must embody three key principles:

Data as the central interface

Data, and specifically training data, is often the key to success or failure in AI today; it can no longer be treated like a second-class citizen. Data must be at the center of iterative development in AI, and properly supported as the key interface to building and managing successful AI applications.

Data as a programmatic interface

For data to be the center point of AI development, we must move beyond the inefficient status quo of labeling and manipulating it by hand, one data point at a time. Users must be able to develop and manage the training data that defines AI models programmatically, like they would in developing any other type of practical software system.

Data as a collaborative hub

For AI to be data-centric, the subject-matter experts who actually understand the data and how to label it must be first-class citizens of the development process alongside data scientists and ML engineers.

Enter weak supervision. Instead of hand-labeling data, researchers have developed techniques that leverage more efficient, programmatic, and sometimes noisier forms of supervision (for example, rules, heuristics, knowledge bases, and more) to create weakly labeled datasets upon which high-quality AI models can be rapidly built. These weaker forms of supervision can often be defined programmatically and can often be directly developed by subject-matter experts. AI models that used to need person-years of labeled data can now be built using only person-days of effort and managed programmatically in more transparent and adaptable ways, without impacting performance or quality. Organizations large and small have taken note of this fundamental change in how AI models are built and managed; in fact, in the last hour you have almost certainly used a weakly supervised AI system in your day-to-day life. In the world of data-centric AI, weak supervision has become a foundational tool.
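To make the idea concrete, here is a minimal plain-Python sketch of programmatic labeling: several noisy heuristics ("labeling functions") vote on each example, and a simple majority vote combines their votes into a weak label. The heuristics and the toy reviews are illustrative inventions, and this is not the Snorkel API; systems like Snorkel replace the majority vote with a learned label model that weights each function by its estimated accuracy.

```python
# Sentinel values: a labeling function may abstain when it has no opinion.
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1

# Three noisy, hand-written heuristics (hypothetical examples).
def lf_contains_great(text):
    return POSITIVE if "great" in text.lower() else ABSTAIN

def lf_contains_awful(text):
    return NEGATIVE if "awful" in text.lower() else ABSTAIN

def lf_exclamation(text):
    # Very weak signal: exclamation marks often accompany positive reviews.
    return POSITIVE if "!" in text else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_great, lf_contains_awful, lf_exclamation]

def weak_label(text):
    """Combine the heuristics' votes into one weak label by majority vote."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN  # no heuristic fired; leave the example unlabeled
    return max(set(votes), key=votes.count)

reviews = ["A great read!", "Awful pacing.", "No strong opinion."]
labels = [weak_label(r) for r in reviews]
print(labels)  # -> [1, 0, -1]
```

Even this crude version shows the economics Ratner describes: writing three short functions takes minutes, yet they can label an arbitrarily large corpus, and a downstream classifier is then trained on the (noisy) labels rather than on hand annotations.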

