Alice Zheng - Feature Engineering for Machine Learning

Feature engineering is a crucial step in the machine-learning pipeline, yet this topic is rarely examined on its own. With this practical book, you'll learn techniques for extracting and transforming features (the numeric representations of raw data) into formats for machine-learning models. Each chapter guides you through a single data problem, such as how to represent text or image data. Together, these examples illustrate the main principles of feature engineering. Rather than simply teach these principles, authors Alice Zheng and Amanda Casari focus on practical application with exercises throughout the book.

Feature Engineering for Machine Learning

by Alice Zheng and Amanda Casari

Copyright 2018 Alice Zheng, Amanda Casari. All rights reserved.

Printed in the United States of America.

Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com/safari). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Editors: Rachel Roumeliotis and Jeff Bleiel

Indexer: Ellen Troutman

Production Editor: Kristen Brown

Interior Designer: David Futato

Copyeditor: Rachel Head

Cover Designer: Karen Montgomery

Proofreader: Sonia Saruba

Illustrator: Rebecca Demarest

  • April 2018: First Edition
Revision History for the First Edition
  • 2018-03-23: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781491953242 for release details.

The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. Feature Engineering for Machine Learning, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.

While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

978-1-491-95324-2

[LSI]

Preface
Introduction

Machine learning fits mathematical models to data in order to derive insights or make predictions. These models take features as input. A feature is a numeric representation of an aspect of raw data. Features sit between data and models in the machine learning pipeline. Feature engineering is the act of extracting features from raw data and transforming them into formats that are suitable for the machine learning model. It is a crucial step in the machine learning pipeline, because the right features can ease the difficulty of modeling, and therefore enable the pipeline to output results of higher quality. Practitioners agree that the vast majority of time in building a machine learning pipeline is spent on feature engineering and data cleaning. Yet, despite its importance, the topic is rarely discussed on its own. Perhaps this is because the right features can only be defined in the context of both the model and the data; since data and models are so diverse, it's difficult to generalize the practice of feature engineering across projects.
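
To make the definition of a feature concrete, here is a minimal sketch of extracting numeric features from one raw record. The record fields (`review_text`, `n_votes`, `weekday`) and the function `extract_features` are hypothetical names chosen for illustration; they do not come from the book, and the chosen features are only one of many reasonable options.

```python
import math
from collections import Counter


def extract_features(record):
    """Turn one raw record (a dict) into a flat dict of numeric features."""
    words = record["review_text"].lower().split()
    counts = Counter(words)
    return {
        # Simple counts derived from the raw text
        "n_words": float(len(words)),
        "n_unique_words": float(len(counts)),
        # Log transform compresses a heavy-tailed count
        "log_n_votes": math.log10(record["n_votes"] + 1),
        # Binary indicator derived from a categorical field
        "is_weekend": 1.0 if record["weekday"] in ("Sat", "Sun") else 0.0,
    }


# Hypothetical raw record, invented for this example
raw = {"review_text": "Great food great service", "n_votes": 99, "weekday": "Sat"}
features = extract_features(raw)
print(features)
```

The raw record mixes text, a count, and a category; the output is a set of numbers that a model could consume. Which transformations are appropriate depends on both the data and the model.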

Nevertheless, feature engineering is not just an ad hoc practice. There are deeper principles at work, and they are best illustrated in situ. Each chapter of this book addresses one data problem: how to represent text data or image data, how to reduce the dimensionality of autogenerated features, when and how to normalize, etc. Think of this as a collection of interconnected short stories, as opposed to a single long novel. Each chapter provides a vignette into the vast array of existing feature engineering techniques. Together, they illustrate the overarching principles.

Mastering a subject is not just about knowing the definitions and being able to derive the formulas. It is not enough to know how the mechanism works and what it can do; one must also understand why it is designed that way, how it relates to other techniques, and what the pros and cons of each approach are. Mastery is about knowing precisely how something is done, having an intuition for the underlying principles, and integrating it into one's existing web of knowledge. One does not become a master of something by simply reading a book, though a good book can open new doors. It has to involve practice: putting the ideas to use, which is an iterative process. With every iteration, we know the ideas better and become increasingly more adept and creative at applying them. The goal of this book is to facilitate the application of its ideas.

This book tries to teach the reason first, and the mathematics second. Instead of only discussing how something is done, we try to teach why. Our goal is to provide the intuition behind the ideas, so that the reader may understand how and when to apply them. There are tons of descriptions and pictures for folks who learn in different ways. Mathematical formulas are presented in order to make the intuition precise, and also to bridge this book with other existing offerings.

Code examples in this book are given in Python, using a variety of free and open source packages. The NumPy library provides numeric vector and matrix operations. Pandas provides the DataFrame that is the building block of data science in Python. Scikit-learn is a general-purpose machine learning package with extensive coverage of models and feature transformers. Matplotlib and the styling library Seaborn provide plotting and visualization support. You can find these examples as Jupyter notebooks in our GitHub repo.
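
As a rough sketch of how these packages fit together (not an example taken from the book), the snippet below builds a pandas DataFrame, applies a NumPy transform, scales the features with scikit-learn, and fits a simple model. The column names and all data values are invented for illustration, and plotting with Matplotlib or Seaborn is left out to keep the sketch short.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; the values are made up for this example
df = pd.DataFrame({
    "n_words": [120, 300, 80, 500, 260],
    "n_votes": [3, 10, 1, 25, 8],
    "rating": [3.0, 4.0, 2.5, 4.5, 4.0],
})

# NumPy: elementwise log transform of a count column
df["log_n_votes"] = np.log10(df["n_votes"] + 1)

# pandas -> NumPy arrays for the model
X = df[["n_words", "log_n_votes"]].to_numpy(dtype=float)
y = df["rating"].to_numpy()

# scikit-learn: standardize each feature to zero mean and unit variance
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# scikit-learn: fit a linear model on the engineered features
model = LinearRegression().fit(X_scaled, y)
predictions = model.predict(X_scaled)
print(predictions.round(2))
```

Here the model is fit and evaluated on the same five rows purely to show the mechanics; a real pipeline would fit the scaler and the model on training data only.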

The first few chapters start out slow in order to provide a bridge for folks who are just getting started with data science and machine learning. The book also shows a few different techniques in an end-to-end example, creating a recommender for a dataset of academic papers.

In Living Color

The illustrations in this book are best viewed in color. Really, you should print out the color versions of the Swiss roll and paste them into your book. Your aesthetic sense will thank us.

Feature engineering is a vast topic, and more methods are being invented every day, particularly in the area of automatic feature learning. In order to limit the book to a manageable size, we've had to make some cuts. This book does not discuss Fourier analysis for audio data, though it is a beautiful subject that is closely related to eigen analysis in linear algebra (which we touch upon in Chapters ). We also skip a discussion of random features, which are intimately related to Fourier analysis. We provide an introduction to feature learning via deep learning for image data, but do not go into depth on the numerous deep learning models under active development. Also out of scope are advanced research ideas like random projections, complex text featurization models such as word2vec and Brown clustering, and latent space models like latent Dirichlet allocation and matrix factorization. If those words mean nothing to you, then you are in luck. If the frontiers of feature learning are where your interest lies, then this is probably not the book for you.

The book assumes knowledge of basic machine learning concepts, such as what a model is and what a vector is, though a refresher is provided so we're all on the same page. Experience with linear algebra, probability distributions, and optimization is helpful, but not necessary.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic

Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width

Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.
