Kevin Feasel - Finding Ghosts in Your Data: Anomaly Detection Techniques with Examples in Python

  • Book:
    Finding Ghosts in Your Data: Anomaly Detection Techniques with Examples in Python
  • Author:
    Kevin Feasel
  • Publisher:
    Apress
  • Genre:
    Computer
  • Year:
    2022
  • City:
    New York

Finding Ghosts in Your Data: Anomaly Detection Techniques with Examples in Python: summary, description and annotation

Discover key information buried in the noise of data by learning a variety of anomaly detection techniques and using the Python programming language to build a robust service for anomaly detection against a variety of data types. The book starts with an overview of what anomalies and outliers are and uses the Gestalt school of psychology to explain just why it is that humans are naturally great at detecting anomalies. From there, you will move into technical definitions of anomalies, moving beyond "I know it when I see it" to defining things in a way that computers can understand.
The core of the book involves building a robust, deployable anomaly detection service in Python. You will start with a simple anomaly detection service, which will expand over the course of the book to include a variety of valuable anomaly detection techniques, covering descriptive statistics, clustering, and time series scenarios. Finally, you will compare your anomaly detection service head-to-head with a publicly available cloud offering and see how they perform.
The anomaly detection techniques and examples in this book combine psychology, statistics, mathematics, and Python programming in a way that is easily accessible to software developers. They give you an understanding of what anomalies are and why you are naturally a gifted anomaly detector. Then, they help you to translate your human techniques into algorithms that can be used to program computers to automate the process. You'll develop your own anomaly detection service, extend it with clustering techniques for multivariate analysis and time series techniques for observing data over time, and compare your service head-on against a commercial service.
What You Will Learn
  • Understand the intuition behind anomalies
  • Convert your intuition into technical descriptions of anomalous data
  • Detect anomalies using statistical tools, such as distributions, variance and standard deviation, robust statistics, and interquartile range
  • Apply state-of-the-art anomaly detection techniques in the realms of clustering and time series analysis
  • Work with common Python packages for outlier detection and time series analysis, such as scikit-learn, PyOD, and tslearn
  • Develop a project from the ground up which finds anomalies in data, starting with simple arrays of numeric data and expanding to include multivariate inputs and even time series data
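
As a rough illustration of the statistical tools named in the list above (standard deviation, robust statistics, and interquartile range), here is a minimal sketch using only the Python standard library. It is not taken from the book: the function names, the sample data, and the thresholds (3 standard deviations, 3 scaled MADs, 1.5 times the IQR) are assumptions chosen for the example.

```python
import statistics


def stdev_outliers(values, threshold=3.0):
    """Values more than `threshold` sample standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return []
    return [v for v in values if abs(v - mean) / sd > threshold]


def mad_outliers(values, threshold=3.0):
    """Values more than `threshold` scaled median absolute deviations from the median."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []
    # 1.4826 makes the MAD comparable to a standard deviation for normal data.
    return [v for v in values if abs(v - med) / (1.4826 * mad) > threshold]


def iqr_outliers(values, k=1.5):
    """Values more than k * IQR below the first or above the third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]


# Hypothetical sample: nine ordinary readings and one extreme value.
readings = [10, 11, 12, 11, 10, 12, 11, 10, 12, 95]

by_stdev = stdev_outliers(readings)
by_mad = mad_outliers(readings)
by_iqr = iqr_outliers(readings)
```

On this sample the median-based and quartile-based checks both flag 95, while the standard deviation check flags nothing: the extreme value inflates the mean and standard deviation enough to hide itself, which is one reason robust statistics are useful for this kind of work.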

Who This Book Is For
For software developers with at least some familiarity with the Python programming language, and who would like to understand the science and some of the statistics behind anomaly detection techniques. Readers are not required to have any formal knowledge of statistics as the book introduces relevant concepts along the way.

Finding Ghosts in Your Data

Anomaly Detection Techniques with Examples in Python

Kevin Feasel

DURHAM, NC, USA

ISBN-13 (pbk): 978-1-4842-8869-6

ISBN-13 (electronic): 978-1-4842-8870-2

https://doi.org/10.1007/978-1-4842-8870-2

Copyright © 2022 by Kevin Feasel

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, logo, or image we use the names, logos, and images only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark.

The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Managing Director, Apress Media LLC: Welmoed Spahr

Acquisitions Editor: Jonathan Gennick

Development Editor: Laura Berendson

Coordinating Editor: Jill Balzano

Cover photo by Pawel Czerwinski on Unsplash

Distributed to the book trade worldwide by Springer Science+Business Media LLC, 1 New York Plaza, Suite 4600, New York, NY 10004. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail orders-ny@springer-sbm.com, or visit www.springeronline.com. Apress Media, LLC is a California LLC and the sole member (owner) is Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware corporation.

For information on translations, please e-mail booktranslations@springernature.com; for reprint, paperback, or audio rights, please e-mail bookpermissions@springernature.com.

Apress titles may be purchased in bulk for academic, corporate, or promotional use. eBook versions and licenses are also available for most titles. For more information, reference our Print and eBook Bulk Sales web page at http://www.apress.com/bulk-sales.

Any source code or other supplementary material referenced by the author in this book is available to readers on GitHub. For more detailed information, please visit http://www.apress.com/source-code.

Printed on acid-free paper

To Mom and Dad, who know a thing or four about anomalies.

Table of Contents

About the Author xv

About the Technical Reviewer xvii

Introduction xix

Part I: What Is an Anomaly? 1

Chapter 1: The Importance of Anomalies and Anomaly Detection 3

Defining Anomalies 3

Outlier 3

Noise vs Anomalies 4

Diagnosing an Example 5

What If We're Wrong? 7

Anomalies in the Wild 8

Finance 8

Medicine 11

Sports Analytics 11

Web Analytics 14

And Many More 15

Classes of Anomaly Detection 16

Statistical Anomaly Detection 16

Clustering Anomaly Detection 16

Model-Based Anomaly Detection 17

Building an Anomaly Detector 18

Key Goals 18

How Do Humans Handle Anomalies? 19

Known Unknowns 21

Conclusion 22

Chapter 2: Humans Are Pattern Matchers 23

A Primer on the Gestalt School 23

Key Findings of the Gestalt School 24

Emergence 24

Reification 25

Invariance 26

Multistability 27

Principles Implied in the Key Findings 28

Meaningfulness 28

Conciseness 29

Closure 30

Similarity 31

Good Continuation 32

Figure and Ground 34

Proximity 35

Connectedness 35

Common Region 35

Symmetry 36

Common Fate 37

Synchrony 38

Helping People Find Anomalies 39

Use Color As a Signal 39

Limit Nonmeaningful Information 40

Enable Connecting the Dots 40

Conclusion 41

Chapter 3: Formalizing Anomaly Detection 43

The Importance of Formalization 43

"I'll Know It When I See It" Isn't Enough 43

Human Fallibility 44

Marginal Outliers 44

The Limits of Visualization 45

The First Formal Tool: Univariate Analysis 46

Distributions and Histograms 46

The Normal Distribution 49

Mean, Variance, and Standard Deviation 51

Additional Distributions 54

Robustness and the Mean 58

The Susceptibility of Outliers 58

The Median and Robust Statistics 58

Beyond the Median: Calculating Percentiles 59

Control Charts 61

Conclusion 62

Part II: Building an Anomaly Detector 63

Chapter 4: Laying Out the Framework 65

Tools of the Trade 65

Choosing a Programming Language 65

Making Plumbing Choices 66

Reducing Architectural Variables 68

Developing an Initial Framework 69

Battlespace Preparation 69

Framing the API 70

Input and Output Signatures 72

Defining a Common Signature 73

Defining an Outlier 74

Sensitivity and Fraction of Anomalies 74

Single Solution 75

Combined Arms 75

Framing the Solution 76

Containerizing the Solution 79

Conclusion 80

Chapter 5: Building a Test Suite 81

Tools of the Trade 81

Unit Test Library 82

Integration Testing 82

Writing Testable Code 83

Keep Methods Separated 83

Emphasize Use Cases 84

Functional or Clean: Your Choice 84

Creating the Initial Tests 86

Unit Tests 86

Integration Tests 90

Conclusion 94

Chapter 6: Implementing the First Methods 95

A Motivating Example 95

Ensembling As a Technique 96

Sequential Ensembling 97

Independent Ensembling 98

Choosing Between Sequential and Independent Ensembling 99

Implementing the First Checks 99

Standard Deviations from the Mean 100

Median Absolute Deviations from the Median 101

Distance from the Interquartile Range 102

Completing the run_tests() Function 103

Building a Scoreboard 104

Weighting Results 105

Determining Outliers 106

Updating Tests 109

Updating Unit Tests 109

Updating Integration Tests 114

Conclusion 116

Chapter 7: Extending the Ensemble 117

Adding New Tests 117

Checking for Normality 118

Approaching Normality 123

A Framework for New Tests 126

Grubbs' Test for Outliers 128

Generalized ESD Test for Outliers 129

Dixon's Q Test 131

Calling the Tests 133

Updating Tests 135

Updating Unit Tests 135

Updating Integration Tests 140

Multi-peaked Data 141

A Hidden Assumption 141

The Solution: A Sneak Peek 143

Conclusion 144

Chapter 8: Visualize the Results 145

Building a Plan 145

What Do We Want to Show? 145

How Do We Want to Show It? 146
