Tshepo Chris Nokeri - Data Science Revealed: With Feature Engineering, Data Visualization, Pipeline Development, and Hyperparameter Tuning


Book:
Data Science Revealed: With Feature Engineering, Data Visualization, Pipeline Development, and Hyperparameter Tuning
Author:
Tshepo Chris Nokeri
Publisher:
Apress
Genre:
Books / Computer
Year:
2021
Rating:
4 / 5

Data Science Revealed: With Feature Engineering, Data Visualization, Pipeline Development, and Hyperparameter Tuning: summary, description and annotation

Get insight into data science techniques such as data engineering and visualization, statistical modeling, machine learning, and deep learning. This book teaches you how to select variables, optimize hyperparameters, develop pipelines, and train, test, and validate machine learning and deep learning models. Each chapter includes a set of examples that help you understand the concepts, assumptions, and procedures behind each model.

The book covers parametric methods or linear models that combat under- or over-fitting using techniques such as Lasso and Ridge. It includes complex regression analysis with time series smoothing, decomposition, and forecasting. It takes a fresh look at non-parametric models for binary classification (logistic regression analysis) and ensemble methods such as decision trees, support vector machines, and naive Bayes. It covers the most popular non-parametric method for time-event data (the Kaplan-Meier estimator). It also covers ways of solving classification problems using artificial neural networks such as restricted Boltzmann machines, multi-layer perceptrons, and deep belief networks. The book discusses unsupervised learning clustering techniques such as K-means, agglomerative clustering, and DBSCAN, and dimension reduction techniques such as Feature Importance, Principal Component Analysis, and Linear Discriminant Analysis. Finally, it introduces driverless artificial intelligence using H2O.

After reading this book, you will be able to develop, test, validate, and optimize statistical machine learning and deep learning models, and engineer, visualize, and interpret sets of data.

What You Will Learn

Design, develop, train, and validate machine learning and deep learning models
Find optimal hyperparameters for superior model performance
Improve model performance using techniques such as dimension reduction and regularization
Extract meaningful insights for decision making using data visualization

Who This Book Is For
Beginning and intermediate level data scientists and machine learning engineers

Tshepo Chris Nokeri

Data Science Revealed

With Feature Engineering, Data Visualization, Pipeline Development, and Hyperparameter Tuning

1st ed.

Tshepo Chris Nokeri

Pretoria, South Africa

Any source code or other supplementary material referenced by the author in this book is available to readers on GitHub via the book's product page, located at www.apress.com/978-1-4842-6869-8. For more detailed information, please visit www.apress.com/source-code.

ISBN 978-1-4842-6869-8 e-ISBN 978-1-4842-6870-4

https://doi.org/10.1007/978-1-4842-6870-4

© Tshepo Chris Nokeri 2021

Apress Standard

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Distributed to the book trade worldwide by Springer Science+Business Media New York, 1 New York Plaza, Suite 4600, New York, NY 10004-1562, USA. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail orders-ny@springer-sbm.com, or visit www.springeronline.com. Apress Media, LLC is a California LLC and the sole member (owner) is Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware corporation.

I dedicate this book to my family and everyone who merrily played influential roles in my life.

Introduction

Welcome to Data Science Revealed. This book is your guide to solving practical and real-world problems using data science procedures. It gives insight into data science techniques, such as data engineering and visualization, statistical modeling, machine learning, and deep learning. It has a rich set of examples on how to select variables, optimize hyperparameters, develop pipelines, and train, test and validate machine and deep learning models. Each chapter contains a set of examples allowing you to understand the concepts, assumptions, and procedures behind each model.

First, it covers the parametric method, or linear model, and the means for combating underfitting or overfitting using regularization techniques such as lasso and ridge. Next, it concludes the treatment of regression by presenting time-series smoothing, decomposition, and forecasting. Then, it takes a fresh look at a nonparametric model for binary classification, known as logistic regression, and ensemble methods such as decision tree, support vector machine, and naive Bayes. Next, it covers the most popular nonparametric method for time-event data, known as the Kaplan-Meier estimator. It also covers ways of solving a classification problem using artificial neural networks, like the restricted Boltzmann machine, multilayer perceptron, and deep belief network. Then, it summarizes unsupervised learning by uncovering clustering techniques, such as K-means, agglomerative clustering, and DBSCAN, and dimension reduction techniques such as feature importance, principal component analysis, and linear discriminant analysis. In addition, it introduces driverless artificial intelligence using H2O.

The examples were prepared with Anaconda (an open source distribution of Python). The following are some of the libraries covered in this book:

Pandas for data structures and tools
Statsmodels for basic statistical computation and modeling
Scikit-learn for building and validating key machine learning algorithms
Prophet for time-series analysis
Keras as a high-level framework for deep learning
H2O for driverless machine learning
Lifelines for survival analysis
NumPy for arrays and matrices
SciPy for integrals, solving differential equations and optimization
Matplotlib and Seaborn for popular plots and graphs

This book targets beginner to intermediate data scientists and machine learning engineers who want to learn the full data science process. Before exploring the contents of this book, ensure that you understand the basics of statistics, Python programming, and probability theory. Also, you'll need to install the packages mentioned in the previous list in your environment.
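One way to confirm that the listed packages are present in your environment is sketched below. It is a minimal illustration, not something taken from the book: the import names in the dictionary are assumptions (the list gives only display names, and some packages, such as Prophet, have been distributed under more than one import name), and the helper `missing_packages` is a hypothetical name introduced here.

```python
from importlib.util import find_spec

# Display name -> import name. The import names are assumptions; adjust them
# to match the package versions installed in your environment.
PACKAGES = {
    "Pandas": "pandas",
    "Statsmodels": "statsmodels",
    "Scikit-learn": "sklearn",
    "Prophet": "prophet",
    "Keras": "keras",
    "H2O": "h2o",
    "Lifelines": "lifelines",
    "NumPy": "numpy",
    "SciPy": "scipy",
    "Matplotlib": "matplotlib",
    "Seaborn": "seaborn",
}


def missing_packages(packages=PACKAGES):
    """Return the display names whose import name cannot be located."""
    # find_spec returns None when a top-level module is not installed;
    # it locates the module without importing it.
    return [name for name, module in packages.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

Anything reported as not found can then be installed with the package manager of your choice before working through the examples.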

Acknowledgments

This is my first book, which makes it significant to me. Writing a single-authored book is demanding, but I received firm support and active encouragement from my family and dear friends. Many heartfelt thanks to Professor Chris William Callaghan and Mrs. Renette Krommenhoek from the University of the Witwatersrand. They gallantly helped spark my considerable interest in combating practical and real-world problems using advanced analytics. I would not have completed this book without the valuable help of the dedicated publishing team at Apress, which comprises Aditee Mirashi and Celestin Suresh John. They trusted and guided me throughout the writing and editing process. Last, humble thanks to all of you reading this; I earnestly hope you find it helpful.

About the Author

Tshepo Chris Nokeri harnesses advanced analytics and artificial intelligence to foster innovation and optimize business performance. In his functional work, he has delivered complex solutions to companies in the mining, petroleum, and manufacturing industries. He initially completed a bachelor's degree in information management. Afterward, he graduated with an honors degree in business science at the University of the Witwatersrand on a TATA Prestigious Scholarship and a Wits Postgraduate Merit Award. He was also unanimously awarded the Oxford University Press Prize.

About the Technical Reviewer

Manohar Swamynathan is a data science practitioner and an avid programmer, with more than 14 years of experience in various data science-related areas that include data warehousing, business intelligence (BI), analytical tool development, ad hoc analysis, predictive modeling, data science product development, consulting, and formulating strategy and executing analytics programs. He has had a career covering the life cycle of data across different domains such as US mortgage banking, retail/e-commerce, insurance, and industrial IoT. He has a bachelor's degree with a specialization in physics, mathematics, and computers, and a master's degree in project management. He's currently living in Bengaluru, the Silicon Valley of India.

Similar books «Data Science Revealed: With Feature Engineering, Data Visualization, Pipeline Development, and Hyperparameter Tuning»

Look at similar books to Data Science Revealed: With Feature Engineering, Data Visualization, Pipeline Development, and Hyperparameter Tuning. We have selected literature similar in name and meaning in the hope of providing readers with more options to find new, interesting, not yet read works.

Curtis Miller

Training Systems Using Python Statistical Modeling: Explore popular techniques for modeling your data in Python

Benjamin Johnston

Applied Supervised Learning with Python: Use scikit-learn to build predictive models from real-world datasets and prepare yourself for the future of machine learning

Tshepo Chris Nokeri

Econometrics and Data Science: Apply Data Science Techniques to Model Complex Problems and Implement Solutions for Economic Problems

Tshepo Chris Nokeri

Implementing Machine Learning for Finance

Tshepo Chris Nokeri

Data Science Solutions with Python: Fast and Scalable Models Using Keras, PySpark MLlib, H2O, XGBoost, and Scikit-Learn

Miller

Statistics for data science: leverage the power of statistics for data analysis, classification, regression, machine learning, and neural networks

Forte

Mastering predictive analytics with R: master the craft of predictive modeling by developing strategy, intuition, and a solid foundation in essential concepts

Lantz

Machine learning with R: expert techniques for predictive modeling

Albon

Machine learning with Python cookbook: practical solutions from preprocessing to deep learning

Md. Rezaul Karim

Machine Learning with Scala Quick Start Guide: Leverage popular machine learning algorithms and techniques and implement them in Scala

Matt Wiley

Advanced R Statistical Programming and Data Models: Analysis, Machine Learning, and Visualization

Pratap Dangeti

Statistics for Machine Learning: Techniques for exploring supervised, unsupervised, and reinforcement learning models with Python and R
