Eric Goh Ming Hui
Singapore, Singapore
Any source code or other supplementary material referenced by the author in this book is available to readers on GitHub via the book's product page, located at www.apress.com/978-1-4842-4199-8. For more detailed information, please visit www.apress.com/source-code.
ISBN 978-1-4842-4199-8 e-ISBN 978-1-4842-4200-1
https://doi.org/10.1007/978-1-4842-4200-1
Library of Congress Control Number: 2018965216
© Eric Goh Ming Hui 2019
Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, logo, or image we use the names, logos, and images only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.
Distributed to the book trade worldwide by Springer Science+Business Media New York, 233 Spring Street, 6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail orders-ny@springer-sbm.com, or visit www.springeronline.com. Apress Media, LLC is a California LLC and the sole member (owner) is Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware corporation.
Introduction
Who is this book for?
This book is intended primarily for programmers and learners who want to learn R programming for statistics. It covers using R for descriptive statistics, inferential statistics, regression analysis, and data visualization.
How is this book structured?
The structure of the book is determined by the following two requirements:
Topic | Chapters |
---|---|
Introduction to R and R programming fundamentals | 1 to 3 |
Descriptive statistics, data visualizations, inferential statistics, and regression analysis | 4 to 6 |
Contacting the Author
More information about Eric Goh can be found at www.svbook.com. He can be reached at gohminghui88@svbook.com.
Acknowledgments
Let me begin by thanking Celestin Suresh John, the Acquisition Editor and Manager, for the LinkedIn message that triggered this project. Thanks to Amrita Stanley, project manager of this book, for her professionalism.
It took a team to make this book, and it is my great pleasure to acknowledge the hard and smart work of the Apress team. The following are a few names to mention: Matthew Moodie, the Development Editor; Divya Modi, the Coordinating Editor; Mary Behr for copy editing; Kannan Chakravarthy for proofreading; Irfanullah for indexing; eStudioCalamar and Freepik for image editing; Krishnan Sathyamurthy for managing the production process; and Parameswari Sitrambalam for composing. I am also thankful to Preeti Pandhu, the technical reviewer, for thoroughly reviewing this book and offering valuable feedback.
About the Author and About the Technical Reviewer
About the Author
Eric Goh Ming Hui is a data scientist, software engineer, adjunct faculty member, and entrepreneur with years of experience in multiple industries. His varied career includes data science, data and text mining, natural language processing, machine learning, intelligent system development, and engineering product design. Eric Goh has led teams in various industrial projects, including the advanced product code classification system project, which automates Singapore Customs' trade facilitation process, and Nanyang Technological University's data science projects, where he developed his own DSTK data science software. He has years of experience in C#, Java, C/C++, SPSS Statistics and Modeler, SAS Enterprise Miner, R, Python, Excel, Excel VBA, and more. He won the Tan Kah Kee Young Inventors' Merit Award and was a Shortlisted Entry for the TelR Data Mining Challenge. Eric Goh founded the SVBook website to offer affordable books, courses, and software in data science and programming.
He holds a Master of Technology degree from the National University of Singapore, an Executive MBA degree from U21Global (currently GlobalNxt) and IGNOU, a Graduate Diploma in Mechatronics from A*STAR SIMTech (a national research institute located in Nanyang Technological University), and a Coursera Specialization Certificate in Business Statistics and Analysis from Rice University. He earned a Bachelor of Science degree in Computing from the University of Portsmouth after completing National Service. He is also an AIIM Certified Business Process Management Master (BPMM), a GSTF Certified Big Data Science Analyst (CBDSA), and an IES Certified Lecturer.
About the Technical Reviewer
Preeti Pandhu has a Master of Science degree in Applied (Industrial) Statistics from the University of Pune. She is SAS certified as a base and advanced programmer for SAS 9 as well as a predictive modeler using SAS Enterprise Miner 7. Preeti has more than 18 years of experience in analytics and training. She started her career as a lecturer in statistics and began her journey into the corporate world with IDeaS (now a SAS company), where she managed a team of business analysts in the optimization and forecasting domain. She joined SAS as a corporate trainer before stepping back into the analytics domain to contribute to a solution-testing team and research/consulting team. She was with SAS for 9 years. Preeti is currently passionately building her analytics training firm, DataScienceLab (www.datasciencelab.in).
Eric Goh Ming Hui Learn R for Applied Statistics https://doi.org/10.1007/978-1-4842-4200-1_1
1. Introduction
In this book, you will use R for applied statistics, which can be used in the data understanding and modeling stages of the CRISP-DM (Cross-Industry Standard Process for Data Mining) model. Data mining is the process of extracting insights and knowledge from data. R was created for statistics and is widely used in academic and research fields. It has evolved over time, and many packages have been created for data mining, text mining, and data visualization tasks. R is very mature in the statistics field, so it is well suited to the data exploration, data understanding, and modeling stages of the CRISP-DM model.