Book Description
Are you curious about data mining and analytics? Have you ever wondered why the world is filled with so much data and what it is used for? Do you wonder how companies use data to improve their revenue and steer themselves toward success? Would you like to know why data is called the new currency and why companies invest so much to extract, store, and process it? Do you want to know how businesses leverage data to make millions of dollars in revenue? All of this is achieved through data mining and analytics. If you are curious to know the answers to these questions, this is the right book for you.
This book will break down the terms "data" and "mining" for you so you can understand the concepts individually and then as a whole. If you have been looking to launch yourself into the world of data mining and data analytics, this book will serve as the perfect launchpad. It introduces the concepts of data mining and analytics one step at a time. It doesn't matter if you are a beginner in the field of data or a veteran; you will have something important to take away from this book.
The tools and techniques described will teach you how data mining is used by organizations to steer themselves to success. This book will take you through:
An overview of data mining and the need for data mining today
Comparisons between data mining and data science
The tasks and issues in data mining
The terminologies used in data mining
The data mining query language
Classification, prediction, and cluster analysis in data mining
This book has been tailored to help you understand data mining and analytics. There are step-by-step guides with code samples so that you can understand various data mining techniques. If you're looking for the ultimate guide to mining, analytics, and metrics, then grab your copy today!
Data Mining and Analytics
Ultimate Guide to the Basics of Data Mining, Analytics and Metrics
Copyright 2020 - All rights reserved by Alex Campbell
The contents of this book may not be reproduced, duplicated or transmitted without direct written permission from the author.
Under no circumstances will any legal responsibility or blame be held against the publisher for any reparation, damages, or monetary loss due to the information herein, either directly or indirectly.
Legal Notice:
This book is copyright protected. This is only for personal use. You cannot amend, distribute, sell, use, quote or paraphrase any part of the content within this book without the consent of the author.
Disclaimer Notice:
Please note the information contained within this document is for educational and entertainment purposes only. Every attempt has been made to provide accurate, up-to-date, reliable, and complete information. No warranties of any kind are expressed or implied. Readers acknowledge that the author is not engaging in the rendering of legal, financial, medical or professional advice. The content of this book has been derived from various sources. Please consult a licensed professional before attempting any techniques outlined in this book.
By reading this document, the reader agrees that under no circumstances is the author responsible for any losses, direct or indirect, which are incurred as a result of the use of information contained within this document, including, but not limited to, errors, omissions, or inaccuracies.
Introduction
We can understand data mining better if we split it into two words: Data and Mining. Understanding these two words individually lays the foundation for understanding Data Mining as a whole.
Information that is formatted and structured in a particular way is known as data. Today, we mostly associate the term data with computing. The term program is closely related: a program is a set of instructions that processes data. Data comes in many forms, such as images, text, and numbers, and can be stored on paper or on digital media. However, in the 21st century, data mostly refers to information stored and transmitted digitally.
When we talk about mining in general, it refers to the extraction of materials that are present deep inside the earth. Examples of mining are coal mining, gold mining, diamond mining, etc.
If we now merge both these terms, Data Mining in the field of computer science is the extraction of useful information from raw data sources, for the benefit of a business or otherwise. Do not take the comparison with physical mining too literally, as it can be misleading. When miners extract gold or diamonds from the earth, the result is gold and diamonds. However, the result of data mining is not data. The objective of data mining is to extract information from raw data and recognize patterns that give us insights into a data set from a particular domain. This is why data mining is often referred to as Knowledge Extraction or Knowledge Discovery.
Gregory Piatetsky-Shapiro became the first person to associate the phrase Knowledge Discovery with Data Mining in 1989. As the years passed, the term Data Mining gained popularity. Today, the terms data mining and knowledge discovery are used interchangeably.
Whenever a process today has to deal with huge sets of data, data mining is usually the first approach considered. For example, Netflix looks at the data it has on movies you have already watched and uses it to suggest movies you are likely to enjoy. Websites like Amazon look at your purchase and spending patterns and show you similar products in the price range you're comfortable with.
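To make the recommendation idea concrete, here is a minimal sketch in Python. It is only a toy illustration and not how Netflix or Amazon actually build their systems. The viewing histories, the user names, and the recommend function are all invented for this example. The idea is simply that titles watched by people whose history overlaps with yours get a higher score.

```python
from collections import Counter

# Hypothetical viewing histories (invented for this example): user -> titles watched.
HISTORIES = {
    "ana": {"Alien", "Blade Runner", "Arrival"},
    "ben": {"Alien", "Blade Runner", "Heat"},
    "cy": {"Heat", "Casino"},
}


def recommend(user, histories, top_n=2):
    """Suggest unseen titles, weighted by how much each other user's
    history overlaps with this user's history."""
    seen = histories[user]
    scores = Counter()
    for other, titles in histories.items():
        if other == user:
            continue
        overlap = len(seen & titles)  # number of titles both users watched
        if overlap == 0:
            continue  # nothing in common, so this user tells us nothing
        for title in titles - seen:
            scores[title] += overlap
    # Highest score first; ties are broken alphabetically so results are stable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [title for title, _ in ranked[:top_n]]


print(recommend("ben", HISTORIES))
```

Real recommendation systems work with millions of users and far richer signals, but the principle is the same: patterns in past behavior are used to predict what someone may want next.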
Chapter One: Overview of Data Mining
Purpose of Data Mining
On its own, raw data can be confusing and of little use. However, when information is extracted from raw data and organized and structured properly, it can reveal patterns that would otherwise stay hidden. When a business understands the historical patterns in a data set, it can use this information to predict future trends and behavior. Ultimately, this helps a business improve its decision-making process.
Technically, data mining uses computing power to analyze data from all available sources, angles, perspectives, and dimensions, and then classifies it so that it makes sense. Data mining can be applied to many kinds of data stores, such as data warehouses, transactional databases, relational databases, multimedia databases, and even the World Wide Web.
In short, data mining helps to classify data so that businesses can learn about the trends and patterns in a data set and benefit from them. Data mining has countless applications, including risk management, fraud detection, spam mail filtering, and marketing. It can even be used to understand the sentiments of end customers.
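As a small illustration of turning raw records into a pattern and then into a prediction, consider the sketch below. The sales figures and the function names are made up for this example, and a straight-line trend is only one of many possible models. The code first organizes raw transactions into monthly totals, then fits a least-squares line through those totals and extends it one month ahead.

```python
from collections import defaultdict

# Hypothetical raw transactions (invented): (month, amount) in no useful order.
RAW_SALES = [
    ("2020-03", 70.0), ("2020-01", 40.0), ("2020-02", 110.0),
    ("2020-01", 60.0), ("2020-04", 130.0), ("2020-03", 50.0),
]


def monthly_totals(records):
    """Organize raw records into one total per month, in calendar order."""
    totals = defaultdict(float)
    for month, amount in records:
        totals[month] += amount
    # "YYYY-MM" strings sort chronologically.
    return [totals[month] for month in sorted(totals)]


def fit_trend(values):
    """Least-squares line through the points (0, values[0]), (1, values[1]), ...
    Needs at least two values."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(values, steps_ahead=1):
    """Extend the fitted line past the last observed month."""
    slope, intercept = fit_trend(values)
    return intercept + slope * (len(values) - 1 + steps_ahead)


totals = monthly_totals(RAW_SALES)
print(totals)            # monthly totals hidden in the raw records
print(forecast(totals))  # estimate for the following month
```

The raw list by itself says very little. Once it is grouped by month, the steady rise becomes visible, and that historical pattern is what the forecast relies on.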
Steps Involved in Data Mining
Let us quickly go through the various steps that are part of the data mining process.
Extract raw data, convert it, and store it in a data warehouse.
Transfer data from the data warehouse to various databases so that it can be managed efficiently.
Provide business analysts with access to the data via dashboards.
Use data visualization techniques to represent huge data sets so that senior leaders and stakeholders can understand the data at a glance.
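The four steps above can be sketched end to end in a few lines of Python. This is a simplified illustration under several assumptions: an in-memory SQLite database stands in for the data warehouse, a summary table in the same database stands in for the separate databases of step two, and the raw rows, table names, and function names are all invented for this example.

```python
import sqlite3

# Hypothetical raw export (invented): region and amount as inconsistently formatted text.
RAW_ROWS = [
    ("north", " 120.50"),
    ("South", "80"),
    ("NORTH", "30.25"),
    ("south", "not available"),  # malformed; dropped during conversion
]


def convert(rows):
    """Step 1 (convert): normalize region names and parse amounts,
    dropping rows whose amount cannot be parsed."""
    clean = []
    for region, amount in rows:
        try:
            clean.append((region.strip().lower(), float(amount)))
        except ValueError:
            continue
    return clean


def build_warehouse(rows):
    """Step 1 (store): load the converted rows into a warehouse table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE warehouse_sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO warehouse_sales VALUES (?, ?)", convert(rows))
    return conn


def build_summary(conn):
    """Step 2: copy an aggregated view of the warehouse into a smaller table
    that is easier to manage and query."""
    conn.execute(
        "CREATE TABLE sales_by_region AS "
        "SELECT region, SUM(amount) AS total FROM warehouse_sales GROUP BY region"
    )


def sales_report(conn):
    """Step 3: the kind of query an analyst's dashboard might run."""
    query = "SELECT region, total FROM sales_by_region ORDER BY region"
    return conn.execute(query).fetchall()


def text_chart(report, unit=10):
    """Step 4: a crude text bar chart, one '#' per `unit` of sales."""
    return [f"{region:<6}{'#' * int(total // unit)} {total:.2f}"
            for region, total in report]


conn = build_warehouse(RAW_ROWS)
build_summary(conn)
report = sales_report(conn)
for line in text_chart(report):
    print(line)
```

In practice each step is handled by dedicated tools, but the flow is the same: clean the raw data, store it, reshape it for querying, and present it in a form that can be read quickly.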
Let us go through the four steps mentioned above in detail.