Big Data For Beginners
Understanding SMART Big Data, Data Mining & Data Analytics For Improved Business Performance, Life Decisions & More!
Copyright 2016 by Vince Reynolds - All rights reserved.
This document is geared towards providing exact and reliable information in regard to the topic and issue covered. The publication is sold with the idea that the publisher is not required to render accounting, legal, or other qualified professional services. If legal or professional advice is necessary, a practiced individual in the profession should be consulted.
- From a Declaration of Principles which was accepted and approved equally by a Committee of the American Bar Association and a Committee of Publishers and Associations.
In no way is it legal to reproduce, duplicate, or transmit any part of this document by electronic means or in printed format. Recording of this publication is strictly prohibited, and any storage of this document is not allowed without written permission from the publisher. All rights reserved.
The information provided herein is stated to be truthful and consistent. Any liability, in terms of inattention or otherwise, arising from any use or abuse of any policies, processes, or directions contained within is the sole responsibility of the recipient reader. Under no circumstances will any legal responsibility or blame be held against the publisher for any reparation, damages, or monetary loss due to the information herein, either directly or indirectly.
Respective authors own all copyrights not held by the publisher.
The information herein is offered for informational purposes only and is general in nature. The information is presented without contract or guarantee of any kind.
The trademarks in this book are used without consent, and their publication is without permission or backing by the trademark owners. All trademarks and brands within this book are for clarifying purposes only and are owned by their respective owners, who are not affiliated with this document.
Introduction
If you are in the world of IT or business, you have probably heard about the Big Data phenomenon. You might have even encountered professionals who introduced themselves as data scientists. So you may be wondering: just what is this emerging area of science? What types of knowledge and problem-solving skills do data scientists have? What types of problems do data scientists solve with Big Data technology?
After reading this book, you will have the answers to these questions. In addition, you will begin to become familiar with important industry terms, applications, and tools, which will prepare you for a deeper understanding of the other important areas of Big Data.
Every day, our society creates about 3 quintillion bytes of data. You are probably wondering what 3 quintillion is. Well, it is 3 followed by 18 zeros. And that, folks, is generated EVERY DAY. With this massive stream of data, the need to make sense of it becomes more crucial, and the demand for people who understand Big Data is increasing quickly. Business owners, large or small, must have a basic knowledge of big data.
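To put the figure above into more familiar units, here is a minimal sketch of the arithmetic in Python. The daily total is simply the estimate quoted in the text, and the helper name humanize is made up for this illustration; the units are the decimal ones, each 1,000 times larger than the one before.

```python
# Illustrative arithmetic only: the daily total is the estimate quoted in the text.
BYTES_PER_DAY = 3 * 10**18  # 3 quintillion = 3 followed by 18 zeros

# Decimal storage units, each 1,000 times larger than the previous one.
UNITS = ["bytes", "kilobytes", "megabytes", "gigabytes",
         "terabytes", "petabytes", "exabytes"]


def humanize(num_bytes):
    """Return (value, unit) using the largest unit that keeps value >= 1.

    Hypothetical helper written for this example, not part of any library.
    """
    value = num_bytes
    unit_index = 0
    # Keep dividing by 1,000 until the value is small or we run out of units.
    while value >= 1000 and unit_index < len(UNITS) - 1:
        value /= 1000
        unit_index += 1
    return value, UNITS[unit_index]


value, unit = humanize(BYTES_PER_DAY)
print(f"{BYTES_PER_DAY:,} bytes per day is about {value:g} {unit}")

# A day has 86,400 seconds; express the per-second rate in terabytes.
SECONDS_PER_DAY = 24 * 60 * 60
terabytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / 10**12
print(f"That is roughly {terabytes_per_second:.1f} terabytes every second")
```

In other words, 3 quintillion bytes is 3 exabytes, or about 3 billion gigabytes, arriving every single day.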
Chapter 1. A Conundrum Called Big Data
Big data is one of the latest technology trends that are profoundly affecting the way organizations use information to enhance the customer experience, improve their products and services, create new sources of revenue, transform business models, and even manage health care services more efficiently. What makes it such a popular topic is the fact that the effective use of big data almost always ends in dramatic results. The irony, though, is that nobody really knows what big data actually means.
There is no doubt that big data is more than just a trending IT buzzword. Rather, it is a fast-evolving concept in information technology and data management that is revolutionizing the way companies conduct their businesses. The sad part is that it is also turning out to be a classic conundrum, because no one, not even a group of the best IT experts or computer geeks, can come up with a definitive explanation of exactly what it is. They always fall short of a description of big data that is acceptable to all. At best, what most of these computer experts can offer are roundabout explanations and scattered examples. Try asking several IT experts what big data is and you will get as many different answers as the number of people you ask.
What makes it even more complicated and difficult to understand is the fact that what is deemed big now may not be that big in the near future, due to rapid advances in software technology and in the data management systems designed to handle it.
We also cannot escape the fact that we now live in a digital universe where everything and anything we do leaves a digital trace we call data. At the center of this digital universe is the World Wide Web, from which comes a deluge of data that floods our consciousness every single second. With well over one trillion web pages (50 billion of which have already been indexed by and are searchable through the major search engines), the web offers us unparalleled interconnectivity, which allows us to interact with anyone and anything within a connected network we happen to be part of. Each one of these interactions also generates data that is routed through and recorded on the web, adding to the fuzziness of an already fuzzy concept. As a consequence, the web is continuously overflowing with data so massive that it is almost impossible to digest or crunch into usable segments for practical applications, if it is of any use at all. This enormous, ever-growing body of data that passes through and is stored on the web, together with the developing technologies designed to handle it, is what is collectively referred to as big data.
So, What Does Big Data Look Like?
If you want to have an idea of what big data really looks like or how massive it truly is, try to visualize the following statistics, if you can, without getting dizzy. Think of the web, which currently covers more than 100 million domains and is still growing at the rate of 20,000 new domains every single day.
The data that comes from these domains is so massive and mind-boggling that it is practically immeasurable, much less manageable, by any conventional data management and retrieval methods available today. And that is only for starters. Add to this the 300 million daily Facebook posts, 60 million daily Facebook updates, and 250 million daily tweets coming from more than 900 million combined Facebook and Twitter users, and your imagination is sure to go through the roof. Don't forget to include the voluminous data coming from the more than six billion smart phones currently in use, which continually access the internet to do business online, post status updates on social media, send out tweets, and carry out many other digital transactions. Remember, approximately one billion of these smart phones are GPS enabled, which means they are constantly connected to the internet and therefore continuously leave behind digital trails, adding more data to the burgeoning bulk of information already stored in the millions of servers that span the internet.