Learn Data Analysis With Python In One Lesson
for beginners
Ashraf Awwad
To the shining memory of a buried hero
MAHMOUD AWWAD
My father
August 6, 2019
When I am at my best, I am my father's son.
Data Analysis in a nutshell
What is Data Analysis
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making. - Wikipedia
Data analysis is about obtaining raw data and converting it into useful information that answers specific questions. For example, you might review a monthly bank statement to understand where your expenses go. Similarly, when you consult your physician, he asks many questions to help diagnose the problem and recommend the right prescription. That is why scientists believe data is the foundation of any decision-making.
Raw data is data in any shape or volume that is not yet ready to answer our questions. Data is also collected and analyzed to test hypotheses or disprove theories. So we should be ready to receive data in any form and prepare it for analysis, so that we can get information or a decision out of it. This is what the data analyst does.
Why You Should Learn Data Analysis
Data analysis is an exciting and lucrative career. Gaining the knowledge and skills will unlock many other career directions and open the door to opportunities in fields like data science, engineering, and artificial intelligence.
Moreover, answering business questions and building a strong strategy based on data is a great advantage that helps businesses realize new market opportunities, make better decisions, and achieve sustainable growth.
Data Analyst Role
Data analysts take data and produce meaningful, actionable results. They do this by gathering large volumes of data, organizing it, and turning it into insights businesses can use to make better decisions. They play a role in making decisions more scientific and helping businesses operate more effectively.
For example, a data analyst might study the data collected from thousands of customer purchases, reviews, and surveys, clean it up and analyze it to produce reports and graphs to pinpoint ways to improve productivity and increase revenues.
Applications of Data Analysis
The following are some popular applications:
Healthcare: Helps predict diseases and plan future medical services.
Marketing: Understand customer feedback and predict success factors.
Logistics: Optimize routes and fares.
Finance: Fraud detection and risk mitigation.
Types of Data Analysis
Based on business and technology, the major data analysis methods are:
Descriptive: answers "What happened?" by summarizing past data.
Diagnostic: drills down to answer "Why did it happen?"
Predictive: uses previous data to answer "What is likely to happen?"
Prescriptive: answers "What actions should we take?" by combining the insights from all previous analyses to support the right decision.
Steps of Data Analysis
Depending on the problem, here are some common steps:
1. Define Goals: The type of data and analysis depend on the objectives.
2. Collect Data: Data is everywhere, and you'll want to bring it into one place.
3. Clean Data: Remove noise, redundant records, and duplicates.
4. Integrate Analysis Tools: Connect your tools so the data runs freely between them.
5. Analyze Data: Choose the analysis type according to the goals.
6. Visualize Data: Use BI tools to aggregate data and spot trends and patterns.
7. Draw Conclusions: Look into the data from every angle.
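As a rough illustration of the collect, clean, analyze, and conclude steps, here is a minimal sketch that uses only the Python standard library. The rows in raw_rows are hypothetical bank-statement entries invented for this example, and treating identical rows as duplicates is an assumption that may not suit real statements, where the same purchase can legitimately repeat.

```python
from collections import defaultdict

# Collect: hypothetical raw bank-statement rows as (category, amount).
raw_rows = [
    ("groceries", 54.20),
    ("rent", 900.00),
    ("groceries", 54.20),   # exact duplicate of the first row
    ("transport", None),    # missing amount
    ("transport", 35.00),
    ("groceries", 23.80),
]

# Clean: drop rows with a missing amount and exact duplicates.
seen = set()
clean_rows = []
for row in raw_rows:
    if row[1] is None or row in seen:
        continue
    seen.add(row)
    clean_rows.append(row)

# Analyze: total spending per category (a descriptive analysis).
totals = defaultdict(float)
for category, amount in clean_rows:
    totals[category] += amount

# Draw a conclusion: which category costs the most?
top_category = max(totals, key=totals.get)
print(dict(totals))
print("Largest expense:", top_category)
```

In later chapters the same steps would normally be done with pandas and Matplotlib; plain lists and dictionaries are used here only to keep the sketch free of dependencies.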
Why Python for Data Analysis
With its unique features, simplicity, and readability, Python is a leading choice for data analysis. It's a well-supported language with a broad array of helpful libraries, large communities, and plenty of learning materials. With rich visualization and analytics tools, many data mining companies around the globe use Python to reduce data.
Python Data Analysis Environment
Setting Up the Environment
All the code in this book runs on Python 3 using Jupyter Notebook and a few third-party packages, including NumPy, pandas, and Matplotlib. If you don't want to install anything, you can also use Python in the cloud and access it through a web browser.
To install the packages, I recommend the free Anaconda Python Distribution, which includes everything we will need. Go to anaconda.com and download the Anaconda Installer.
Scroll down the page and download the file suitable for your platform. Start the setup by double-clicking the downloaded file, then move through the installation wizard, accept the license, and choose the installation folder.
Once it is installed, we can try out our new Python installation by opening the Anaconda Prompt and typing python. This gets you into the standard Python shell, where you can write and execute code interactively. We can also verify that all the packages we need are already installed by attempting to import NumPy, pandas, and Matplotlib.
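In the Python shell, the simplest check is to type import numpy, import pandas, and import matplotlib one at a time and confirm that no error appears. The sketch below is an alternative that reports on all three at once without stopping at the first missing package. It uses the standard library's importlib.util.find_spec; the required list and the status dictionary are names chosen for this illustration.

```python
import importlib.util

# Import names of the packages this book relies on.
required = ["numpy", "pandas", "matplotlib"]

# find_spec returns None when a top-level package cannot be found,
# so this check does not raise an error for a missing package.
status = {name: importlib.util.find_spec(name) is not None for name in required}

for name, found in status.items():
    print(f"{name}: {'installed' if found else 'MISSING'}")
```

If any package is reported as missing, reinstalling Anaconda or installing the package from the Anaconda Prompt should resolve it.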
To confirm that we have Jupyter Notebook, we can launch Anaconda Navigator and click on Jupyter Notebook, or we can launch it by typing jupyter notebook in the console.
Before launching the notebook, change to the directory where you want to save and import your files.
We can open a new notebook by clicking New and choosing Python 3.
Python Data Structures
This book assumes you already have basic Python knowledge. If not, you can refer to my book Learn Python in One Day.
Python has four built-in data structures: list, dictionary, tuple, and set. They allow us to organize, store, and manage data so that we can perform operations on it.
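As a quick orientation, the following sketch creates one of each structure. The variable names and values are hypothetical and chosen only for illustration.

```python
# Hypothetical sample values, one per built-in data structure.
prices = [9.99, 14.50, 3.25]          # list: ordered and mutable
point = (3, 4)                        # tuple: ordered and immutable
stock = {"apples": 12, "pears": 7}    # dictionary: maps keys to values
tags = {"fruit", "fresh", "fruit"}    # set: keeps unique items only

print(type(prices).__name__, type(point).__name__,
      type(stock).__name__, type(tags).__name__)
print(len(tags))  # 2, because the repeated "fruit" is stored once
```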
Sequences: Lists and tuples
Lists are containers that hold objects such as numbers and strings. A list is written as comma-separated elements surrounded by square brackets, and its elements are accessed by numerical index.
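A minimal sketch of creating lists follows. The names and values are made up for this example.

```python
# Lists are written with square brackets and commas.
numbers = [10, 20, 30]
names = ["Ann", "Bob"]
mixed = [1, "two", 3.0]   # a list may hold objects of different types
empty = []                # an empty list

print(numbers)
print(mixed)
print(empty)
```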
We can obtain the length of a list with len. Individual list elements can be accessed by index, starting with 0 for the first element and ending at the length of the list minus one.
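For instance, with a hypothetical list of three fruit names:

```python
fruits = ["apple", "banana", "cherry"]  # hypothetical sample data

count = len(fruits)        # 3
first = fruits[0]          # "apple", index 0 is the first element
last = fruits[count - 1]   # "cherry", the last valid index is length minus one

print(count, first, last)
```

Using an index equal to or greater than the length, such as fruits[3] here, raises an IndexError.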
We can access the last element using the index length minus one, or simply use -1. Negative indices start from the end of the list and count backward.
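A short sketch of both approaches, using a hypothetical list of letters:

```python
letters = ["a", "b", "c", "d"]  # hypothetical sample data

last_by_length = letters[len(letters) - 1]   # "d"
last_by_negative = letters[-1]               # "d", same element
second_to_last = letters[-2]                 # "c"

print(last_by_length, last_by_negative, second_to_last)
```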
We can get a range of values by slicing. The first index is included; the last is not.
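The following sketch shows a few common slices on a hypothetical list:

```python
letters = ["a", "b", "c", "d", "e"]  # hypothetical sample data

middle = letters[1:4]      # ['b', 'c', 'd'], index 1 included, index 4 excluded
first_two = letters[:2]    # ['a', 'b'], an omitted start means "from the beginning"
from_third = letters[2:]   # ['c', 'd', 'e'], an omitted end means "to the end"

print(middle, first_two, from_third)
```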
To add a single element to the end of a list, we use append. To add multiple elements in one go, we use extend.
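For example, with a hypothetical list of scores:

```python
scores = [70, 85]  # hypothetical sample data

scores.append(90)          # adds one element: [70, 85, 90]
scores.extend([60, 75])    # adds each element: [70, 85, 90, 60, 75]

print(scores)
```

Note that append always adds exactly one item, so scores.append([60, 75]) would add the inner list itself as a single element rather than its two numbers.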
To concatenate two lists, we use the plus operator (+), which returns a new list, or we use extend to append the second list to the original in place.
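A small sketch contrasting the two options, with hypothetical lists a and b:

```python
a = [1, 2]  # hypothetical sample data
b = [3, 4]

combined = a + b   # builds a new list; a and b are unchanged
print(combined)    # [1, 2, 3, 4]
print(a)           # [1, 2]

a.extend(b)        # modifies a in place
print(a)           # [1, 2, 3, 4]
```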