Python
Data Science
The Ultimate Handbook for Beginners on How to Explore NumPy for Numerical Data, Pandas for Data Analysis, IPython, Scikit-Learn and Tensorflow for Machine Learning and Business
Steve Blair
Copyright
Copyright 2019 Steve Blair. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, without the prior written permission of the publisher.
Disclaimer
The information contained in this eBook is offered for informational purposes solely, and it is geared towards providing exact and reliable information in regards to the topic and issue covered. Also, this eBook provides information only up to the publishing date.
The author and the publisher do not warrant that the information contained in this e-book is fully complete and shall not be responsible for any errors or omissions. The author and publisher shall have neither liability nor responsibility to any person or entity concerning any reparation, damages, or monetary loss caused or alleged to be caused directly or indirectly by this e-book. Therefore, this eBook should be used as a guide - not as the ultimate source.
The publication is sold with the idea that the publisher is not required to render accounting, officially permitted or otherwise qualified services. If advice is necessary, legal or professional, a practiced individual in the profession should be contacted.
In no way is it legal to reproduce, duplicate, or transmit any part of this document in either electronic means or printed format. Recording of this publication is strictly prohibited, and any storage of this document is not allowed unless with written permission from the publisher. All rights reserved.
The author owns all copyrights not held by the publisher. The trademarks that are used are without any consent, and the publication of the trademark is without permission or backing by the trademark owner. All trademarks and brands within this book are for clarifying purposes only and are not affiliated with this document.
Introduction
Welcome and thank you for purchasing this special guide on Python Data Science.
You have, no doubt, already experienced data science in one way or another. You interact with data science products every time you search for information on the web with a search engine such as Google, or ask for directions on your mobile phone. Data science has been the force behind resolving some of our most common daily tasks for several years.
Data science is the science and technology focused on collecting raw data and processing it in an effective manner. It is the combination of concepts and methods that makes it possible to give meaning to huge volumes of data and make them understandable.
In nearly all of our daily work, we directly or indirectly store and exchange data. With the rapid development of technology, the need to store data effectively is also increasing. That is why data needs to be handled properly. Basically, data science unearths the hidden insights in raw data and uses them for productive output.
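Most of the scientific methods that power data science are not new. They have been out there for a long time, just waiting for applications to be developed. Statistics is an old science that stands on the shoulders of eighteenth-century giants such as Pierre Simon Laplace (1749-1827) and Thomas Bayes (1701-1761). Machine Learning is younger, but it has already moved beyond its infancy and can be considered a well-established discipline. Computer science changed our lives several decades ago, and continues to do so; but it cannot be considered new.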
Now that we understand the importance of data science, the question that arises is:
'How should it be done?'
The answer lies in doing data science with the Python programming language.
Python is among the most popular languages at this time, and it is overtaking Java in the data science market. Python is an object-oriented programming language, and it has features which make it more user friendly for programming. For example, when using Python, we don't need to declare data types explicitly, and there is no need to learn difficult syntax; we can simply write the code. It also offers more built-in functionality than many other programming languages.
Python is a programming language that works for everything from data mining to building websites. It's easy to see that Python has great value and utility in the data science market. Anyone who is seeking a future in the data science industry should learn Python.
Python Data Science teaches a complete course of data science, including key topics like data integration, data mining, Python, and more. We will explore NumPy for numerical data, Pandas for data analysis, IPython, Scikit-Learn and TensorFlow for machine learning and business.
Let's get started!
Understanding Data Science
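First, we will begin by discussing some of the tools that data scientists use. The toolbox of any data scientist, as for any kind of programmer, is an essential ingredient for success and enhanced performance. Choosing the right tools can save a lot of time, allowing us to focus on data analysis.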
The most basic tool to decide on is which programming language we will use. Many people use only one programming language in their entire life, which is usually the first and only one they learn. Many see learning a new language as an enormous task that, if possible, should be undertaken only once. The problem is that some languages are intended for developing high-performance or production code, such as C, C++, or Java, while others are more focused on prototyping code. Among these, the best known are the so-called scripting languages: Ruby, Perl, and Python. Depending on the first language you learned, certain tasks may seem rather tedious at first. Remember, however, that even tedious tasks must be done properly, if success is to follow.
The primary problem of being stuck with a single language is that many basic tools simply will not be available in it, and eventually you will have to either reimplement them or create a bridge so you can use some other language for a specific task. You either have to be ready to switch to the best language for each task and then somehow glue the results together, or choose a very flexible language with a rich ecosystem (e.g., third-party open-source libraries). For this book, we have selected Python as the programming language, as it offers a great degree of flexibility for the data science programmer.
Why Python?
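Python is a mature programming language, but it also has excellent properties for newbie programmers, making it ideal for people who have never programmed before. Some of the most remarkable of those properties are easy-to-read code, suppression of non-mandatory delimiters, dynamic typing, and dynamic memory usage. Python is an interpreted language, so the code is executed immediately in the Python console without needing the compilation step to machine language. Besides the Python console (which comes included with any Python installation), you can find other interactive consoles, such as IPython, which give you a richer environment in which to execute your Python code.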
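To make two of these properties concrete, here is a minimal sketch of dynamic typing and of code blocks marked by indentation rather than by mandatory delimiters. The names `value` and `describe` are invented for this illustration only.

```python
# Dynamic typing: a name is not bound to a single type.
value = 42
first_type = type(value).__name__    # the name currently refers to an int

value = "forty-two"
second_type = type(value).__name__   # the same name now refers to a str


# No mandatory delimiters: indentation alone marks the body of the
# function and of the if statement (no braces, no semicolons).
def describe(n):
    if n % 2 == 0:
        return "even"
    return "odd"


labels = [describe(n) for n in range(4)]
print(first_type, second_type, labels)
```

Typed line by line into the Python console or IPython, each statement runs immediately, with no separate compilation step.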
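Currently, Python is one of the most flexible programming languages. One of its main characteristics that makes it so flexible is that it can be seen as a multiparadigm language. This is especially useful for people who already know how to program with other languages, as they can rapidly start programming with Python in the same way. For example, Java programmers will feel comfortable using Python, as it supports the object-oriented paradigm, or C programmers could mix Python and C code using Cython. Furthermore, for anyone who is used to programming in functional languages such as Haskell or Lisp, Python also has basic statements for functional programming in its own core library.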
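As a rough illustration of the multiparadigm idea, the sketch below performs the same temperature conversion twice: once in an object-oriented style and once in a functional style using `map`, `filter`, and `functools.reduce` from the standard library. The `Temperature` class and the sample values are made up for this example.

```python
from functools import reduce

CELSIUS_READINGS = (0, 25, 100)


# Object-oriented style: data and behavior live together in a class.
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    def to_fahrenheit(self):
        return self.celsius * 9 / 5 + 32


fahrenheit_oo = [Temperature(c).to_fahrenheit() for c in CELSIUS_READINGS]

# Functional style: the same conversion expressed with map, filter, reduce.
fahrenheit_fn = list(map(lambda c: c * 9 / 5 + 32, CELSIUS_READINGS))
above_freezing = list(filter(lambda f: f > 32, fahrenheit_fn))
total = reduce(lambda a, b: a + b, fahrenheit_fn)

print(fahrenheit_oo, fahrenheit_fn, above_freezing, total)
```

Both styles produce the same converted values, so a programmer arriving from Java or from Haskell can begin with whichever feels familiar.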
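In this book, we have decided to focus on the Python language because, as explained earlier, it is a mature programming language, easy for the newbie, and can be used as a specific platform for data scientists, thanks to its large ecosystem of scientific libraries and its vibrant community. Other popular alternatives to Python for data scientists are R and MATLAB/Octave.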
Fundamental Python Libraries for Data Scientists
The Python community is one of the most active programming communities, with a huge number of developed toolboxes. The most popular Python toolboxes for any data scientist are NumPy, SciPy, Pandas, and Scikit-Learn.
Numeric and Scientific Computation: NumPy and SciPy
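NumPy is the cornerstone toolbox for scientific computing with Python. NumPy provides, among other things, support for multidimensional arrays with basic operations on them and useful linear algebra functions. Many toolboxes use the NumPy array representation as an efficient basic data structure. Meanwhile, SciPy provides a collection of numerical algorithms and domain-specific toolboxes, including signal processing, optimization, statistics, and much more. Another core toolbox in the SciPy ecosystem is the plotting library Matplotlib. This toolbox has many tools for data visualization.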
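The following is a minimal sketch of the NumPy features mentioned here, assuming NumPy is installed: a two-dimensional array, a few basic operations on it, and one linear algebra routine (`np.linalg.solve`). The matrix and vector values are arbitrary and chosen only for illustration.

```python
import numpy as np

# A 2-D array (matrix) and a 1-D array (vector) with arbitrary values.
matrix = np.array([[2.0, 1.0],
                   [1.0, 3.0]])
vector = np.array([3.0, 5.0])

# Basic operations apply to the whole array, with no explicit loops.
doubled = matrix * 2                 # element-wise multiplication
column_sums = matrix.sum(axis=0)     # sum down each column

# Linear algebra: solve the system  matrix @ x = vector  for x.
solution = np.linalg.solve(matrix, vector)

print(doubled)
print(column_sums)
print(solution)
```

Multiplying `matrix` by `solution` with the `@` operator should reproduce `vector` up to floating-point rounding, which is a quick way to check the result.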