Data Scientist
Pocket Guide
Over Concepts, Terminologies, and
Processes of Machine Learning and
Deep Learning Assembled Together
Mohamed Sabri
www.bpbonline.com
FIRST EDITION 2021
Copyright BPB Publications, India
ISBN: 978-93-90684-97-7
All Rights Reserved. No part of this publication may be reproduced, distributed or transmitted in any form or by any means or stored in a database or retrieval system, without the prior written permission of the publisher, with the exception of the program listings, which may be entered, stored and executed in a computer system, but cannot be reproduced by means of publication, photocopy, recording, or by any electronic or mechanical means.
LIMITS OF LIABILITY AND DISCLAIMER OF WARRANTY
The information contained in this book is true and correct to the best of the author's and publisher's knowledge. The author has made every effort to ensure the accuracy of these publications, but the publisher cannot be held responsible for any loss or damage arising from any information in this book.
All trademarks referred to in the book are acknowledged as properties of their respective owners but BPB Publications cannot guarantee the accuracy of this information.
Distributors:
BPB PUBLICATIONS
20, Ansari Road, Darya Ganj
New Delhi-110002
Ph: 23254990/23254991
MICRO MEDIA
Shop No. 5, Mahendra Chambers,
DN Rd. Next to Capital Cinema,
V.T. (C.S.T.) Station, MUMBAI-400
Ph: 22078296/22078297
DECCAN AGENCIES
4-3-329, Bank Street,
Hyderabad-500195
Ph: 24756967/24756400
BPB BOOK CENTRE
Old Lajpat Rai Market,
Delhi-110006
Ph: 23861747
Published by Manish Jain for BPB Publications, Ansari Road, Darya Ganj, New Delhi-110002 and Printed by him at Repro India Ltd, Mumbai
Dedicated to
My father and our Sundays
About the Author
Mohamed, the author of this book, completed his degree in Mathematics and Economics at the University of Ottawa. He is a Managing Partner and Consultant in the field of Data Science and MLOps, and works with North American organizations in the banking, retail, and gaming sectors. With a strong passion for Data Science, he is driven to do more for the domain by being involved in a range of innovative AI projects that help him deliver end-to-end solutions in the field of AI.
He drives his professional journey with his excellent communication skills and his expertise in Tech popularisation for complex projects. Building upon his commitment towards ensuring work and team cohesiveness, he has successfully executed several AI projects.
In his book, Data Scientist Pocket Guide, he shares his secrets for becoming a benevolent data scientist.
His passion for connecting and networking with people and professionals is channelled through this book, which attempts to reach data scientists and make their everyday job easier and more enriching.
About the Reviewer
Prateek Gupta is a Data Enthusiast and loves data-driven technologies. Prateek holds a B.Tech in Computer Science & Engineering and is currently working as a Data Scientist in an IT company. Prateek has years of experience in the software industry and currently works in the Computer Vision area. Prateek is also the author of the book Practical Data Science with Jupyter, 2nd Edition, published by BPB Publications.
Acknowledgements
The completion of this book could not have been possible without the support of BPB Publications. I would like to thank all the team members of BPB Publications; despite the COVID crisis, they extended their full support with access to all the resources that were critical in completing this book. I would like to thank my family for their support and encouragement while writing this book, with a special mention to my parents for being an incredible source of inspiration in my life. A huge thanks to my father for teaching me how to be patient and resilient in life. Lastly, I would like to underline the importance of patience in writing a book or facing any challenge in life.
Preface
At the beginning of my career as a data scientist, I used to turn to search engines and various other sources to find explanations of data science concepts. This was time consuming, and the answers to my questions were not always reliable. It is hard for any data scientist to quickly find all the answers to their questions, and sometimes the answers vary from one source to another. Also, some concepts are hard to understand, so you have to find a source that clearly explains what a concept means. This book is a first-of-its-kind dictionary or glossary that gathers the most popular terms in data science. It helps data scientists, from beginners to seniors, look up definitions very quickly and get reliable answers to their questions. Books on data science usually focus on coding and practical use cases, whereas the goal of this book is to explain concepts and give data scientists a better idea of what the words mean. It is good to be able to code and build machine learning models, but if data scientists do not understand the logic and mechanism behind each concept, it is hard for them to deliver good results and explain their work. I hope you will keep this book as your Bible for data science and use it each time you have a doubt about a concept's meaning. Have fun!
This book is separated into two sections. The first section is composed of chapters, each corresponding to a letter of the alphabet and containing a set of definitions. The second section is an FAQ, or frequently asked questions, and it contains the questions that a data scientist might have about data science. Some questions cover theoretical topics and others are more practical, such as "Should I learn R or Python?".
This book is not meant to be read all at once but to become your data science Bible: each time you have a question about a concept and wonder how it works or what it means, you can look to the book for answers. This book is also a good support for beginners, who are often confused by all the concepts they come across in data science. Reading this book is therefore not linear. You can start reading wherever you want and jump to any chapter based on the answers you are looking for. This book is the first of its kind in data science, as no other book gathers as many terms in the field as this one does.