Python for Data Science For Dummies
Published by: John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, www.wiley.com
Copyright 2015 by John Wiley & Sons, Inc., Hoboken, New Jersey
Media and software compilation copyright 2015 by John Wiley & Sons, Inc. All rights reserved.
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the Publisher. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
Trademarks: Wiley, For Dummies, the Dummies Man logo, Dummies.com, Making Everything Easier, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and may not be used without written permission. Python is a registered trademark of Python Software Foundation Corporation. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY : THE PUBLISHER AND THE AUTHOR MAKE NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE CREATED OR EXTENDED BY SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND STRATEGIES CONTAINED HEREIN MAY NOT BE SUITABLE FOR EVERY SITUATION. THIS WORK IS SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS NOT ENGAGED IN RENDERING LEGAL, ACCOUNTING, OR OTHER PROFESSIONAL SERVICES. IF PROFESSIONAL ASSISTANCE IS REQUIRED, THE SERVICES OF A COMPETENT PROFESSIONAL PERSON SHOULD BE SOUGHT. NEITHER THE PUBLISHER NOR THE AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HEREFROM. THE FACT THAT AN ORGANIZATION OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION AND/OR A POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE PUBLISHER ENDORSES THE INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR RECOMMENDATIONS IT MAY MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET WEBSITES LISTED IN THIS WORK MAY HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND WHEN IT IS READ.
For general information on our other products and services, please contact our Customer Care Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993, or fax 317-572-4002. For technical support, please visit www.wiley.com/techsupport.
Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.
Library of Congress Control Number: 2013956848
ISBN: 978-1-118-84418-2
ISBN 978-1-118-84398-7 (ebk); ISBN ePDF 978-1-118-84414-4 (ebk)
Introduction
You rely on data science absolutely every day to perform an amazing array of tasks or to obtain services from someone else. In fact, you've probably used data science in ways that you never expected. For example, when you used your favorite search engine this morning to look for something, it made suggestions on alternative search terms. Those terms are supplied by data science. When you went to the doctor last week and discovered the lump you found wasn't cancer, it's likely the doctor made his prognosis with the help of data science. In fact, you might work with data science every day and not even know it. Python for Data Science For Dummies not only gets you started using data science to perform a wealth of practical tasks but also helps you realize just how many places data science is used. By knowing how to answer data science problems and where to employ data science, you gain a significant advantage over everyone else, increasing your chances at promotion or that new job you really want.
About This Book
The main purpose of Python for Data Science For Dummies is to take the scare factor out of data science by showing you that data science is not only really interesting but also quite doable using Python. You might assume that you need to be a computer science genius to perform the complex tasks normally associated with data science, but that's far from the truth. Python comes with a host of useful libraries that do all the heavy lifting for you in the background. You don't even realize how much is going on, and you don't need to care. All you really need to know is that you want to perform specific tasks and that Python makes these tasks quite accessible.
Part of the emphasis of this book is on using the right tools. You start with Anaconda, a product that includes IPython and IPython Notebook, two tools that take the sting out of working with Python. You experiment with IPython in a fully interactive environment. The code you place in IPython Notebook is presentation quality, and you can mix a number of presentation elements right there in your document. It's not really like using a development environment at all.
You also discover some interesting techniques in this book. For example, you can create plots of all your data science experiments using MatPlotLib, for which this book provides you with all the details. This book also spends considerable time showing you just what is available and how you can use it to perform some really interesting calculations. Many people would like to know how to perform handwriting recognition, and if you're one of them, you can use this book to get a leg up on the process.
Of course, you might still be worried about the whole programming environment issue, and this book doesn't leave you in the dark there, either. At the beginning, you find complete installation instructions for Anaconda and a quick primer (with references) to the basic Python programming you need to perform. The emphasis is on getting you up and running as quickly as possible, and on keeping the examples straightforward and simple so that the code doesn't become a stumbling block to learning.
To make absorbing the concepts even easier, this book uses the following conventions:
- Text that you're meant to type just as it appears in the book is in bold. The exception is when you're working through a step list: Because each step is bold, the text to type is not bold.
- When you see words in italics as part of a typing sequence, you need to replace that value with something that works for you. For example, if you see "Type Your Name and press Enter," you need to replace Your Name with your actual name.
- Web addresses and programming code appear in monofont. If you're reading a digital version of this book on a device connected to the Internet, note that you can click the web address to visit that website, like this: http://www.dummies.com.
- When you need to type command sequences, you see them separated by a special arrow, like this: File⇒New File. In this case, you go to the File menu first and then select the New File entry on that menu. The result is that you see a new file created.