
Curtis Miller - Hands-On Data Analysis with NumPy and pandas

Year: 2018. Publisher: Packt Publishing.


Hands-On Data Analysis with NumPy and pandas: summary, description and annotation


Get to grips with the most popular Python packages that make data analysis possible

About This Book
  • Explore the tools you need to become a data analyst
  • Discover practical examples to help you grasp data processing concepts
  • Walk through hierarchical indexing and grouping for data analysis
Who This Book Is For

Hands-On Data Analysis with NumPy and Pandas is for you if you are a Python developer and want to take your first steps into the world of data analysis. No previous experience of data analysis is required to enjoy this book.

What You Will Learn
  • Understand how to install and manage Anaconda
  • Read, sort, and map data using NumPy and pandas
  • Find out how to create and slice data arrays using NumPy
  • Discover how to subset your DataFrames using pandas
  • Handle missing data in a pandas DataFrame
  • Explore hierarchical indexing and plotting with pandas
In Detail

Python, a multi-paradigm programming language, has become the language of choice for data scientists for visualization, data analysis, and machine learning.

Hands-On Data Analysis with NumPy and Pandas starts by guiding you in setting up the right environment for data analysis with Python, along with helping you install the correct Python distribution. In addition to this, you will work with the Jupyter notebook and set up a database. Once you have covered Jupyter, you will dig deep into Python's NumPy package, a powerful extension with advanced mathematical functions. You will then move on to creating NumPy arrays and employing different array methods and functions. You will explore Python's pandas extension, which will help you get to grips with data mining and learn to subset your data. Last but not least, you will grasp how to manage your datasets by sorting and ranking them.

By the end of this book, you will have learned to index and group your data for sophisticated data analysis and manipulation.

Style and approach

A step-by-step approach, taking you through the different concepts and features of Data Analysis using Python libraries and tools.

Downloading the example code for this book You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

Indexing methods

pandas provides methods that allow us to state clearly how we want to index. They distinguish between indexing based on the values of the index of the series and indexing based on the position of objects in the series, as would be the case if we were working with a list. The two methods we'll focus on are loc and iloc. loc selects based on the index of the series, and if we try to select keys that don't exist, we will get an error. iloc indexes as if we were working with a Python list; that is, it indexes based on integer position. So, if we were to index with a non-integer in iloc, or try to select an element outside the range of valid integers, an error will be produced. There is also a hybrid method, ix, that acts like loc but, if passed input that cannot be interpreted with respect to the index, acts like iloc. Because of the ambiguity about how ix will behave, I recommend sticking with loc or iloc. Note that ix was deprecated in pandas 0.20 and removed in pandas 1.0, so it is not available in current releases.
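The difference between the two indexers can be sketched as follows. This is a minimal illustration rather than the book's own example: the series srs, its labels, and the helper error_name are made up for the demonstration.

```python
import pandas as pd

# Hypothetical series with string labels (not taken from the book)
srs = pd.Series([10, 20, 30], index=["a", "b", "c"])


def error_name(action):
    """Return the name of the exception raised by action(), or None."""
    try:
        action()
    except Exception as exc:
        return type(exc).__name__
    return None


by_label = srs.loc["b"]     # label-based lookup
by_position = srs.iloc[1]   # position-based lookup (second element)

# loc with a label that is not in the index
missing_label = error_name(lambda: srs.loc["z"])
# iloc with a position outside the valid range
out_of_range = error_name(lambda: srs.iloc[5])
# iloc with a non-integer key
non_integer = error_name(lambda: srs.iloc["b"])

print(by_label, by_position)
print(missing_label, out_of_range, non_integer)
```

Both lookups return the same element here, but only because "b" happens to sit at position 1; each of the three invalid requests raises an exception instead of silently returning something else.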

Let's return to our example. It turns out that square brackets, in this case, index like iloc; that is, they index based on integer position as if srs2 were a list. If we wanted to index based on the index of srs2, we could use loc to do so, getting the other possible result. Again, notice that in this case, both endpoints were included. This is unlike the behavior we normally associate with the colon operator:

[Screenshot: srs2 indexed with loc, with both endpoints of the slice included]
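Since srs2 itself is not reproduced here, the endpoint behavior can be sketched with a stand-in. The series srs_demo and its values are assumptions chosen so that labels and positions differ; loc and iloc are used explicitly rather than bare square brackets so that the intent is unambiguous.

```python
import pandas as pd

# Hypothetical stand-in for srs2: integer labels that start at 1, not 0
srs_demo = pd.Series([10, 20, 30, 40, 50], index=[1, 2, 3, 4, 5])

# Position-based slice: positions 1 and 2, right endpoint excluded
positional = srs_demo.iloc[1:3]

# Label-based slice: labels 1, 2 and 3, both endpoints included
labelled = srs_demo.loc[1:3]

print(positional.tolist())  # [20, 30]
print(labelled.tolist())    # [10, 20, 30]
```

The same slice 1:3 therefore selects two elements with iloc and three with loc.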
Who this book is for

If you are a Python developer and want to take your first steps into the world of data analysis, then this is the book you have been waiting for!


Hands-On Data Analysis with NumPy and pandas

Copyright 2018 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Commissioning Editor: Sunith Shetty
Acquisition Editor: Tushar Gupta
Content Development Editor: Prasad Ramesh
Technical Editor: Sagar Sawant
Copy Editor: Vikrant Phadke
Project Coordinator: Nidhi Joshi
Proofreader: Safis Editing
Indexer: Rekha Nair
Graphics: Jisha Chirayil
Production Coordinator: Shraddha Falebhai

First published: June 2018

Production reference: 1280618

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.

ISBN 978-1-78953-079-7

www.packtpub.com

Index sorting

When talking about sorting, we need to think about what exactly we are sorting. There are rows, columns, their indices, and the data they contain. Let's first look at index sorting. We can use the sort_index method to rearrange the rows of a DataFrame so that the row indices are in order. We can also sort the columns by setting the axis parameter of sort_index to 1. By default, sorting is done in ascending order, so later rows have larger index values than earlier rows, but we can change this behavior by setting the ascending parameter of sort_index to False. This sorts in descending order. By default, sorting is not done in place; you need to set the inplace argument of sort_index to True for that.
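A minimal sketch of these sort_index parameters, assuming a small made-up DataFrame (the column names echo the AAA, BBB, CCC labels used in this section, but the values and index are hypothetical):

```python
import pandas as pd

# Hypothetical DataFrame with unordered row labels and column names
df = pd.DataFrame(
    {"BBB": [1, 2, 3], "AAA": [4, 5, 6], "CCC": [7, 8, 9]},
    index=[2, 0, 1],
)

rows_sorted = df.sort_index()                # rows ordered by index: 0, 1, 2
cols_sorted = df.sort_index(axis=1)          # columns ordered: AAA, BBB, CCC
rows_desc = df.sort_index(ascending=False)   # rows ordered by index: 2, 1, 0

# None of the calls above modified df
original_order = list(df.index)              # still [2, 0, 1]

# With inplace=True the method returns None and reorders df itself
df.sort_index(inplace=True)
inplace_order = list(df.index)               # now [0, 1, 2]
```

Returning a new object by default makes it easy to chain calls, whereas inplace=True is only useful when the original ordering is no longer needed.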

While I have emphasized sorting for DataFrames, sorting a series is effectively the same. Let's see an example. After loading in NumPy and pandas, we create a DataFrame with values to sort, shown in the following screenshot:

[Screenshot: the DataFrame with values to sort]

Let's sort the index; notice that this is not done in place:

[Screenshot: the DataFrame with its index sorted]

Let's sort the columns this time, and we will do so in reverse order by setting ascending=False; so the first column is now CCC and the last is AAA, shown as follows:

[Screenshot: the DataFrame with its columns in descending order, from CCC to AAA]
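The walkthrough above can be approximated as follows. The screenshots are not available, so the data here is an assumption: only the column names AAA, BBB, and CCC come from the text. The last step illustrates the remark that sorting a series works the same way.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the column names come from the text
df = pd.DataFrame(
    np.arange(12).reshape(4, 3),
    index=[3, 0, 2, 1],
    columns=["BBB", "CCC", "AAA"],
)

# Columns in reverse (descending) order: CCC first, AAA last
cols_reversed = df.sort_index(axis=1, ascending=False)
print(list(cols_reversed.columns))  # ['CCC', 'BBB', 'AAA']

# A series is sorted by its index in the same way
srs = df["AAA"]
srs_sorted = srs.sort_index()
print(list(srs_sorted.index))       # [0, 1, 2, 3]
```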
What this book covers

Chapter 1, Setting Up a Python Data Analysis Environment, discusses installing Anaconda and managing it. Anaconda is a software package we will use in the following chapters of this book.

Chapter 2, Diving into NumPY, discusses NumPy data types controlled by dtype objects, which are the way NumPy stores and manages data.

Chapter 3, Operations on NumPy Arrays, will cover what every NumPy user should know about array slicing, arithmetic, linear algebra with arrays, and employing array methods and functions.

Chapter 4, pandas are Fun! What is pandas?, introduces pandas and looks at what it does. We explore pandas series, DataFrames, and creating them.

Chapter 5, Arithmetic, Function Application, and Mapping with pandas, revisits some topics discussed previously, regarding applying functions in arithmetic to a multivariate object and handling missing data in pandas.

Chapter 6, Managing, Indexing, and Plotting, looks at sorting and ranking. We'll see how to achieve this in pandas, looking at hierarchical indexing and plotting with pandas.

Sorting by values

If we wish to sort the rows of a DataFrame or the elements of a series, we need to use the sort_values method. For a series, you'd call sort_values and call it a day. For a DataFrame though, you would need to set the by parameter; you can set by to a string, indicating the column you want to sort by, or to a list of strings, indicating column names. Sorting will first proceed according to the first column in this list; then, when ties appear, sorting will be according to the next column, and so on.
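As a rough illustration of the by parameter and of how ties are broken, consider the sketch below. The DataFrame and its values are hypothetical, chosen so that column AAA contains ties that column BBB can resolve.

```python
import pandas as pd

# Hypothetical data with ties in AAA
df = pd.DataFrame({"AAA": [2, 1, 2, 1], "BBB": [0.5, 0.9, 0.1, 0.3]})

# Sort by a single column
by_one = df.sort_values(by="AAA")

# Sort by AAA first, then break ties using BBB
by_two = df.sort_values(by=["AAA", "BBB"])
print(list(by_two.index))            # [3, 1, 2, 0]

# For a series, no by argument is needed
srs_sorted = df["BBB"].sort_values()
print(srs_sorted.tolist())           # [0.1, 0.3, 0.5, 0.9]
```

The original row labels travel with the rows, which is why the index of by_two is no longer in order after sorting.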

So, let's demonstrate some of these sorting techniques. We sort the values of the DataFrame according to the column AAA, shown in the following screenshot:

[Screenshot: the DataFrame sorted by column AAA]

Notice that all the entries in AAA are now in order, though not much can be ...