The data science community is such an interesting, dynamic, and fast-paced place to work. While my journey as a data scientist has so far been only about five years long, it feels as though I've already seen a lifetime of tools, technologies, and trends come and go. One consistent effort has been the focus on making data science easier: lowering barriers to entry and developing better libraries have made data science more accessible than ever. The fact that such a bright, diverse, and dedicated community of software architects and developers works tirelessly to improve data science for everyone has made writing Data Science with Python and Dask an incredibly humbling, and at times intimidating, experience. Nonetheless, it is a great honor to contribute to this vibrant community by showcasing the truly excellent work that the entire team of Dask maintainers and contributors has produced.
I stumbled across Dask in early 2016 when I encountered my first uncomfortably large dataset at work. After fumbling around for days with Hadoop, Spark, Ambari, ZooKeeper, and the menagerie of Apache big data technologies, I, in my exasperation, simply Googled "big data library python." After tabbing through pages of results, I was left with two options: continue banging my head against PySpark or figure out how to use chunking in Pandas. Just about ready to call my search efforts futile, I spotted a StackOverflow question that mentioned a library called Dask. Once I found my way over to where Dask was hosted on GitHub, I started working my way through the documentation. DataFrames for big datasets? An API that mimics Pandas? It can be installed using pip? It seemed too good to be true. But it wasn't. I was incensed: why hadn't I heard of this library before? Why was something this powerful and easy to use flying under the radar at a time when the big data craze was reaching fever pitch?
After having great success using Dask for my work project, I was determined to become an evangelist. I was teaching a Python for Data Science class at the University of Denver at the time, and I immediately began looking for ways to incorporate Dask into the curriculum. I also presented several talks and workshops at my local PyData chapter's meetups in Denver. Finally, when I was approached by the folks at Manning to write a book on Dask, I agreed without hesitation. As you read this book, I hope you also come to see how awesome and useful Dask is to have in your arsenal of data science tools!
acknowledgments
As a new author, one thing I learned very quickly is that there are many, many people involved in producing a book. I absolutely would not have survived without all the wonderful support, feedback, and encouragement I've received over the course of writing the book.
First, I'd like to thank Stephen Soehnlen at Manning for approaching me with the idea to write this book, and Marjan Bace for green-lighting it. They took a chance on me, a first-time author, and for that I am truly appreciative. Next, a huge thanks to my development editor, Dustin Archibald, for patiently guiding me through Manning's writing and revising processes while also pushing me to become a better writer and teacher. Similarly, a big thanks to Mike Shepard, my technical editor, for sanity checking all my code and offering yet another channel of feedback. I'd also like to thank Tammy Coron and Toni Arritola for helping to point me in the right direction early on in the writing process.
Next, thank you to all the reviewers who provided excellent feedback throughout the course of writing this book: Al Krinker, Dan Russell, Francisco Sauceda, George Thomas, Gregory Matuszek, Guilherme Pereira de Freitas, Gustavo Patino, Jeremy Loscheider, Julien Pohie, Kanak Kshetri, Ken W. Alger, Lukasz Tracewski, Martin Czygan, Pauli Sutelainen, Philip Patterson, Raghavan Srinivasan, Rob Koch, Romain Jouin, Ruairi O'Reilly, Steve Atchue, and Suresh Rangarajulu. Special thanks as well to Ivan Martinovic for coordinating the peer review process and organizing all the feedback, and to Karsten Strøbæk for giving my code another pass before handing off to production.
I'd also like to thank Bert Bates, Becky Rinehart, Nichole Beard, Matko Hrvatin, and the entire graphics team at Manning; Chris Kaufmann, Ana Romac, Owen Roberts, and the folks at Manning's marketing department; Nicole Butterfield, Rejhana Markanovic, and Lori Kehrwald. A big thank-you also goes out to Francesco Bianchi, Mike Stephens, Deirdre Hiam, Michelle Melani, Melody Dolab, Tiffany Taylor, and the countless other individuals who worked behind the scenes to make Data Science with Python and Dask a great success!
Finally, I'd like to give a special thanks to my wife, Clementine, for her patient understanding on the many nights and weekends that I holed up in my office to work on the book. I couldn't have done this without your infinite love and support. I also wouldn't have had this opportunity without the inspiration of my dad to pursue a career in technology and the not-so-gentle nudging of my mom to do my English homework. I love you both!
about this book
Who should read this book
Data Science with Python and Dask takes you on a hands-on journey through a typical data science workflow, from data cleaning through deployment, using Dask. The book begins by presenting some foundational knowledge of scalable computing and explains how Dask takes advantage of those concepts to operate on datasets big and small. Building on that foundation, it then turns its focus to preparing, analyzing, visualizing, and modeling various real-world datasets to give you tangible examples of how to use Dask to perform common data science tasks. Finally, the book ends with a step-by-step walkthrough of deploying your very own Dask cluster on AWS to scale out your analysis code.