LitArk » Books » Computer

Srinivas Duvvuri - Spark for Data Science

Here you can read online Srinivas Duvvuri - Spark for Data Science full text of the book (entire story) in english for free. Download pdf and epub, get meaning, cover and reviews about this ebook. year: 2016, publisher: Packt Publishing, genre: Computer. Description of the work, (preface) as well as reviews are available. Best literature library LitArk.com created for fans of good reading and offers a wide selection of genres:

Romance novel Science fiction Adventure Detective Science History Home and family Prose Art Politics Computer Non-fiction Religion Business Children Humor

Choose a favorite category and find really read worthwhile books. Enjoy immersion in the world of imagination, feel the emotions of the characters or learn something new for yourself, make an fascinating discovery.

Book:
Spark for Data Science
Author:
Srinivas Duvvuri / Bikramaditya Singhal
Publisher:
Packt Publishing
Genre:
Books / Computer
Year:
2016
Rating:
5 / 5
Favourites:
Add to favourites
Your mark:
- 100
- 1
- 2
- 3
- 4
- 5

Description
Author's other books
Similar books

Spark for Data Science: summary, description and annotation

We offer to read an annotation, description, summary or preface (depends on what the author of the book "Spark for Data Science" wrote himself). If you haven't found the necessary information about the book — write in the comments, we will try to find it.

Analyze your data and delve deep into the world of machine learning with the latest Spark version, 2.0

About This Book

Perform data analysis and build predictive models on huge datasets that leverage Apache Spark
Learn to integrate data science algorithms and techniques with the fast and scalable computing features of Spark to address big data challenges
Work through practical examples on real-world problems with sample code snippets

Who This Book Is For

This book is for anyone who wants to leverage Apache Spark for data science and machine learning. If you are a technologist who wants to expand your knowledge to perform data science operations in Spark, or a data scientist who wants to understand how algorithms are implemented in Spark, or a newbie with minimal development experience who wants to learn about Big Data Analytics, this book is for you!

What You Will Learn

Consolidate, clean, and transform your data acquired from various data sources
Perform statistical analysis of data to find hidden insights
Explore graphical techniques to see what your data looks like
Use machine learning techniques to build predictive models
Build scalable data products and solutions
Start programming using the RDD, DataFrame and Dataset APIs
Become an expert by improving your data analytical skills

In Detail

This is the era of Big Data. The words Big Data implies big innovation and enables a competitive advantage for businesses. Apache Spark was designed to perform Big Data analytics at scale, and so Spark is equipped with the necessary algorithms and supports multiple programming languages.

Whether you are a technologist, a data scientist, or a beginner to Big Data analytics, this book will provide you with all the skills necessary to perform statistical data analysis, data visualization, predictive modeling, and build scalable data products or solutions using Python, Scala, and R.

With ample case studies and real-world examples, Spark for Data Science will help you ensure the successful execution of your data science projects.

Style and approach

This book takes a step-by-step approach to statistical analysis and machine learning, and is explained in a conversational and easy-to-follow style. Each topic is explained sequentially with a focus on the fundamentals as well as the advanced concepts of algorithms and techniques. Real-world examples with sample code snippets are also included.

Srinivas Duvvuri: author's other books

Who wrote Spark for Data Science? Find out the surname, the name of the author of the book and a list of all author's works by series.

Spark for Data Science — read online for free the complete book (whole text) full work

Below is the text of the book, divided by pages. System saving the place of the last page read, allows you to conveniently read the book "Spark for Data Science" online for free, without having to search again every time where you left off. Put a bookmark, and you can go to the page where you finished reading at any time.

Light

Font size:

↓

↑

Reset

Interval:

↓

↑

Bookmark:

Make

Spark for Data Science

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: September 2016

Production reference: 1270916

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-78588-565-5

www.packtpub.com

Credits

Authors

Srinivas Duvvuri

Bikramaditya Singhal

Copy Editors

Safis Editing

Reviewers

Daniel Frimer

Priyansu Panda

Yogesh Tayal

Project Coordinator

Kinjal Bari

Commissioning Editor

Dipika Gaonkar

Proofreader

Safis Editing

Acquisition Editors

Tushar Gupta

Nikhil Karkal

Indexer

Pratik Shirodkar

Content Development Editor

Rashmi Suvarna

Graphics

Kirk D'Penha

Technical Editor

Deepti Tuscano

Production Coordinator

Shantanu N. Zagade

Foreword

Apache Spark is one of the most popular projects in the Hadoop ecosystem and possibly the most actively developed open source project in big data. Its simplicity, performance, and flexibility have made it popular not only among data scientists but also among engineers, developers, and everybody else interested in big data.

With its rising popularity, Duvvuri and Bikram have produced a book that is the need of the hour, Spark for Data Science, but with a difference. They have not only covered the Spark computing platform but have also included aspects of data science and machine learning. To put it in one wordcomprehensive.

The book contains numerous code snippets that one can use to learn and also get a jump start in implementing projects. Using these examples, users also start to get good insights and learn the key steps in implementing a data science projectbusiness understanding, data understanding, data preparation, modeling, evaluation and deployment.

Venkatraman Laxmikanth

Managing Director

Broadridge Financial Solutions India (Pvt) Ltd

About the Authors

Srinivas Duvvuri is currently Senior Vice President Development, heading the development teams for Fixed Income Suite of products at Broadridge Financial Solutions (India) Pvt Ltd. In addition, he also leads the Big Data and Data Science COE and is the principal member of the Broadridge India Technology Council. He is self learnt Data Scientist. The Big Data /Data Science COE in the past 3 years, has successfully completed multiple POCs and some of the use cases are moving towards production deployment. He has over 25+ years of experience in software product development. His experience spans predominantly in product development in, multiple domains Financial Services, Infrastructure Management, OLAP, Telecom Billing and Customer Care, CAD/CAM. Prior to Broadridge, hes held leadership positions at a Startup and leading IT majors such as CA, Hyperion (Oracle), Globalstar. He has a patent in Relational OLAP.

Srinivas loves to teach and mentor budding Engineers. He has established strong Academic connect and interacts with a host of educational institutions, He is an active speaker in various conferences, summits and meetups on topics such as Big data, Data Science

Srinivas is a B.Tech in Aeronautical Engineering and M.Tech in Computer Science, from IIT, Madras.

At the outset I would like to thank VLK our MD and Broadridge India for supporting me in this endeavor. I would like to thank my parents, teachers, colleagues and extended family who have mentored and motivated me. My thanks to Bikram who agreed me to be the co-author when proposal to author the book came up. My special thanks to my wife Ratna, sons Girish and Aravind who have supported me in completing this book.

I would also like to sincerely thank the editorial team from Packt Arshriya, Rashmi, Deepti and all those, though not mentioned here, who have contributed in this project. Finally last but not the least our publisher Packt.

Bikramaditya Singhal is a data scientist with about 7 years of industry experience. He is an expert in statistical analysis, predictive analytics, machine learning, Bitcoin, Blockchain, and programming in C, R, and Python. He has extensive experience in building scalable data analytics solutions in many industry sectors. He also has an active interest on industrial IoT, machine to machine communication, decentralized computation through Blockchain and Artificial Intelligence.

Bikram currently leads the data science team of Digital Enterprise Solutions group at Tech Mahindra Ltd. He also worked in companies such as Microsoft India, Broadridge, Chelsio Communications and also cofounded a company named Mund Consulting which focused on Big Data analytics.

Bikram is an active speaker in various conferences, summits and meetups on topics such as big data, data science, IIoT and Blockchain.

I would like to thank my father, my brothers Manoj Agrawal and Sumit Mund for their mentorship. Without learning from them, there is not a chance I could be doing what I do today, and it is because of them and others that I feel compelled to pass my knowledge on to those willing to learn. Special thanks to my mentor and coauthor Srinivas Duvvuri, and my friend Priyansu Panda, without their efforts this book quite possibly would not have happened.

My deepest gratitude to his holiness Sri Sri Ravi Shankar for building me to what I am today. Many thanks and gratitude to my parents and my wife Yashoda for their unconditional love and support.

I would also like to sincerely thank all those, though not mentioned here, who have contributed in this project directly or indirectly.

About the Reviewers

Daniel Frimer has been involved in a vast exposure of industries across Healthcare, Web Analytics, Transportation. Across these industries has developed ways to optimize the speed of data workflow, storage, and processing in the hopes of making a highly efficient department. Daniel is currently a Masters candidate at the University of Washington in Information Sciences pursuing a specialization in Data Science and Business Intelligence. She worked on Python Data Science Essentials

Id like to thank my grandmother Mary. Who has always believed in mine and everyones potential and respects those whose passions make the world a better place.

Light

Font size:

↓

↑

Reset

Interval:

↓

↑

Bookmark:

Make

Similar books «Spark for Data Science»

Look at similar books to Spark for Data Science. We have selected literature similar in name and meaning in the hope of providing readers with more options to find new, interesting, not yet read works.

Adi Polak

Scaling Machine Learning with Spark: Distributed ML with MLlib, TensorFlow, and PyTorch

Hien Luu

Beginning Apache Spark 3: With DataFrame, Spark SQL, Structured Streaming, and Spark Machine Learning Library

Mahmoud Parsian

Data Algorithms with Spark: Recipes and Design Patterns for Scaling Up using PySpark

Jules S. Damji

Learning Spark: Lightning-Fast Data Analytics

Romeo Kienzler

Apache Spark 2: Data Processing and Real-Time Analytics: Master complex big data processing, stream analytics, and machine learning with Apache Spark

Javier Luraschi

Luraschi, J: Mastering Spark with R

Hien Luu

Beginning Apache Spark 2: With Resilient Distributed Datasets, Spark SQL, Structured Streaming and Spark Machine Learning library

Vitor Bianchi Lanzetta

Hands-On Data Science with R: Techniques to perform data manipulation and mining to build smart analytical models using R

Frank Kane [Frank Kane]

Hands-On Data Science and Python Machine Learning

Bikramaditya Singhal

Spark for Data Science

Siamak Amirghodsi

Apache Spark 2.x Machine Learning Cookbook

Rishi Yadav

Spark Cookbook

Reviews about «Spark for Data Science»

Discussion, reviews of the book Spark for Data Science and just readers' own opinions. Leave your comments, write what you think about the work, its meaning or the main characters. Specify what exactly you liked and what you didn't like, and why you think so.