Amit Nandi - Spark for Python Developers

  • Book:
    Spark for Python Developers
  • Author:
    Amit Nandi
  • Publisher:
    Packt Publishing
  • Genre:
    Computer
  • Year:
    2015

Spark for Python Developers: summary, description and annotation

Key Features
  • Set up real-time streaming and batch data-intensive infrastructure using Spark and Python
  • Deliver insightful visualizations in a web app using Spark (PySpark)
  • Inject live data using Spark Streaming with real-time events
Book Description

Looking for a cluster computing system that provides high-level APIs? Apache Spark is your answer: an open source, fast, and general-purpose cluster computing system. Spark's multi-stage in-memory primitives provide performance up to 100 times faster than Hadoop, and it is also well suited to machine learning algorithms.

Are you a Python developer inclined to work with the Spark engine? If so, this book will be your companion as you create data-intensive apps using Spark as a processing engine, Python visualization libraries, and web frameworks such as Flask.

To begin with, you will learn the most effective way to install the Python development environment powered by Spark, Blaze, and Bokeh. You will then find out how to connect with data stores such as MySQL, MongoDB, Cassandra, and Hadoop.

You'll expand your skills throughout, becoming familiar with the various data sources (GitHub, Twitter, Meetup, and blogs), their data structures, and solutions to effectively tackle their complexities. You'll explore datasets using IPython Notebook and discover how to optimize the data models and pipeline. Finally, you'll learn how to create training datasets and train machine learning models.

By the end of the book, you will have created a real-time, insightful, data-intensive trend tracker app with Spark.

What you will learn
  • Create a Python development environment powered by Spark (PySpark), Blaze, and Bokeh
  • Build a real-time, data-intensive trend tracker app
  • Visualize the trends and insights gained from data using Bokeh
  • Generate insights from data using machine learning through Spark MLlib
  • Juggle with data using Blaze
  • Create training datasets and train the machine learning models
  • Test the machine learning models on test datasets
  • Deploy the machine learning algorithms and models and scale them for real-time events
About the Author

Amit Nandi studied physics at the Free University of Brussels in Belgium, where he did his research on computer-generated holograms. Computer-generated holograms are the key components of an optical computer, which is powered by photons running at the speed of light. He then worked with the university Cray supercomputer, sending batch jobs of programs written in Fortran. This gave him a taste for computing, which kept growing. He has worked extensively on large business reengineering initiatives, using SAP as the main enabler. He has focused for the last 15 years on start-ups in the data space, pioneering new areas of the information technology landscape. He is currently focusing on large-scale, data-intensive applications as an enterprise architect, data engineer, and software developer. He understands and speaks seven human languages. Although Python is his computer language of choice, he aims to be able to write fluently in seven computer languages too.

Table of Contents
  1. Setting Up a Spark Virtual Environment
  2. Building Batch and Streaming Apps with Spark
  3. Juggling Data with Spark
  4. Learning from Data Using Spark
  5. Streaming Live Data with Spark
  6. Visualizing Insights and Trends

Spark for Python Developers

Copyright © 2015 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: December 2015

Production reference: 1171215

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78439-969-6

www.packtpub.com

Credits

Author

Amit Nandi

Reviewers

Manuel Ignacio Franco Galeano

Rahul Kavale

Daniel Lemire

Chet Mancini

Laurence Welch

Commissioning Editor

Amarabha Banerjee

Acquisition Editor

Sonali Vernekar

Content Development Editor

Merint Thomas Mathew

Technical Editor

Naveenkumar Jain

Copy Editor

Roshni Banerjee

Project Coordinator

Suzanne Coutinho

Proofreader

Safis Editing

Indexer

Priya Sane

Graphics

Kirk D'Penha

Production Coordinator

Shantanu N. Zagade

Cover Work

Shantanu N. Zagade

Acknowledgment

I want to express my profound gratitude to my parents for their unconditional love and strong support in all my endeavors.

This book arose from an initial discussion with Richard Gall, an acquisition editor at Packt Publishing. Without this initial discussion, this book would never have happened. So, I am grateful to him. The follow-up discussions and the contractual terms were agreed with Rebecca Youe. I would like to thank her for her support. I would also like to thank Merint Mathew, a content editor who helped me bring this book to the finish line. I am thankful to Merint for his subtle persistence and tactful support during the write-ups and revisions of this book.

We are standing on the shoulders of giants. I want to acknowledge some of the giants who helped me shape my thinking. I want to recognize the beauty, elegance, and power of Python as envisioned by Guido van Rossum. My respectful gratitude goes to Matei Zaharia and the team at Berkeley AMP Lab and Databricks for developing a new approach to computing with Spark and Mesos. Travis Oliphant, Peter Wang, and the team at Continuum.io are doing a tremendous job of keeping Python relevant in a fast-changing computing landscape. Thank you to you all.

About the Reviewers

Manuel Ignacio Franco Galeano is a software developer from Colombia. He holds a computer science degree from the University of Quindío. At the time of publication of this book, he was studying for his MSc in computer science at University College Dublin, Ireland. He has a wide range of interests that include distributed systems, machine learning, and microservices. He is looking for a way to apply machine learning techniques to audio data in order to help people learn more about music.

Rahul Kavale works as a software developer at TinyOwl Ltd. He is interested in multiple technologies ranging from building web applications to solving big data problems. He has worked in multiple languages, including Scala, Ruby, and Java, and has worked on Apache Spark, Apache Storm, Apache Kafka, Hadoop, and Hive. He enjoys writing Scala. Functional programming and distributed computing are his areas of interest. He has been using Spark since its early stage for varying use cases. He has also helped with the review for the Pragmatic Scala book.

Daniel Lemire has a BSc and MSc in mathematics from the University of Toronto and a PhD in engineering mathematics from the École Polytechnique and the Université de Montréal. He is a professor of computer science at the Université du Québec. He has also been a research officer at the National Research Council of Canada and an entrepreneur. He has written over 45 peer-reviewed publications, including more than 25 journal articles. He has held competitive research grants for the last 15 years. He has been an expert on several committees with funding agencies (NSERC and FQRNT). He has served as a program committee member on leading computer science conferences (for example, ACM CIKM, ACM WSDM, ACM SIGIR, and ACM RecSys). His open source software has been used by major corporations such as Google and Facebook. His research interests include databases, information retrieval, and high-performance programming. He blogs regularly on computer science at http://lemire.me/blog/.

Chet Mancini is a data engineer at Intent Media, Inc in New York, where he works with the data science team to store and process terabytes of web travel data to build predictive models of shopper behavior. He enjoys functional programming, immutable data structures, and machine learning. He writes and speaks on topics surrounding data engineering and information architecture.

He is a contributor to Apache Spark and other libraries in the Spark ecosystem. Chet has a master's degree in computer science from Cornell University.

www.PacktPub.com
Support files, eBooks, discount offers, and more

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at > for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
