Aurobindo Sarkar - Learning Spark SQL

Learning Spark SQL by Aurobindo Sarkar. Year: 2017. Publisher: Packt Publishing. Genre: Computer / Science.

  • Book:
    Learning Spark SQL
  • Author:
    Aurobindo Sarkar
  • Publisher:
    Packt Publishing
  • Genre:
    Computer / Science
  • Year:
    2017

Learning Spark SQL: summary, description and annotation

Key Features
  • Learn about the design and implementation of streaming applications, machine learning pipelines, deep learning, and large-scale graph processing applications using Spark SQL APIs and Scala.
  • Learn data exploration, data munging, and how to process structured and semi-structured data using real-world datasets, and gain hands-on exposure to the issues and challenges of working with noisy and dirty real-world data.
  • Understand design considerations for scalability and performance in web-scale Spark application architectures.
Book Description

In the past year, Apache Spark has been increasingly adopted for the development of distributed applications. Spark SQL APIs provide an optimized interface that helps developers build such applications quickly and easily. However, designing web-scale production applications using Spark SQL APIs can be a complex task. Understanding design and implementation best practices before you start your project will therefore help you avoid common pitfalls.

This book gives an insight into the engineering practices used to design and build real-world, Spark-based applications. The book's hands-on examples will give you the confidence to work on any future projects you encounter in Spark SQL.

It starts by familiarizing you with data exploration and data munging tasks using Spark SQL and Scala. Extensive code examples will help you understand the methods used to implement typical use cases for various types of applications. You will get a walkthrough of the key concepts and terms that are common to streaming, machine learning, and graph applications. You will also learn how such systems are architected and deployed for the successful delivery of your project. Finally, you will move on to performance tuning, where you will learn practical tips and tricks to resolve performance issues.

What you will learn
  • Familiarize yourself with Spark SQL programming including working with DataFrame/Dataset API and SQL.
  • Perform a series of hands-on exercises with different types of data sources, including CSV, JSON, Avro, MySQL, and MongoDB.
  • Perform data quality checks, data visualization, and basic statistical analysis tasks.
  • Perform data munging tasks on publicly available datasets.
  • Learn to use Spark SQL and SparkR for typical data science tasks.
  • Learn key performance-tuning tips and tricks in Spark SQL applications.
  • Learn to identify cases where Spark SQL can be used in large-scale application architectures.

Learning Spark SQL
Architect streaming analytics and machine learning solutions
Aurobindo Sarkar
BIRMINGHAM - MUMBAI

Learning Spark SQL


Copyright 2017 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: August 2017

Production reference: 1010917

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.

ISBN 978-1-78588-835-9

www.packtpub.com

Credits

Author

Aurobindo Sarkar

Copy Editor

Shaila Kusanale

Reviewer

Sumit Gupta

Project Coordinator

Ritika Manoj

Commissioning Editor

Kunal Parikh

Proofreader

Safis Editing

Acquisition Editor

Larissa Pinto

Indexer

Tejal Daruwale Soni

Content Development Editor

Arun Nadar

Graphics

Jason Monteiro

Technical Editor

Shweta Jadhav

Production Coordinator

Shantanu Zagade

About the Author

Aurobindo Sarkar is currently the Country Head (India Engineering Center) for ZineOne Inc. With a career spanning over 24 years, he has consulted at some of the leading organizations in India, US, UK, and Canada. He specializes in real-time web-scale architectures, machine learning, deep learning, cloud engineering, and big data analytics. Aurobindo has been actively working as a CTO in technology start-ups for over 8 years now. As a member of the top leadership team at various start-ups, he has mentored founders and CxOs, provided technology advisory services, and led product architecture and engineering teams.

I would like to thank Packt for giving me the opportunity to write this book. Their patience, understanding, and support as I wrote, rewrote, revised, and improved upon the content of this book were instrumental in ensuring that the book remained current with the rapidly evolving versions of Spark.
I would especially like to thank Larissa Pinto, the acquisition editor (who first contacted me to write this book over a year ago) and Arun Nadar, the content development editor, who continuously, and patiently, worked with me to bring this book to a conclusion.
I would also like to thank my friends and colleagues who encouraged me throughout the journey.
Most of all, I want to thank my wife, Nitya, and kids, Somnath, Ravishankar, and Nandini, who understood, encouraged, and supported me, and sacrificed many family moments for me to be able to complete this book successfully. This one is for them.
About the Reviewer

Sumit Gupta is a seasoned professional, innovator, and technology evangelist with over 100 months of experience in architecting, managing, and delivering enterprise solutions revolving around a variety of business domains, such as hospitality, healthcare, risk management, insurance, and more. He is passionate about technology and has overall hands-on experience of over 16 years in the software industry. He has been using big data and cloud technologies over the last 5 years to solve complex business problems.

Sumit has also authored Neo4j Essentials, Building Web Applications with Python and Neo4j, Real-Time Big Data Analytics, and Learning Real-time Processing with Spark Streaming, all by Packt.

You can find him on LinkedIn at sumit1001.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com. Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at service@packtpub.com for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?
  • Fully searchable across every book published by Packt
  • Copy and paste, print, and bookmark content
  • On demand and accessible via a web browser
Customer Feedback

Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at www.amazon.in/dp/1785888358.

If you'd like to join our team of regular reviewers, you can email us at customerreviews@packtpub.com. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!

Preface

We will start this book with the basics of Spark SQL and its role in Spark applications. After the initial familiarization with Spark SQL, we will focus on using Spark SQL to execute tasks that are common to all big data projects, such as working with various types of data sources, exploratory data analysis, and data munging. We will also see how Spark SQL and SparkR can be leveraged to accomplish typical data science tasks at scale.

With the DataFrame/Dataset API and the Catalyst optimizer at the heart of Spark SQL, it is no surprise that it plays a key role in all applications based on the Spark technology stack. These applications include large-scale machine learning pipelines, large-scale graph applications, and emerging Spark-based deep learning applications. Additionally, we will present Spark SQL-based Structured Streaming applications that are deployed in complex production environments as continuous applications.

We will also review performance tuning in Spark SQL applications, including cost-based optimization
