Gerard Maas - Stream Processing with Apache Spark: Mastering Structured Streaming and Spark Streaming

Publisher: O’Reilly Media. City: Sebastopol, CA. Year: 2019. Genre: Computer.



Before you can build analytics tools to gain quick insights, you first need to know how to process data in real time. With this practical guide, developers familiar with Apache Spark will learn how to put this in-memory framework to use for streaming data. You’ll discover how Spark enables you to write streaming jobs in almost the same way you write batch jobs.

Authors Gerard Maas and François Garillot help you explore the theoretical underpinnings of Apache Spark. This comprehensive guide features two sections that compare and contrast the streaming APIs Spark now supports: the original Spark Streaming library and the newer Structured Streaming API.

  • Learn fundamental stream processing concepts and examine different streaming architectures
  • Explore Structured Streaming through practical examples; learn different aspects of stream processing in detail
  • Create and operate streaming jobs and applications with Spark Streaming; integrate Spark Streaming with other Spark APIs
  • Learn advanced Spark Streaming techniques, including approximation algorithms and machine learning algorithms
  • Compare Apache Spark to other stream processing projects, including Apache Storm, Apache Flink, and Apache Kafka Streams

Stream Processing with Apache Spark

by Gerard Maas and François Garillot

Copyright © 2019 François Garillot and Gerard Maas Images. All rights reserved.

Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

  • Acquisitions Editor: Rachel Roumeliotis
  • Developmental Editor: Jeff Bleiel
  • Production Editor: Nan Barber
  • Copyeditor: Octal Publishing Services, LLC
  • Proofreader: Kim Cofer
  • Indexer: Judith McConville
  • Interior Designer: David Futato
  • Cover Designer: Karen Montgomery
  • Illustrator: Rebecca Demarest
  • June 2019: First Edition
Revision History for the First Edition
  • 2019-06-12: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781491944240 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Stream Processing with Apache Spark, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

The views expressed in this work are those of the authors, and do not represent the publisher’s views. While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

978-1-491-94424-0

[LSI]

Foreword

Welcome to Stream Processing with Apache Spark!

It’s very exciting to see how far both the Apache Spark project and stream processing with Apache Spark have come since the project was first started by Matei Zaharia at the University of California, Berkeley in 2009. Apache Spark started off as the first unified engine for big data processing and has grown into the de facto standard for all things big data.

Stream Processing with Apache Spark is an excellent introduction to the concepts, tools, and capabilities of Apache Spark as a stream processing engine. This book will first introduce you to the core Spark concepts necessary to understand modern distributed processing. Then it will explore different stream processing architectures and the fundamental architectural trade-offs between them. Finally, it will illustrate how Structured Streaming in Apache Spark makes it easy to implement distributed streaming applications. In addition, it will cover the older Spark Streaming (aka DStream) APIs for building streaming applications with legacy connectors.

In all, this book covers everything you’ll need to know to master building and operating streaming applications using Apache Spark! We look forward to hearing about what you’ll build!

Tathagata Das

Cocreator of Spark Streaming and Structured Streaming

Michael Armbrust

Cocreator of Spark SQL and Structured Streaming

Bill Chambers

Coauthor of Spark: The Definitive Guide

May 2019

Preface
Who Should Read This Book?

We created this book for software professionals who have an affinity for data and who want to improve their knowledge and skills in the area of stream processing, and who are already familiar with or want to use Apache Spark for their streaming applications.

We have included a comprehensive introduction to the concepts behind stream processing. These concepts form the foundations to understand the two streaming APIs offered by Apache Spark: Structured Streaming and Spark Streaming.

We offer an in-depth exploration of these APIs and provide insights into their features and application, along with practical advice derived from our experience.

Beyond the coverage of the APIs and their practical applications, we also discuss several advanced techniques that belong in the toolbox of every stream-processing practitioner.

Readers of all levels will benefit from the introductory parts of the book, whereas more experienced professionals will draw new insights from the advanced techniques covered and will receive guidance on how to learn more.

We make no assumptions about your prior knowledge of Spark, but readers who are not familiar with Spark’s data-processing capabilities should be aware that in this book, we focus on its streaming capabilities and APIs. For a more general view of the Spark capabilities and ecosystem, we recommend Spark: The Definitive Guide by Bill Chambers and Matei Zaharia (O’Reilly).

The programming language used across the book is Scala. Although Spark provides bindings in Scala, Java, Python, and R, we think that Scala is the language of choice for streaming applications. Even though many of the code samples could be translated into other languages, some areas, such as complex stateful computations, are best approached using the Scala programming language.

Installing Spark

Spark is an Apache open source project hosted officially by the Apache Software Foundation, but which mostly uses GitHub for its development. You can also download it as a binary, precompiled package at the following address: https://spark.apache.org/downloads.html.

From there, you can begin running Spark on one or more machines, which we will explain later. Packages exist for all of the major Linux distributions, which should help installation.

For the purposes of this book, we use examples and code compatible with Spark 2.4.0, and except for minor output and formatting details, those examples should stay compatible with future Spark versions.

Note, however, that Spark is a program that runs on the Java Virtual Machine (JVM), which you should install and make accessible on every machine on which any Spark component will run.

To install a Java Development Kit (JDK), we recommend OpenJDK, which is packaged on many systems and architectures.

You can also install the Oracle JDK.

Spark, like any Scala program, runs on any system on which a JDK version 6 or later is present. The recommended Java runtime for Spark depends on the version:

  • For Spark versions below 2.0, Java 7 is the recommended version.

  • For Spark versions 2.0 and above, Java 8 is the recommended version.
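The two rules above can be checked on a given machine with a few lines of plain Scala. This is only a minimal sketch: the object `JvmCheck` and its methods are hypothetical names made up for illustration, not part of Spark, and the mapping encodes nothing beyond the two recommendations just listed. It relies on the standard `java.specification.version` system property, which reads `1.8` on Java 8 and `11` on Java 11.

```scala
// Hypothetical helper for illustration only; not part of Spark.
object JvmCheck {

  /** Recommended Java major version for a Spark version string such as "2.4.0",
    * following the two rules listed above (below 2.0: Java 7; 2.0 and above: Java 8). */
  def recommendedJava(sparkVersion: String): Int = {
    val sparkMajor = sparkVersion.takeWhile(_ != '.').toInt
    if (sparkMajor < 2) 7 else 8
  }

  /** Extracts the Java major version from a specification version string.
    * Java 8 and earlier report "1.x" (for example "1.8"); Java 9 and later report "9", "11", and so on. */
  def javaMajor(specVersion: String): Int = {
    val parts = specVersion.split('.')
    if (parts(0) == "1" && parts.length > 1) parts(1).toInt else parts(0).toInt
  }

  def main(args: Array[String]): Unit = {
    val running = javaMajor(System.getProperty("java.specification.version"))
    val wanted = recommendedJava("2.4.0")
    println(s"Running on Java $running; recommended for Spark 2.4.0 is Java $wanted")
  }
}
```

Running `JvmCheck.main` on each machine that will host a Spark component gives a quick way to spot a JVM that differs from the recommended one.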

Learning Scala

The examples in this book are in Scala. This is the implementation language of core Spark, but it is far from the only language in which Spark can be used; as of this writing, Spark also offers APIs in Python, Java, and R.

Scala is one of the most feature-complete programming languages today, in that it offers both functional and object-oriented aspects. Yet, its concision and type inference make the basic elements of its syntax easy to understand.
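As a small taste of that mix, the sketch below combines an object-oriented element (a case class) with functional transformations, and leaves most types to be inferred. The names `ScalaTaste`, `Reading`, and `averageBySensor` are hypothetical and chosen only for this illustration; the example uses the Scala standard library and no Spark API.

```scala
// Hypothetical example for illustration only.
object ScalaTaste {

  // Object-oriented side: a case class is an immutable class
  // with generated accessors and structural equality.
  case class Reading(sensor: String, value: Double)

  // Functional side: the computation is a chain of higher-order functions.
  // The lambda parameters and `total` carry no type annotations; the compiler infers them.
  def averageBySensor(readings: Seq[Reading]): Map[String, Double] =
    readings
      .groupBy(_.sensor)
      .map { case (sensor, rs) =>
        val total = rs.map(_.value).sum
        sensor -> total / rs.size
      }

  def main(args: Array[String]): Unit = {
    val readings = Seq(Reading("a", 1.0), Reading("a", 3.0), Reading("b", 10.0))
    println(averageBySensor(readings))
  }
}
```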

Scala as a beginner language has many advantages from a pedagogical viewpoint, its regular syntax and semantics being one of the most important.

Björn Regnell, Lund University
