Sam R. Alapati - Expert Hadoop Administration: Managing, Tuning, and Securing Spark, YARN, and HDFS


Book:
Expert Hadoop Administration: Managing, Tuning, and Securing Spark, YARN, and HDFS
Author:
Sam R Alapati
Publisher:
Addison-Wesley Professional
Year:
2016
Expert Hadoop Administration: Managing, Tuning, and Securing Spark, YARN, and HDFS: summary, description and annotation

The Comprehensive, Up-to-Date Apache Hadoop Administration Handbook and Reference

Sam Alapati has worked with production Hadoop clusters for six years. His unique depth of experience has enabled him to write the go-to resource for all administrators looking to spec, size, expand, and secure production Hadoop clusters of any size.

Paul Dix, Series Editor

In Expert Hadoop Administration, leading Hadoop administrator Sam R. Alapati brings together authoritative knowledge for creating, configuring, securing, managing, and optimizing production Hadoop clusters in any environment. Drawing on his experience with large-scale Hadoop administration, Alapati integrates action-oriented advice with carefully researched explanations of both problems and solutions. He covers an unmatched range of topics and offers an unparalleled collection of realistic examples.

Alapati demystifies complex Hadoop environments, helping you understand exactly what happens behind the scenes when you administer your cluster. You'll gain unprecedented insight as you walk through building clusters from scratch and configuring high availability, performance, security, encryption, and other key attributes. The high-value administration skills you learn here will be indispensable no matter what Hadoop distribution you use or what Hadoop applications you run.

Understand Hadoop's architecture from an administrator's standpoint
Create simple and fully distributed clusters
Run MapReduce and Spark applications in a Hadoop cluster
Manage and protect Hadoop data and high availability
Work with HDFS commands, file permissions, and storage management
Move data, and use YARN to allocate resources and schedule jobs
Manage job workflows with Oozie and Hue
Secure, monitor, log, and optimize Hadoop
Benchmark and troubleshoot Hadoop

About This E-Book

EPUB is an open, industry-standard format for e-books. However, support for EPUB and its many features varies across reading devices and applications. Use your device or app settings to customize the presentation to your liking. Settings that you can customize often include font, font size, single or double column, landscape or portrait mode, and figures that you can click or tap to enlarge. For additional information about the settings and features on your reading device or app, visit the device manufacturer's Web site.

Many titles include programming code or configuration examples. To optimize the presentation of these elements, view the e-book in single-column, landscape mode and adjust the font size to the smallest setting. In addition to presenting code and configurations in the reflowable text format, we have included images of the code that mimic the presentation found in the print book; therefore, where the reflowable format may compromise the presentation of the code listing, you will see a "Click here to view code image" link. Click the link to view the print-fidelity code image. To return to the previous page viewed, click the Back button on your device or app.

Expert Hadoop Administration

Managing, Tuning, and Securing Spark, YARN, and HDFS

Sam R. Alapati

Boston Columbus Indianapolis New York San Francisco Amsterdam Cape Town
Dubai London Madrid Milan Munich Paris Montreal Toronto Delhi Mexico City
São Paulo Sydney Hong Kong Seoul Singapore Taipei Tokyo

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

For information about buying this title in bulk quantities, or for special sales opportunities (which may include electronic versions; custom cover designs; and content particular to your business, training goals, marketing focus, or branding interests), please contact our corporate sales department at or (800) 382-3419.

For government sales inquiries, please contact .

For questions about sales outside the U.S., please contact .

Visit us on the Web: informit.com/aw

Library of Congress Control Number: 2016954056

All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, request forms and the appropriate contacts within the Pearson Education Global Rights & Permissions Department, please visit www.pearsoned.com/permissions/.

ISBN-13: 978-0-13-459719-5
ISBN-10: 0-13-459719-2

1 16

To my cousin, Alapati Srinath, whom I consider my own brother. Thank you, Srinath, for your kindness, affection, and above all, graciousness, all of which have meant a lot to me over the years.

Foreword

Apache Hadoop 2 and the upcoming 3 were a major step forward in moving beyond the paradigm of MapReduce. At the core of this is the new YARN (Yet Another Resource Negotiator) processing framework for creating APIs and processing engines on top of Hadoop and HDFS, including the original MapReduce paradigm. Hadoop 2 is a significant upgrade to Hadoop 1, requiring updates to how a cluster is set up, managed and administered. This book provides everything a developer, operator or administrator would need to manage a production Hadoop 2 cluster of any size.

While Hadoop 2 and 3 at the core are HDFS and YARN, there are many other projects that are included in a typical production Hadoop cluster. For example, Hive, Pig, Spark, Flume and Kafka are often paired with the core Hadoop infrastructure to provide additional functionality and features. This book includes coverage of many of these complementary projects with introductory materials good for developers and administrators alike.

Sam Alapati is the principal Hadoop administrator at Sabre Holdings and has been working with production Hadoop clusters for the last six years. He's uniquely qualified to cover the administration of production clusters and has pulled everything together in this single resource. The depth of experience that Sam brings to this book has enabled him to write much more than a simple introduction to Hadoop and Spark. While it does provide that introductory material, it will be the go-to resource for administrators looking to spec, size, expand and secure their production Hadoop clusters.

Paul Dix, Series Editor

Preface

Apache Hadoop is a popular open-source software framework for storing and processing large sets of data on a platform consisting of clusters of commodity hardware. The main idea behind Hadoop is to move computation to the data, instead of the traditional way of moving data to computation. Scalability lies at the heart of Hadoop, and one of the big reasons for its considerable popularity in the big data world we live in today is its extreme cost effectiveness owing to the use of commodity servers and open-source software.

I started working on this book in the fall of 2014. Hadoop 2 had come out a few months earlier, and there were numerous interesting changes in the Hadoop architecture in the new release. There was one very good book on administering generic (without the use of a third-party vendor's tools) Hadoop clusters (Hadoop Operations by Eric Sammer), but, over time, it became outdated in several areas (it was published in 2012). Tom White's book Hadoop: The Definitive Guide of course is wonderful, and it contains several useful discussions pertaining to Hadoop administration, but it's a book more geared toward developers and architects than cluster administrators. I decided to write this book to provide Hadoop users a comprehensive guide to administering, securing, and optimizing their Hadoop clusters.

As I progressed with the book, Spark became the most important processing framework for Hadoop. I therefore added four chapters to discuss the architecture of Spark, the nature of Spark applications and how to manage and optimize Spark jobs running in a Hadoop cluster.

In this book, I explain how to manage, optimize, and secure Hadoop environments by working directly with the Hadoop configuration files. You may wonder if you really need to learn how to administer Hadoop from the ground level up. Like many of the people that manage Hadoop environments, I use third-party Hadoop distributions such as Cloudera and Hortonworks. Of course, using a tool such as Cloudera Manager or Apache Ambari to manage a Hadoop cluster makes your life really easy. However, I realized that in order to master Hadoop environments, and to get the most out of your Hadoop cluster, you must understand what actually happens behind the scenes when you work with a management tool to administer your cluster. This is possible only if you learn how to build a cluster from scratch and learn how to configure it for various purposes (high availability, performance, security, encryption) as you go along.
