• Complain

Danil Zburivsky - Hadoop Cluster Deployment

Here you can read online Danil Zburivsky - Hadoop Cluster Deployment full text of the book (entire story) in english for free. Download pdf and epub, get meaning, cover and reviews about this ebook. year: 2013, publisher: Packt Publishing, genre: Home and family. Description of the work, (preface) as well as reviews are available. Best literature library LitArk.com created for fans of good reading and offers a wide selection of genres:

Romance novel Science fiction Adventure Detective Science History Home and family Prose Art Politics Computer Non-fiction Religion Business Children Humor

Choose a favorite category and find really read worthwhile books. Enjoy immersion in the world of imagination, feel the emotions of the characters or learn something new for yourself, make an fascinating discovery.

Danil Zburivsky Hadoop Cluster Deployment

Hadoop Cluster Deployment: summary, description and annotation

We offer to read an annotation, description, summary or preface (depends on what the author of the book "Hadoop Cluster Deployment" wrote himself). If you haven't found the necessary information about the book — write in the comments, we will try to find it.

Construct a modern Hadoop data platform effortlessly and gain insights into how to manage clusters efficiently

Overview

  • Choose the hardware and Hadoop distribution that best suits your needs
  • Get more value out of your Hadoop cluster with Hive, Impala, and Sqoop
  • Learn useful tips for performance optimization and security

In Detail

Big Data is the hottest trend in the IT industry at the moment. Companies are realizing the value of collecting, retaining, and analyzing as much data as possible. They are therefore rushing to implement the next generation of data platform, and Hadoop is the centerpiece of these platforms.

This practical guide is filled with examples which will show you how to successfully build a data platform using Hadoop. Step-by-step instructions will explain how to install, configure, and tie all major Hadoop components together. This book will allow you to avoid common pitfalls, follow best practices, and go beyond the basics when building a Hadoop cluster.

This book will walk you through the process of building a Hadoop cluster from the ground up. By using practical examples and command samples, you will be able to get a cluster up and running in no time, and you will also gain a deep understanding of how various Hadoop components work and interact with each other.

You will learn how to pick the right hardware for different types of Hadoop clusters and about the differences between various Hadoop distributions. By the end of this book, you will be able to install and configure several of the most popular Hadoop ecosystem projects including Hive, Impala, and Sqoop, and you will also be given a sneak peek into the pros and cons of using Hadoop in the cloud.

What you will learn from this book

  • Choose the optimal hardware configuration for your Hadoop cluster
  • Decipher the differences between various Hadoop versions and distributions
  • Make your cluster crash-proof with Namenode High Availability
  • Learn tips and tricks for Jobtracker, Tasktracker, and Datanodes
  • Discover the most important Hadoop ecosystem projects
  • Get more value out of your cluster by using SQL with Hive and real-time query processing with Impala
  • Set up a proper permissions model for your cluster
  • Secure Hadoop with Kerberos
  • Deploy a Hadoop cluster in a cloud environment

Approach

This book is a step-by-step tutorial filled with practical examples which will show you how to build and manage a Hadoop cluster along with its intricacies.

Who this book is written for

This book is ideal for database administrators, data engineers, and system administrators, and it will act as an invaluable reference if you are planning to use the Hadoop platform in your organization. It is expected that you have basic Linux skills since all the examples in this book use this operating system. It is also useful if you have access to test hardware or virtual machines to be able to follow the examples in the book.

Danil Zburivsky: author's other books


Who wrote Hadoop Cluster Deployment? Find out the surname, the name of the author of the book and a list of all author's works by series.

Hadoop Cluster Deployment — read online for free the complete book (whole text) full work

Below is the text of the book, divided by pages. System saving the place of the last page read, allows you to conveniently read the book "Hadoop Cluster Deployment" online for free, without having to search again every time where you left off. Put a bookmark, and you can go to the page where you finished reading at any time.

Light

Font size:

Reset

Interval:

Bookmark:

Make
Hadoop Cluster Deployment

Hadoop Cluster Deployment

Copyright 2013 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: November 2013

Production Reference: 1181113

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78328-171-8

www.packtpub.com

Cover Image by Prashant Timappa Shetty (<>)

Credits

Author

Danil Zburivsky

Reviewers

Skanda Bhargav

Yanick Champoux

Cyril Ganchev

Alan Gardner

Acquisition Editor

Joanne Fitzpatrick

Commissioning Editor

Amit Ghodake

Technical Editors

Venu Manthena

Pramod Kumavat

Project Coordinator

Amey Sawant

Copy Editors

Kirti Pai

Lavina Pereira

Adithi Shetty

Aditya Nair

Proofreader

Linda Morris

Indexer

Monica Ajmera Mehta

Graphics

Ronak Dhruv

Production Coordinator

Manu Joseph

Cover Work

Manu Joseph

About the Author

Danil Zburivsky is a database professional with a focus on open source technologies. Danil started his career as a MySQL database administrator and is currently working as a consultant at Pythian, a global data infrastructure management company. At Pythian, Danil was involved in building a number of Hadoop clusters for customers in financial, entertainment, and communication sectors.

Danil's other interests include writing fun things in Python, robotics, and machine learning. He is also a regular speaker at various industrial events.

I would like to thank my wife for agreeing to sacrifice most of our summer evenings while I was working on the book. I would also like to thank my colleagues from Pythian, especially Alan Gardner, Cyril Ganchev, and Yanick Champoux, who contributed a lot to this project.

About the Reviewers

Skanda Bhargav is an Engineering graduate from Visvesvaraya Technological University, Belgaum, Karnataka, India. He did his majors in Computer Science and Engineering. He is currently employed with an MNC based out of Bangalore. Skanda is a Cloudera Certified developer in Apache Hadoop. His interests are Big Data and Hadoop.

I would like to thank my family for their immense support and faith in me throughout my learning stage. My friends have brought my confidence to a level that brings out the best in me. I am happy that God has blessed me with such wonderful people around me, without which this work might not have been as successful as it is today

Yanick Champoux is currently sailing the Big Data seas as a solutions architect. In his spare time, he hacks Perl, grows orchids, and writes comic books.

Cyril Ganchev is a system administrator, database administrator, and a software developer living in Sofia, Bulgaria. He received a master's degree in Computer Systems and Technologies from the Technical University of Sofia in 2005.

In 2002, he started working as a system administrator in an Internet Caf while studying at the Technical University of Sofia. In 2004, he began working as a software developer for the biggest Bulgarian IT company, Information Services Plc. He has been involved in many projects for the Bulgarian government, the Bulgarian National Bank, the National Revenue Agency, and others. He has been involved in several government elections in Bulgaria, writing the code that calculates the results.

Since 2012, he is working remotely for a Canadian company, Pythian. He started as an Oracle Database Administrator. In 2013, he transitioned to a newly formed team focused on Big Data and NoSQL.

Cyril Ganchev is an Oracle Advanced PL/SQL Developer Certified Professional and Oracle Database 11g Administrator Certified Associate.

I want to thank my parents for always supporting me, in all of my endeavors.

Alan Gardner is a solutions architect and developer specializing in designing Big Data systems. These systems incorporate technologies including Hadoop, Apache Kafka, and Storm, as well as Data Science techniques. Alan enjoys presenting his projects and shares his experience extensively at user groups and conferences. He also plays with functional programming and mobile and web development in his spare time.

Alan is also deeply involved in Ottawa's developer community, consulting with multiple organizations to help non-technical stakeholders organize developer events. With his group, Ottawa Drones, he runs hack days where developers can network, exchange ideas, and build their skills while programming flying robots.

I'd like to thank Paul White, Alex Gorbachev, and Mick Saunders for always helping me keep on the right path throughout different phases of my career, and Jasmin for always supporting me.

www.PacktPub.com
Support files, eBooks, discount offers and more

You might want to visit www.PacktPub.com for support files and downloads related to your book.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at >for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

httpPacktLibPacktPubcom Do you need instant solutions to your IT - photo 1

http://PacktLib.PacktPub.com

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can access, read and search across Packt's entire library of books.

Why Subscribe?
  • Fully searchable across every book published by Packt
  • Copy and paste, print and bookmark content
  • On demand and accessible via web browser
Free Access for Packt account holders

If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view nine entirely free books. Simply use your login credentials for immediate access.

Preface

In the last couple of years, Hadoop has become a standard solution for building data integration platforms. Introducing any new technology into a company's data infrastructure stack requires system engineers and database administrators to quickly learn all the aspects of the new component. Hadoop doesn't make this task any easier because it is not a single software product, but it is rather a collection of multiple separate open source projects. These projects need to be properly installed and configured in order to make the Hadoop platform robust and reliable.

Next page
Light

Font size:

Reset

Interval:

Bookmark:

Make

Similar books «Hadoop Cluster Deployment»

Look at similar books to Hadoop Cluster Deployment. We have selected literature similar in name and meaning in the hope of providing readers with more options to find new, interesting, not yet read works.


Reviews about «Hadoop Cluster Deployment»

Discussion, reviews of the book Hadoop Cluster Deployment and just readers' own opinions. Leave your comments, write what you think about the work, its meaning or the main characters. Specify what exactly you liked and what you didn't like, and why you think so.