Gorelik - The Enterprise Big Data Lake

  • Book: The Enterprise Big Data Lake
  • Author: Alex Gorelik
  • Publisher: O'Reilly Media, Inc.
  • Year: 2019

The Enterprise Big Data Lake: summary, description and annotation

The data lake is a daring new approach for harnessing the power of big data technology and providing convenient self-service capabilities. But is it right for your company? This book is based on discussions with practitioners and executives from more than a hundred organizations, ranging from data-driven companies such as Google, LinkedIn, and Facebook, to governments and traditional corporate enterprises. You'll learn what a data lake is, why enterprises need one, and how to build one successfully with the best practices in this book.

Alex Gorelik, CTO and founder of Waterline Data, explains why old systems and processes can no longer support data needs in the enterprise. Then, in a collection of essays about data lake implementation, you'll examine data lake initiatives, analytic projects, experiences, and best practices from data experts working in various industries.

  • Get a succinct introduction to data warehousing, big data, and data science
  • Learn various paths enterprises take to build a data lake
  • Explore how to build a self-service model and best practices for providing analysts access to the data
  • Use different methods for architecting your data lake
  • Discover ways to implement a data lake from experts in different industries

Praise for The Enterprise Big Data Lake

Alex is a visionary in the data industry. He has encapsulated his practical insights into a thorough treatise examining the technical considerations, firm-wide implications, and leveraged business impact of transitioning to a data-driven enterprise. This is a book for any business or technical professional who wishes to succeed with data.

Keyur Desai, Chief Data Officer, TD Ameritrade

Data lakes are essential in achieving many of the benefits of decision- and analytics-driven solutions. This book does a great job clarifying the architecture of data lakes, what value they provide, what challenges they pose, and how to address those challenges.

Jari Koister, VP of Product and Technology, FICO, and professor in the data science program at UC Berkeley, California

Big Data is one of the most confusing terms in the industry today. This book breaks down the components into easy, understandable terms and explains the best ways to approach such projects. I found the sections that articulate the interconnectedness of data streams, data ponds, and data lakes especially helpful. The book is a must-read for any executive looking to understand and educate themselves on contemporary methods of analytics.

Opinder Bawa, Vice President and Chief Information Officer, University of San Francisco

I can't wait to share this book with managers I know who have joined data lake teams and need an introduction to the tools and terms they will need to converse and understand their new teams. They will also get a great idea for the direction they should try and steer their teams. This book is a great place to start, whether you are building a data lake or have inherited one.

Nicole Schwartz, Agile and Technical Product Management consultant

The Enterprise Big Data Lake

by Alex Gorelik

Copyright 2019 Alex Gorelik. All rights reserved.

Printed in the United States of America.

Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

  • Editor: Andy Oram
  • Production Editor: Kristen Brown
  • Copyeditor: Rachel Head
  • Proofreader: Rachel Monaghan
  • Indexer: Ellen Troutman Zaig
  • Interior Designer: David Futato
  • Cover Designer: Karen Montgomery
  • Illustrator: Rebecca Demarest

March 2019: First Edition

Revision History for the First Edition
  • 2019-02-19: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781491931554 for release details.

The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. The Enterprise Big Data Lake, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.

The views expressed in this work are those of the author, and do not represent the publisher's views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

978-1-491-93155-4

[LSI]

Preface

In recent years many enterprises have begun experimenting with using big data and cloud technologies to build data lakes and support data-driven culture and decision making, but the projects often stall or fail because the approaches that worked at internet companies have to be adapted for the enterprise, and there is no comprehensive practical guide on how to successfully do that. I wrote this book with the hope of providing such a guide.

In my roles as executive at IBM and Informatica (major data technology vendors), Entrepreneur in Residence at Menlo Ventures (a leading VC firm), and founder and CTO of Waterline (a big data startup), I've been fortunate to have had the opportunity to speak with hundreds of experts, visionaries, industry analysts, and hands-on practitioners about the challenges of building successful data lakes and creating a data-driven culture. This book is a synthesis of the themes and best practices that I've encountered across industries (from social media to banking and government agencies) and roles (from chief data officers and other IT executives to data architects, data scientists, and business analysts).

Big data, data science, and analytics supporting data-driven decision making promise to bring unprecedented levels of insight and efficiency to everything from how we work with data to how we work with customers to the search for a cure for cancer, but data science and analytics depend on having access to historical data. In recognition of this, companies are deploying big data lakes to bring all their data together in one place and start saving history, so data scientists and analysts have access to the information they need to enable data-driven decision making. Enterprise big data lakes bridge the gap between the freewheeling culture of modern internet companies, where data is core to all practices, everyone is an analyst, and most people can code and roll their own data sets, and enterprise data warehouses, where data is a precious commodity, carefully tended to by professional IT personnel and provisioned in the form of carefully prepared reports and analytic data sets.

To be successful, enterprise data lakes must provide three new capabilities:

  • Cost-effective, scalable storage and computing, so large amounts of data can be stored and analyzed without incurring prohibitive computational costs

  • Cost-effective data access and governance, so everyone can find and use the right data without incurring expensive human costs associated with programming and manual ad hoc data acquisition

  • Tiered, governed access, so different levels of data can be available to different users based on their needs and skill levels and applicable data governance policies

Hadoop, Spark, NoSQL databases, and elastic cloud-based systems are exciting new technologies that deliver on the first promise of cost-effective, scalable storage and computing. While they are still maturing and face some of the challenges inherent to any new technology, they are rapidly stabilizing and becoming mainstream. However, these powerful enabling technologies do not deliver on the other two promises of cost-effective and tiered data access. So, as enterprises create large clusters and ingest vast amounts of data, they find that instead of a data lake, they end up with a data swamp: a large repository of unusable data sets that are impossible to navigate or make sense of, and too dangerous to rely on for any decisions.

This book guides readers through the considerations and best practices of delivering on all the promises of the big data lake. It discusses various approaches to starting and growing a data lake, including data puddles (analytical sandboxes) and data ponds (big data warehouses), as well as building data lakes from scratch. It explores the pros and cons of different data lake architectures (on premises, cloud-based, and virtual) and covers setting up different zones to house everything from raw, untreated data to carefully managed and summarized data, and governing access to those zones. It explains how to enable self-service so that users can find, understand, and provision data themselves; how to provide different interfaces to users with different skill levels; and how to do all of that in compliance with enterprise data governance policies.
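
As a rough illustration of zones with tiered, governed access, the minimal Python sketch below models a catalog of data sets, each assigned to a zone, and a policy table that says which roles may read which zones. The zone names, role names, policy table, and function names are all hypothetical and chosen only for this example; they are not taken from the book or from any particular product.

```python
from dataclasses import dataclass

# Hypothetical zone names, ordered from least to most processed.
ZONES = ("raw", "managed", "summarized")

# Hypothetical governance policy: the zones each role is allowed to read.
ZONE_ACCESS = {
    "data_scientist": {"raw", "managed", "summarized"},
    "business_analyst": {"managed", "summarized"},
    "executive": {"summarized"},
}


@dataclass(frozen=True)
class DataSet:
    """A catalog entry: a data set name and the zone it lives in."""
    name: str
    zone: str


def can_read(role: str, data_set: DataSet) -> bool:
    """Return True if the policy lets this role read the data set's zone."""
    if data_set.zone not in ZONES:
        raise ValueError(f"unknown zone: {data_set.zone}")
    # Roles missing from the policy table get no access at all.
    return data_set.zone in ZONE_ACCESS.get(role, set())


def visible_data_sets(role: str, catalog: list) -> list:
    """Return the names of the catalog entries this role may read."""
    return [d.name for d in catalog if can_read(role, d)]


# Illustrative catalog with one data set per zone.
CATALOG = [
    DataSet("clickstream_dump", "raw"),
    DataSet("customer_master", "managed"),
    DataSet("monthly_revenue", "summarized"),
]

if __name__ == "__main__":
    for role in ZONE_ACCESS:
        print(role, visible_data_sets(role, CATALOG))
```

Keeping the policy in a single table, separate from the data sets, means access rules can change without touching the catalog. A real deployment would typically enforce such rules in the storage or query layer, not in application code.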
