Praise for Data Mesh
Zhamak impressively addresses what any data-driven company needs to implement data mesh at the culture, organizational, architectural, and technical levels to deliver value at scale sustainably. Anyone who works in data must have this book on their shelf and read it a couple of times. Like a good movie, they will discover new subtleties over and over again.
Andy Petrella, Founder of Kensu
In this book, Zhamak provides the nuance and detail that turns data mesh from a compelling idea into a viable strategy for putting data to work inside complex organizations.
Chris Ford, Head of Technology, Thoughtworks Spain
With Data Mesh, Zhamak was able to bring together the technical and organizational design practices that helped engineering organizations scale and package them in a way that makes sense for the data and analytics space.
Danilo Sato, Head of Data & AI for UK/Europe Thoughtworks
Data Mesh is a new concept, and data practitioners need to learn when and how to implement it. Zhamak's new book provides a good balance of theory and practice that can guide engineers on their path and enable them to make good design choices. The detailed approach in the book makes this new concept clear and useful. The best part, though, are the diagrams: sometimes a picture is worth more than 1,000 words.
Gwen Shapira, Cofounder and CPO at Nile Platform; Author of Kafka: The Definitive Guide
Data mesh is one of those concepts where we ask ourselves why we weren't doing this all along. This book will be our guide.
Jesse Anderson, Managing Director, Big Data Institute; Author of Data Teams
Few concepts have produced as much discussion in the data community as Data Mesh. In this book, Zhamak clearly lays out the principles and demystifies Data Mesh for the practitioner.
Julien Le Dem, Datakin CTO and OpenLineage project lead
A thorough and crucially needed overview of data as a product, including cultural, process, technology, and team changes required to get there. A data-driven organization needs a vibrant data mesh ecosystem where data products and teams emerge from business needs, not centralized data lakes and pipelines where data goes to.
Manuel Pais, Coauthor of Team Topologies
It's exciting watching a new paradigm appear. I watched microservices, and the underlying ideas, become mainstream and feel privileged to watch the building blocks of Data Mesh appear. But unlike many incremental improvements, Zhamak and her collaborators have transformed the underlying tension, between cross-cutting analytical data and the decoupling goals that make microservices appealing, into a transformative idea. What some may assume is merely a technical solution extends far beyond that, reconciling long-standing impedance problems between business needs and technology solutions. While complex, its multifaceted nature touches all parts of modern software development and points the way to many future innovations.
Neal Ford, Director/Software Architect/Meme Wrangler at Thoughtworks; Author of Software Architecture: The Hard Parts
In Data Mesh: Delivering Data-Driven Value at Scale, Zhamak shows a new paradigm encompassing technology and people to apply domain-driven thinking to analytics data and data products, thus enabling to derive value from data in an iterative manner.
Pramod Sadalage, Director, Data & DevOps, Thoughtworks
Data products invert the relationship between code and data, encapsulating the data and the code which serves the data: what a clean way to distinguish between data products used in analytics and microservices used for operations.
Dr. Rebecca Parsons, CTO, Thoughtworks
Data Mesh
by Zhamak Dehghani
Copyright © 2022 Zhamak Dehghani. All rights reserved.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.
- Acquisitions Editor: Melissa Duffield
- Development Editor: Gary O'Brien
- Production Editor: Beth Kelly
- Copyeditor: Charles Roumeliotis
- Proofreader: Kim Wimpsett
- Indexer: Potomac Indexing, LLC
- Interior Designer: David Futato
- Cover Designer: Karen Montgomery
- Illustrator: Kate Dullea
- March 2022: First Edition
Revision History for the First Edition
- 2022-03-08: First Release
See http://oreilly.com/catalog/errata.csp?isbn=9781492092391 for release details.
The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. Data Mesh, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.
The views expressed in this work are those of the author, and do not represent the publisher's views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
This work is part of a collaboration between O'Reilly and Starburst Data. See our statement of editorial independence.
978-1-098-11276-9
[LSI]
To Dad
Your light remains
Foreword
I've been involved in developing software for large corporations for several decades, and managing data has always been a major architectural issue. In the early days of my career, there was a lot of enthusiasm for a single enterprise-wide data model, often stored in a single enterprise-wide database. But we soon learned that having a plethora of applications accessing a shared data store was a disaster of ad-hoc coupling. Even without that, deeper problems existed. Core ideas to an enterprise, such as a customer, required different data models in different business units. Corporate acquisitions further muddied the waters.
As a response, wiser enterprises have decentralized their data, pushing data storage, models, and management into different business units. That way, the people who best understand the data in their domain are responsible for managing that data. They collaborate with other domains through well-defined APIs. Since these APIs can contain behavior, we have more flexibility for how that data is shared and, more importantly, how we evolve data management over time.
While this has been increasingly the way to go for day-to-day operations, data analytics has remained a more centralized activity. Data warehouses aimed to provide an enterprise repository of curated critical information. But such a centralized group struggled with the work and its conflicting customers, particularly since they didn't have a good understanding of the data or the needs of its consumers. A data lake helped by popularizing access to raw data, allowing analysts to get closer to the original source, but too easily became a data swamp of poor understanding and provenance.