inside front cover
Data mesh development elements
Data product development cycle details
Data Mesh in Action
Jacek Majchrzak, Sven Balnojan, and Marian Siwiak, with Mariusz Sieraczkiewicz
Foreword by Jean-Georges Perrin
Manning
Shelter Island
For more information on this and other Manning titles go to
www.manning.com
Copyright
For online information and ordering of these and other Manning books, please visit www.manning.com. The publisher offers discounts on these books when ordered in quantity.
For more information, please contact
Special Sales Department
Manning Publications Co.
20 Baldwin Road
PO Box 761
Shelter Island, NY 11964
Email:
© 2023 by Manning Publications Co. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.
Recognizing the importance of preserving what has been written, it is Manning's policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.
Development editor: Ian Hough
Technical development editor: Michael Jensen
Review editor: Adriana Sabo
Production editor: Andy Marinkovich
Copy editor: Sharon Wilkey
Proofreader: Keri Hales
Technical proofreader: Al Krinker
Typesetter: Gordan Salinović
Cover designer: Marija Tudor
ISBN: 9781633439979
front matter
foreword
The data mesh is to data as agile is to software engineering, or as microservices are to architecture patterns. It will be an essential component of your future data strategy. Data Mesh in Action addresses both the technology of the data mesh and the methodology your organization can follow to implement it.
This book teleports you into the seat of the chief architect on a data mesh project. The authors will coach you through the chaotic process of your first data product. As you gain more and more of those components, your mesh will build itself. The authors' collective experience drives this transformation. Your responsibility will be to pick, choose, and adapt this framework to your needs and organization.
The data mesh is based on four key principles: domain ownership, data as a product, federated computational governance, and a self-serve data platform. The book details the organizational impact of these principles, as well as their technology, at great length. Individually, all those principles are well known to engineers and architects; the real (r)evolution of the data mesh is its ability to combine them and deliver a global approach to building modern data platforms.
In my more than 15 years of building hybrid data platforms, I have always been missing something. Whether it was due to the strict approach of ingesting data in a warehouse or the lack of governance of a lake, to name two popular patterns, there was always this feeling of "it ain't gonna work." The mesh is different. It does not focus solely on technology; it puts governance and quality at the center and allocates ownership to the real owner, not some central commanding and demanding group. As a result, with adequate self-service tools, the data mesh will liberate the forces of innovation in your organization. And that is what this book will help you achieve.
Jean-Georges Perrin,
Intelligence platform lead at PayPal,
president and cofounder of AIDAUG,
and Lifetime IBM Champion
preface
Each one of us authors has experienced, at length and at different companies, the old way of doing data, usually through centralized data lakes and data warehouses in combination with a set of central teams organized inside an analytics function. The old way basically looked like this:
Multiple decentralized development teams have data that is accessible through storage systems like a shared drive, a decentralized database, a Representational State Transfer (REST) API, or any other interface.
One or more centralized data teams are tasked with collecting this data into one monolithic pot. This is either a data lake or a data warehouse.
The same set of teams is tasked with transforming this data into something useful.
Multiple decentralized analysts, development teams, or machine learning (ML) teams pick up that transformed data and convert it into value in the form of reports, recommendation systems, or anything else they can think of.
We learned the hard way that this concept has its limits, producing a bottleneck in terms of both technology and team capacities. We all saw companies struggling to get the flow from data to value to be as productive as the companies needed it to be. Then the data mesh and the ideas behind it appeared on the horizon.
The data mesh is a decentralization paradigm. It decentralizes the ownership of data, its transformation into information, and its serving. By these means, it aims to increase the value extracted from data by removing bottlenecks in the data value stream.
The concept of the data mesh appeared on the stage in 2019 and has since lit not just the data world, but the whole technology world, on fire. The data mesh concept breaks with the current world of data, which usually treats data as a by-product of software components. This new approach turns the spotlight on data producers and gives them the responsibility to handle the data just as they would handle their software.
With this, the data mesh takes the same journey software components have taken, with microservices architectures and with the DevOps movement. It takes the same journey frontends are currently taking with microfrontends. And just as in these examples, we believe that the data mesh is the right approach to finally gain the flexibility to extract value from our data at scale, be that in business intelligence (BI), ML, or any other use case you can think of.
The data mesh concept is often referred to as a socio-technical paradigm shift: its core is not about technology but about the alignment of people, processes, and organizations. This significant complexity is why we wrote this book. However, we don't just present the available theoretical knowledge that is out there; we focus on parts of the data mesh that are, in our experience, critical for successful implementation. We have organized those parts into a digestible resource to help you put a data mesh