Foreword
Data integration has been the information systems profession's most enduring challenge.
It is almost four decades since Richard Nolan nominated data administration as the penultimate stage of his data processing maturity model, recognizing that the development of applications to support business processes would, unless properly managed, create masses of duplicated and uncoordinated data.
In the early days of database technology, some of us had a dream that we could achieve Nolan's objective by building all of our organizations' databases in a coordinated manner to eliminate data duplication: "Capture data once, store it in one place, and make it available to everyone who needs it" was the mantra.
Decentralized computing, packaged software, and plain old self-interest put an end to that dream, but in many organizations the underlying ideas lived on in the form of data management initiatives based on planning and coordination of databases, notably in the form of enterprise data models. Their success was limited, and organizations turned to tactical solutions to solve the most pressing problems. They built interfaces to transfer data between applications rather than capturing it multiple times, and they pulled it together for reporting purposes in what became data warehouses and marts. This pragmatic approach embodied a willingness to accept duplicated data as a given that was not attractive to the purists.
The tension between a strategic, organization-wide approach based on the disposition of data and after-the-fact spot solutions remains today. But the scale of the problem has grown beyond anything envisaged in the 1970s.
We have witnessed extraordinary advances in computing power, storage technology, and development tools. Information technology has become ubiquitous in business and government, and even midsized organizations count their applications in the thousands and their data in petabytes. But each new application, each new solution, adds to the proliferation of data. Increasingly, these solutions are off the shelf, offering the buyer little say in the database design and how it overlaps with existing and future purchases.
Not only has the number of applications exploded, but the complexity of the data within them is worlds away from the simple structures of early files and databases. The Internet and smartphones generate enormous volumes of less structured data; data embraces documents, audio, and video; and cloud computing both extends the boundary of the organization's data and further facilitates acquisition of new applications.
The need for data integration has grown proportionately or, more correctly, disproportionately, as the number of possible interfaces between systems increases exponentially. What was once an opportunistic activity is becoming, in many organizations, the focus of their systems development efforts.
The last decade has seen important advances in tools to support data integration through messaging and virtualization. This book fills a vital gap in providing an overview of this technology in a form that is accessible to nonspecialists: planners, managers, and developers. April Reeve brings a rare combination of business perspective and detailed knowledge from many years of designing, implementing, and operating applications for organizations as an IT technician, manager and, more recently, a consultant using the technologies in a variety of different environments.
Perhaps the most important audience will be data managers, in particular those who have stuck resolutely to the static data management model and its associated tools. As the management of data in motion comes to represent an increasing proportion of the information technology budget, it demands strategic attention, and data managers, with their organization-wide remit, are ideally placed to take responsibility. The techniques in this book now form the mainstream of data integration thinking and represent the current best hope of achieving the data administration goals Nolan articulated so long ago.
Graeme Simsion
Acknowledgements
First of all, I want to acknowledge the contribution of my husband, Tom Reeve, who said I had to acknowledge him for making me dinner. During the course of writing this book he made me dinner hundreds of times. Additionally, he put up with my constant mantra that I have to write instead of doing so many other things such as exercising or cleaning the house.