Published by:
2 Lindsley Road
Basking Ridge, NJ 07920 USA
https://www.TechnicsPub.com
Cover design by Manfred Christiansen
Edited by Lauren McCafferty
All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without written permission from the publisher, except for the inclusion of brief quotations in a review.
The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.
All trade and product names are trademarks, registered trademarks, or service marks of their respective companies, and are the property of their respective holders and should be treated as such.
Copyright 2016 by Thomas Frisendal
ISBN, print ed. 9781634621212
ISBN, ePub ed. 9781634621236
First Printing 2016
Library of Congress Control Number: 2016948320
To my wife, Ellen-Margrethe, who fully supported me on this, my second book project. Being a writer herself, she was well aware of the implications!
Foreword By Karen Lopez
I started my data modeling and database career right out of university in the 1980s. This was around the time that data processing was undergoing a technological revolution: relational database systems (RDBMSs) were becoming increasingly present in enterprise environments. There was controversy, with Dr. Codd evaluating commercial products against his relational model, and vendors adding relational layers onto their pre-relational products.
It was both a confusing time and an exciting time to be entering the data profession.
At that time, data modeling was virtually unheard of in enterprise IT. When Information Engineering became popular in the late eighties and early nineties, data and process modeling were the de facto methods for designing database applications. Naturally, the logical data models for discussing business requirements used the same notation; it made sense to use a notation that mimicked relational tables. Entity Relationship Diagrams (ERDs) are still the most common method for expressing business and technical models. With the advent of data warehousing and business intelligence for read-focused database uses, we made some changes to data modeling methods, but these remained relational notations.
Fast forward all these decades, and we in the data world are facing another revolution with Not-Only SQL (NoSQL) technologies. These solutions (often called "schemaless") came with promises of "no modeling required." Yet the IT world is figuring out that we still need logical and physical modeling. These models don't necessarily specify the structure of data, but they do describe the meaning and known features of data. We also have a perfect storm of open source projects, cloud technologies, and global collaboration. The result is more than a handful of candidate database solutions. In fact, we now have tens of thousands of database types, versions, and distributions to choose from. Even so, we still use tools and methods that express themselves in relational models.
In this book, Thomas Frisendal raises important questions about the continued usefulness of traditional data modeling notations and approaches:
- Are ERDs relevant to analytical data requirements?
- Are ERDs relevant in the new world of big data?
- Are ERDs still the best way to work with business users to understand their needs?
- Are Logical and Physical Data Models too closely coupled?
- Are we correct in using the same notations for communicating with business users and developers?
- Should we refine our existing notations and tools to meet these new needs, or should we start again from a blank page?
- What new notations and approaches will we need?
- How will we use those to build enterprise database systems?
Frisendal takes us through the history of data modeling, enterprise data models, and traditional modeling methods. He points out, quite contentiously, where he feels we have gone wrong and a few places where we got it right. He then maps out the psychology of meaning and context, while identifying important issues about where data modeling may or may not fit in business modeling. The main subject of this work is a proposal for a new exploration-driven modeling approach and new modeling notations for business concept models, business solutions models, and physical data models, with examples of how to leverage these for implementation into any target database or data store. These new notations are based on a property graph approach to modeling data.
I have a feeling we'll be seeing more of these proposals for helping data professionals navigate the data revolution we are experiencing. It's an exciting time.
Karen Lopez, Data Evangelist
Love Your Data, www.datamodel.com
Chapter 1 Introduction
1.1. Motivation
I have worked with databases since they were first commercially available on a large scale. Along the road we tried a number of approaches, some of which were not exactly right the first time (seen in hindsight). For example, data modeling left the business realm to become an engineering activity. The real world ended up consisting of improper table designs, surrogate (internal) keys, and other pragmatic solutions to real problems.
Then I discovered concept mapping: a form of concept model which today has been successfully adopted by the business rules community. I also followed the graph world quite closely, but it was the advent of the property graph style of graphing that triggered me to write this book. My intent is to set a new standard for visualization of data models based on the property graph approach.
One of the major differences between relational and non-relational modeling is the absence of schemas (in the form of pre-defined metadata) existing alongside the data. Once upon a time, names and data types were determined once and rarely changed over time. The new world, with its general absence of schemas or its self-describing data, changes this. Despite the fact that "No" in NoSQL stands for "not only," most people associate it with "Not SQL." In any case, the absence of schemas does not imply the absence of business requirements or of the modeling of these requirements, which is a major theme of this book. We will also focus on the business requirements of good data modeling in schema-less contexts. Rather than modeling data architecture on complex mathematics, I believe we should focus on the psychology of the end user. If we do so, then engineering could be replaced with relevant business processes. In short, to achieve more logical and efficient results, we need to return to data modeling's roots.