Praise for Semantic Modeling for Data
Not only for Semantic Web and ML practitioners, this book illuminates the critical subject of how our clarity and precision in our language and thinking interoperate to make a tremendous impact on the fitness of our software. Highly recommended for analysts, architects, programmers: anyone in software development.
Eben Hewitt, CTO and author of Semantic Software Design and Technology Strategy Patterns
Panos's clear-sighted text offers practical guidance and pragmatic advice to help you avoid the traps of vague, misleading, and just plain wrong semantic modeling.
Helen Lippell, taxonomy and semantics consultant, UK
Among the attempts to bring logic, ontology, and semiotics in information engineering, this book is probably one of the best and more complete sources.
Guido Vetere, CEO and cofounder at Isagog
Semantic Modeling for Data
by Panos Alexopoulos
Copyright © 2020 Panos Alexopoulos. All rights reserved.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.
- Acquisitions Editor: Jonathan Hassell
- Development Editor: Michele Cronin
- Production Editor: Kate Galloway
- Copyeditor: Kim Cofer
- Proofreader: Piper Editorial, LLC
- Indexer: Ellen Troutman Zaig
- Interior Designer: David Futato
- Cover Designer: Karen Montgomery
- Illustrator: O'Reilly Media, Inc.
- September 2020: First Edition
Revision History for the First Edition
- 2020-08-19: First Release
See http://oreilly.com/catalog/errata.csp?isbn=9781492054276 for release details.
The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. Semantic Modeling for Data, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.
The views expressed in this work are those of the author, and do not represent the publisher's views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
978-1-492-05427-6
[LSI]
Preface
Knowledge graphs, ontologies, taxonomies, and other types of semantic data models have been developed and used in the data and artificial intelligence (AI) world for several decades. Their use captures the meaning of data in an explicit and shareable way, and enhances the effectiveness of data-driven applications. In the past decade, the popularity of such models has particularly increased. For example, the market intelligence company Gartner included knowledge graphs in its 2018 hype cycle for emerging technologies, and several prominent organizations like Amazon, LinkedIn, BBC, and IBM have been developing and using semantic data models within their products and services.
Behind this trend, there are two main driving forces:
- Data-rich organizations increasingly realize that it's not enough to have huge amounts of data. In order to derive value from it, you actually need this data to be clean, consistent, interconnected, and with clear semantics. This enables data scientists and business analysts to focus on what they do best: extracting useful insights from it. Semantic data modeling focuses exactly on tackling this challenge.
- Developers and providers of AI applications increasingly realize that machine learning and statistical reasoning techniques are not always enough to build the intelligent behavior they need; complementing them with explicit symbolic knowledge can be necessary and beneficial. Semantic data modeling focuses exactly on building and providing such knowledge.
Several languages, methodologies, platforms, and tools are available for building semantic models, coming from different communities and focusing on different model aspects (e.g., representation, reasoning, storage, and querying). However, the overall task of specifying, developing, putting in use, and evolving a semantic model is not as straightforward as one might think, especially as the model's scope and scale increase. The reason is that human language and thinking are full of ambiguity, vagueness, imprecision, and other phenomena that make the formal and universally accepted representation of data semantics quite a difficult task.
This book shows you what semantic data modeling entails, and what challenges you have to face as a creator or user of semantic models. More importantly, it provides you with concrete advice on how to avoid dangers (pitfalls) and overcome obstacles (dilemmas). It teaches you some fundamental and enduring semantic modeling principles that remain true, no matter which particular framework or technology you are using, and shows you how you can apply these in your specific context.
After reading this book, you will be able to critically evaluate and make better use of existing semantic models and technologies, make informed decisions, and improve the quality and usability of the models you build.
Who Should Read This Book
This book is for data practitioners who develop or use semantic representations of data in their everyday jobs (knowledge engineers, information architects, data engineers, data scientists, etc.), and for whom the explicitness, accuracy, and common understandability of the data's meaning is an important dimension of their work.
You will find this book particularly useful if you recognize yourself in one or more of the following situations:
- You are a taxonomist, ontologist, or other type of data modeler who knows a lot about semantic data modeling, though mostly from an academic and research perspective. You probably have a PhD or MSc in the field and excellent knowledge of modeling languages and frameworks, but you have had little chance to apply this knowledge in an industrial setting. You are now in the early stages of an industrial role and you have the opportunity to apply your knowledge to real-world problems. You have started realizing, though, that things are very different from what the academic papers and textbooks describe; the methods and techniques you've learned are not as applicable or effective as you thought. You face difficult situations for which there is no obvious decision to be made and, ultimately, the semantic models you develop are misunderstood, misapplied, or provide little added value. This book will help you put your valuable and hard-earned knowledge into practice and improve the quality of your work.
- You are a data or information architect, tasked with developing semantic models that can solve the problem of semantic heterogeneity between the many disparate data sources and applications or products that your organization has. For that, you have already applied several out-of-the-box semantic data management solutions that promised seamless integration, but the results you got were mostly unsatisfactory. This book will help you to better understand the not-so-obvious dimensions and challenges you need to address in order to achieve the semantic interoperability you want.