Neo4j in Action
Aleksa Vukotic and Nicki Watt with Tareq Abedrabbo, Dominic Fox, and Jonas Partner
Copyright
For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact
Special Sales Department, Manning Publications Co., 20 Baldwin Road, PO Box 761, Shelter Island, NY 11964. Email: orders@manning.com
©2015 by Manning Publications Co. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.
Recognizing the importance of preserving what has been written, it is Manning's policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.
| Manning Publications Co., 20 Baldwin Road, PO Box 761, Shelter Island, NY 11964 | Development editor: Karen Miller; Technical development editor: Gordon Dickens; Copyeditor: Andy Carroll; Proofreader: Elizabeth Martin; Technical proofreader: Craig Taverner; Typesetter: Dennis Dalinnik; Cover designer: Marija Tudor |
ISBN: 9781617290763
Printed in the United States of America
1 2 3 4 5 6 7 8 9 10 EBM 19 18 17 16 15 14
Foreword
The database world is experiencing an enormous upheaval, with the hegemony of relational databases being challenged by a plethora of new technologies under the NoSQL banner. Among these approaches, graphs are gaining substantial credibility as a means of analyzing data across a broad range of domains.
Most NoSQL databases address the perceived performance limitations of relational databases, which flounder when confronted with the exponential growth in data volumes that we've witnessed over the last few years. But data growth is only one of the challenges we face. Not only is data growing, it's also becoming more interconnected and more variably structured. In short, it's becoming far more networked.
In addressing performance and scalability, NoSQL has generally given up on the capabilities of the relational model with regard to interconnected data. Graph databases, in contrast, revitalize the world of connected data, outperforming relational databases by several orders of magnitude. Many of the most interesting questions we want to ask of our data require us to understand not only that things are connected, but also the differences between those connections. Graph databases offer the most powerful and best-performing means for generating this kind of insight.
Connected data poses difficulties for most NoSQL databases, which manage documents, columns, or key/value pairs as disconnected aggregates. To create any semblance of connectedness using these technologies, we must find a way to both denormalize data and fudge connections onto an inherently disconnected model. This is not a trivial undertaking, as we have discovered in building Neo4j itself!
Neo4j has come to fruition over the same timeframe as the other frontrunners in the NoSQL world. (In fact, Neo4j predates many other NoSQL technologies by several years.) Neo4j provides traditional database-like support (including transactional safety) for highly connected data, while also providing orders of magnitude (minutes to milliseconds) better performance than relational databases. For domains as varied as social computing, recommendation engines, telecoms, authorization and access control, routing and logistics, product catalogs, datacenter management, career management, fraud detection, policing, and geospatial, Neo4j has demonstrated it's an ideal choice for tackling complex data.
Because Neo4j is by far the most popular graph database, it's the one that most developers will encounter. We know that this first contact with a new technology like Neo4j can be bewildering. The tyranny of choice regarding different APIs, bindings, query languages, and frameworks can be daunting, and it's easy to be put off.
Neo4j in Action addresses these concerns by getting developers up and running quickly with Neo4j. It takes a pragmatic programmatic tour through Neo4j's APIs and its query language, and provides examples based on the authors' extensive real-world use of the database. Complementing this development advice, the authors also discuss deployment options and solution architectures. The result is a rounded, holistic view of Neo4j as seen in the context of the full systems development lifecycle.
As Neo4j contributors and authors ourselves, we value Neo4j in Action for its no-nonsense, hands-on approach, and its willingness to back its assertions using reproducible tests. The authors are some of the most experienced Neo4j users around, and we're very pleased to see their authority and knowledge made available to all developers through this book.
JIM WEBBER
CHIEF SCIENTIST, NEO TECHNOLOGY
IAN ROBINSON
ENGINEER, NEO TECHNOLOGY
Preface
Graph issues are some of the most common problems in computer programming, and have been since the early days. Back then, hierarchy trees, access control lists, and mapping tables were built, typically, in code. When it came time to store the graphs, programmers transformed them into tables and used the relational database as underlying storage. We had to do a lot of plumbing to save the most basic graph data, but there was no other option until graph databases, with Neo4j leading the parade, entered the scene.
Neo4j started its journey more than a decade ago, with the first official version, the 1.0 release, coming out in 2010, and the more recent 2.0 release coming out in December 2013. Most of us have been involved with actively using Neo4j and watching it evolve over this period on various projects for clients. The hype and excitement around graph databases, and Neo4j in particular, have been gaining more and more traction, with many people and companies realizing that Neo4j is uniquely placed in the graph database space to provide a robust and solid solution capable of solving complex and challenging, interconnected business problems.
It is with great pleasure that we tried to distill much of this real-world experience and knowledge into this hands-on book in a way that lays solid foundations and then builds on those to help you get up and running with Neo4j as soon as possible.
Acknowledgments
This book has been some time in the making, so first and foremost a big thank you goes out to all of our families and friends who tirelessly stood by us, put up with us, and made those late evening coffees to keep us going through the many additional late hours of work required to write this book. Thank you!
First, we'd like to thank Open Credo, the company for whom we currently work (or worked) while writing this book, for the opportunity afforded us to be able to share and contribute our experiences to this book, mostly after hours, but for those precious paid hours as well. This was most appreciated!