Guy Harrison and Michael Harrison
MongoDB Performance Tuning
Optimizing MongoDB Databases and their Applications
1st ed.
Guy Harrison
Kingsville, VIC, Australia
Michael Harrison
Derrimut, VIC, Australia
Any source code or other supplementary material referenced by the author in this book is available to readers on GitHub via the book's product page, located at www.apress.com/9781484268780. For more detailed information, please visit http://www.apress.com/source-code.
ISBN 978-1-4842-6878-0 e-ISBN 978-1-4842-6879-7
https://doi.org/10.1007/978-1-4842-6879-7
© Guy Harrison, Michael Harrison 2021
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Distributed to the book trade worldwide by Springer Science+Business Media LLC, 1 New York Plaza, Suite 4600, New York, NY 10004. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail orders-ny@springer-sbm.com, or visit www.springeronline.com. Apress Media, LLC is a California LLC and the sole member (owner) is Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware corporation.
Dedicated to my darling Jenny, who makes my life joyful. - Guy
Dedicated to Oriana, without whom this book would have been completed much sooner. - Mike
Introduction
When MongoDB emerged in 2009, database technologies were at a crossroads. For more than 20 years, relational databases such as Oracle, SQL Server, and MySQL had dominated the database market. These databases, which combined the relational data model, the SQL language, and ACID transactions, had been the foundation for applications that transformed modern business and powered the Internet revolution. But by the middle of the first decade of the new century, it was clear that the relational database was failing to meet the demands of a new breed of always-on, globally scalable web applications. These new "Web 2.0" applications demanded new breeds of database management systems.
By 2010, a plethora of non-relational "NoSQL" systems had emerged: Hadoop, HBase, Cassandra, and many others. Of these non-relational upstarts, MongoDB has been, by almost any measure, the most successful. As we write this, MongoDB ranks as one of the top five database management systems. Of these top five, only MongoDB is based on 21st-century technologies. The other four (Oracle, MySQL, SQL Server, and Postgres) all have their origins in the 1980s and 1990s.
MongoDB's success can be ascribed to many factors, such as alignment with object-oriented programming paradigms and compatibility with modern DevOps practices. In the main, MongoDB has thrived because it made life easier for developers. However, in the past few years, we've seen MongoDB graduate from a "by developers, for developers" database to a platform supporting a new generation of mission-critical systems across an increasingly broad range of enterprises.
As MongoDB has matured and expanded its enterprise footprint, performance management has become increasingly important. As we know, poorly performing customer-facing applications can be fatal for today's online enterprise. For instance, when the load time for a web page increases from 1 second to 5 seconds, the probability of a user abandoning the page rises by 90%, directly impacting online revenue. And because databases perform so much disk IO and data crunching, the database is often the root cause of that poor performance.
Furthermore, in the cloud, performance management is cost management: poorly performing databases consume unnecessary CPU, memory, and IO resources that cost real money. A couple of days spent tuning a large-scale MongoDB-based cloud application could potentially save hundreds of thousands of dollars in hosting fees.
Indeed, we could even argue that performance management is an environmental imperative. The electricity that powers busy database servers costs more than just money: it's also associated with greenhouse gas production. Reducing energy consumption in the home is a social responsibility; reducing energy consumption in the data center is just as important. A badly tuned MongoDB database is like a poorly tuned car that backfires and belches smoke: it may get you from A to B, but it will cost you more in gas and exact a heavier toll on the environment.
This book is our attempt to produce a coherent and comprehensive MongoDB tuning manual. To that end, we set out with the following objectives:
To provide a methodology for MongoDB performance tuning that addresses performance issues systematically and efficiently. In particular, this methodology attempts to address causes before symptoms.
To address all aspects of MongoDB performance management, from database design through to the tuning of application code and on to server and cluster optimization.
To maintain a strong focus on tuning fundamentals. Fundamentals are usually where the most significant performance gains can be achieved and, if not addressed, usually limit the benefits gained through the application of advanced techniques.
How This Book Is Structured
The chapters of this book fall into the following broad parts:
Chapters cover methods and techniques. In these chapters, we describe a performance tuning methodology that we believe provides the most effective means of tuning MongoDB databases. We also offer some background on MongoDB architecture and on the tools that MongoDB provides for investigating, monitoring, and diagnosing MongoDB performance.
Chapters cover application and database design. Here, we cover the basics of developing an efficient document model and of indexing MongoDB collections.
Chapters cover the optimization of application code. Tuning your application code usually offers the most significant database performance opportunities and should be addressed before adjusting your server or cluster configuration. We'll look at how to optimize MongoDB find() statements, aggregation pipelines, and data manipulation statements.