Supplemental files and examples for this book can be found at http://examples.oreilly.com/9780596009571/. Please use a standard desktop web browser to access these files, as they may not be accessible from all ereader devices.
All code files or examples referenced in the book will be available online. For physical books that ship with an accompanying disc, whenever possible, we've posted all CD/DVD content. Note that while we provide as much of the media content as we are able via free download, we are sometimes limited by licensing restrictions. Please direct any questions or concerns to .
Preface
In the summer of 2003, somebody on the MySQL mailing list proposed a book about MySQL internals. As I read the email, I realized that I had the background to write such a book, but I had just finished writing my first book and was not looking forward to writing another. I tried to talk myself out of the responsibility, saying to myself nobody would ever publish a book so technical and specialized. There simply would not be enough of an audience for it.
Then I thought of Understanding the Linux Kernel and Linux Device Drivers by O'Reilly. That took away my excuse. I realized the door was open and I was standing in the doorway, but my inertia was keeping something good from happening. I thought about a passage in the Book of Mormon that says "a natural man is an enemy to God," and the principle behind it. If you drift along, seeking only the pleasure of the moment and staying safely within your natural comfort zone, you do not accomplish much. Good things happen when you push yourself outside of your comfort zone, doing what is difficult but what you know deep inside is the right thing to do. I wrote an email with a proposal to O'Reilly.
Interestingly enough, my editor happened to be Andy Oram, who also participated in the publication of Understanding the Linux Kernel and Linux Device Drivers. He and I worked together on this book, and I appreciate his help very much. I felt that his strengths compensated very well for my weaknesses.
The book presented a number of challenges. Writing about the internals of an application means approaching it as a developer rather than just a user or an administrator. It requires a deeper level of understanding. Although I had worked on the MySQL source code extensively, I found myself doing a lot of research to figure out the gory details of algorithms, the purposes of functions and classes, the reasons for certain decisions, and other matters relevant to this book. In addition, as I was writing the book, MySQL developers were writing new code. It was not easy to keep up. And while the book was being written, I had to do other work to feed my growing family. Fortunately, a good portion of that work involved projects that dealt with MySQL internals, allowing me to stay on top of the game.
Nevertheless, the challenges were worth it. Growth comes through challenges, and I feel it did for me in this process. Now that I have finished the book, I have a better view of the design of MySQL as a whole, and a better knowledge of its dark and not-so-dark parts. It is my hope that the reader will experience a similar growth.
How This Book Is Organized
MySQL History and Architecture
Introduces the major modules in the source code and their purpose.
Nuts and Bolts of Working with the MySQL Source Code
Tells you how to download the source code and build a server from scratch.
Core Classes, Structures, Variables, and APIs
Lists the basic data structures, functions, and macros you need for later reference.
Client/Server Communication
Lays out the formats of the data sent between client and server, and the main functions that perform the communication.
Configuration Variables
Discusses how MySQL handles configuration in general, as well as the effects of many particular configuration variables, and shows you a framework for adding a new configuration variable.
Thread-Based Request Handling
Explains MySQL's reasons for using threads and the main variables, such as locks, related to threads.
The Storage Engine Interface
Describes the relation of individual storage engines (formerly known as table types) to the MySQL core, and shows you a framework for adding a new storage engine.
Concurrent Access and Locking
Explains the different types of locks available in MySQL, and how each storage engine uses locks.
Parser and Optimizer
Explains the major activities that go into optimizing queries.
Storage Engines
Briefly describes the most important MySQL storage engines and some of the tree structures and other data structures they employ.
Transactions
Lists the main issues required to support transactions, and uses InnoDB to illustrate the typical architecture used to provide that support.
Replication
Gives an overview of replication with an emphasis on issues of implementation.
Who This Book Is For
This book can be useful for a number of readers: a developer trying to extend MySQL in some way; a DBA or database application programmer interested in how exactly MySQL runs his queries; a computer science student learning about database kernel development; a developer looking for ideas while working on a product that requires extensive database functionality that he must implement himself; a closed-source database developer wondering how in the world MySQL runs its queries so fast; a random, curious computer geek who has used MySQL some and wonders what is inside; and, of course, anybody who wants to look smart by having a book on MySQL internals displayed on his shelf.
Although the MySQL source is open in the sense of being publicly available, it is in essence closed to you if you do not understand it. It may be intimidating to look at several hundred thousand lines of code, written by gifted programmers, that elegantly and efficiently solve difficult problems one line at a time. To understand the code, you will need a measure of the inspiration and perspiration of those who created it. Hopefully, this book can provide enough guidance to remove those barriers and to open the source of MySQL for you.
I do not believe it is possible to understand and appreciate MySQL strictly through a conceptual discussion. On a high conceptual level MySQL is very simple. It does not implement many revolutionary ideas. It sticks to the basics. Why is it so popular then? Why do we know enough about it for O'Reilly to be willing to publish a book on its internals?
The reason, in my opinion, is that what makes a good database is not so much the concepts behind it as how well they are implemented. It is important to be conceptually sound on a basic level, but a good portion of the genius is in implementing those concepts in a way that provides a reasonable combination of good performance and ease of maintenance. In other words, the devil is in the details, and MySQL developers have done a great job of taking that devil by the horns and twisting his head off.
Thus, in order to appreciate the inner workings of MySQL, you need to get close to the places where that devil is being subdued. Somewhere in the dark depths of the optimizer or inside the B-tree, there is music to be heard as you study the code. It will take some work to hear that music, but once you do, you can feel its beauty. And to hear the music you must not be afraid to compile the code, add a few debugging messages to help you understand the flow, and perhaps even change a few things to appreciate what will make the server crash (and how) if you fail to handle something that turns out to be important after all.