Note
Safari Books Online is an on-demand digital library that delivers expert content in both book and video form from the world's leading authors in technology and business.
Technology professionals, software developers, web designers, and business and creative professionals use Safari Books Online as their primary resource for research, problem solving, learning, and certification training.
Safari Books Online offers a range of plans and pricing for enterprise, government, education, and individuals.
Members have access to thousands of books, training videos, and prepublication manuscripts in one fully searchable database from publishers like O'Reilly Media, Prentice Hall Professional, Addison-Wesley Professional, Microsoft Press, Sams, Que, Peachpit Press, Focal Press, Cisco Press, John Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe Press, FT Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, Course Technology, and hundreds more. For more information about Safari Books Online, please visit us online.
How to Contact Us
Please address comments and questions concerning this book to the publisher:
- O'Reilly Media, Inc.
- 1005 Gravenstein Highway North
- Sebastopol, CA 95472
- 800-998-9938 (in the United States or Canada)
- 707-829-0515 (international or local)
- 707-829-0104 (fax)
We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at http://bit.ly/building-applications-on-mesos.
To comment or ask technical questions about this book, send email to .
For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.
Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://www.youtube.com/oreillymedia
Acknowledgments
This book took a huge amount of work, and it wouldn't have been possible without the help and support of many people.
First, I'd like to thank Brian Foster and the team at O'Reilly. They did so much to make this book a reality.
I'd also like to thank Two Sigma, my employer, for giving me the time and support to write this book.
The quality of the book was improved immeasurably thanks to the feedback and reviews I received from Matt Adereth, Adam Bordelon, Niklas Nielsen, and David Palaitis.
Finally, I'd like to thank my wife, Aysylu Greenberg, for her love and support throughout the writing process.
Chapter 1. Introduction to Mesos
Let's take a trip back in time, to the year 1957. Computers that use transistors are starting to proliferate across universities and research laboratories. There is a problem, though: only one person can use a computer at a time. So, we have paper sign-up sheets so that we can reserve time slots on the machines. Since computers are so much more powerful than pencil and paper, they are in high demand. At the same time, since the computers are so expensive, if people don't use their whole reservations, then thousands of dollars of compute-time could be wasted! Luckily, the idea of operating systems already existed, more or less, at the time. A brilliant man named John McCarthy, who also invented LISP, had a great idea: what if all the users could submit their jobs to the computer, and the computer would automatically share its CPU resources among the many different jobs?
Jobs Became Programs
What we now call applications or programs used to be called jobs. We still can see this terminology in our shells, where, once we've backgrounded a process, we use the jobs command to inspect all the programs we've launched in the shell.
Once a single machine could be shared between jobs, we didn't need humans to supervise the sign-up sheets. Now, everyone could use the machine, and share it more easily, since the machine could enforce quotas, priorities, and even egalitarian fairness (if so desired).
Fast-forward to 2010: with the falling costs of networked data transmission and storage, it's now possible to store every bit of information you can collect. To process all this data, you probably need to use Storm (a distributed real-time data processing system) and Hadoop. So, you get a whole mess of machines: a few for the Hadoop JobTracker and Storm Nimbus (each with its own painstakingly crafted configuration), a few more for the HDFS NameNode and Secondary NameNode, 15 more that you'll install Hadoop TaskTrackers and HDFS DataNodes on, and 10 more that you use to run Storm Supervisors. At this point, you've managed to purchase 30 machines. However, if you decide that instead you'd like to use five of your Hadoop machines as Storm workers, you're in for a painful task, because you now need to completely reprovision the Hadoop machines as Storm machines, a process that, as many practitioners can vouch, is not as easy as we'd wish.