Build Your Own Distributed Compilation Cluster: A Practical Walkthrough
Hunter Davis
Copyright 2011 by Hunter Davis
Smashwords Edition
Smashwords License Statement
This ebook is licensed for your personal enjoyment only. This ebook may not be re-sold or given away to other people. If you would like to share this book with another person, please purchase an additional copy for each reader. If you're reading this book and did not purchase it, or it was not purchased for your use only, then please return to Smashwords.com and purchase your own copy. Thank you for respecting the hard work of this author.
Introduction
Hello, interested readers. My name is Hunter Davis and I run a start-up called Discursive Labs. For the past ten years I've been publishing software and hacks on my website www.hunterdavis.com, and in the process I've written many instructional guides. During the course of my tenure at Discursive Labs, I ran a series of articles about compilers, low power compilation clusters, and the like. I walked our readers through a full cross-compilation cluster installation, the creation of a distributed make system from scratch, and the headaches and hurdles that come with such an endeavor. It is my intention that those who make it through this book will have gained both a practical knowledge of these systems and a valuable roadmap around some of the nastier pitfalls.
Throughout the following 6 how-to articles, I'll take you through the process of building a fully working cross-compilation distributed build system. This system will be generic enough to apply to most any compilation environment, while remaining powerful enough to outperform all but the most advanced compilation systems. With source code examples provided and easy step by step instructions, this 60+ page instructional eBook is a valuable introductory and practical resource for those interested in distributed compilation, cross compilation, low power computing clusters, and so much more. It's also one terrific bargain, and an excellent reference.
Cutting Development Costs and Carbon Footprints through Alternative Cluster Architectures
We are all familiar with the standard CPU architectures most enterprise-level developers support: single- or multi-core 32-bit and, increasingly, 64-bit instruction sets on x86 or high end PPC chipsets. This is all fine and dandy most of the time and covers the most common platforms for an enterprise app. If your software is truly cross-platform, however, there is a world of cost-saving and performance-improving CPU technologies waiting just outside your door; the performance and cost benefits of low-power and alternative CPU architectures may surprise you. Though it isn't applicable to everyone, I'm going to show how using a compilation cluster of low-power ARM processors can save money and significantly reduce your carbon footprint.
First and foremost, throw away what you think you know about low end processors for high-performance computing; the world has changed! Low end 1GHz and near-1GHz ARM reference boards (à la the Pogoplug) are often available through commercial channels for less than $50, and through manufacturing channels at a potential discount, depending on your distributor. These boards can run stock commercial operating systems, and most software development libraries have been ported (Ubuntu ARM has become quite popular). Though there's an interesting argument to make for the execution side of high-performance software on low end processors (something I'm sure I'll get into in another blog post), I'll be concentrating on the development side of things. Historically, any green improvements to development have come at the cost of decreased performance or increased price. This puts business owners in a difficult position: improving the ecological impact of a company must either hurt the company's bottom line or hurt the efficiency and, often, the day-to-day morale of the company's developers. For a lot of small companies, the only compromise is against the efficiency and morale of the developers.
During development at a small business there are, at minimum, 2 stakeholders who are vested in the development process: the developer and the business itself. Consider a developer working on a large project. He may have a local computing machine for compiles, a remote compile cluster or some more exotic configuration. Regardless of the individual setup, the developer is actively engaged during coding and debugging but is generally disengaged during compilation. For this reason, developers are almost always concerned with the compilation time; the faster it compiles, the sooner the developer can re-engage. So as a stakeholder, the developer is concerned with the time it takes for compilation.
Now, consider the interests of the business. Compilation time, from a business standpoint, is not directly productive; faster compilation means more productive developer time which can be spent fixing bugs and adding features. In addition to this time loss, there are direct financial losses. The resources spent towards compilation, including the amortized cost of the machines used and the standard costs of operating those machines, directly impact the company's bottom line. Thus the company, as a stakeholder, is concerned with both time and cost! As such, when planning technical infrastructure a business has a number of financial factors to consider:
The initial cost of a developer's workstation and/or compilation cluster resources.
The lost productivity costs of an idle developer, assuming that compilation time is inherently idle time for a developer.
The electricity and running cost of the workstation and/or cluster resources.
Additional, ancillary infrastructure costs associated with the workstation and/or cluster resources (network hardware, service fees, licenses, etc.).
These various costs can be estimated and graphed using any number of business tools (or if you work for Discursive Labs, our sliders tool). In the example below, I'll use actual performance numbers and hardware costs from our internal development process. Though there are a great many advantages to low-power chip clusters, I'll be concentrating on the two stakeholders' primary concerns: speed and cost.
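As a rough illustration of how the four cost factors listed above might be combined into a single yearly figure, here is a minimal Python sketch. It is not the Discursive Labs tool; the class name, the function name, the amortization approach, and every parameter are assumptions made for illustration, and you would plug in your own prices and hours.

```python
from dataclasses import dataclass


@dataclass
class BuildSetup:
    """Cost inputs for one workstation or cluster (hypothetical fields)."""
    hardware_cost: float           # initial purchase price, in dollars
    power_watts: float             # average power draw while switched on
    ancillary_cost: float          # network hardware, fees, licenses, in dollars
    compile_hours_per_year: float  # developer hours spent waiting on compiles


def yearly_cost(setup, developer_rate, price_per_kwh, hours_on_per_year,
                amortization_years):
    """Add up the four cost factors as one yearly dollar figure.

    Assumes hardware and ancillary costs are spread evenly over
    amortization_years, and that every compile hour is idle developer time.
    """
    hardware = setup.hardware_cost / amortization_years
    idle_developer = setup.compile_hours_per_year * developer_rate
    electricity = setup.power_watts / 1000.0 * hours_on_per_year * price_per_kwh
    ancillary = setup.ancillary_cost / amortization_years
    return hardware + idle_developer + electricity + ancillary
```

Evaluating yearly_cost once for a desktop and once for a cluster, with the same developer rate and electricity price, gives two numbers that can be compared directly. The even amortization is only one possible choice; a business may prefer its own depreciation schedule.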
When it comes to computers, people think "performance" implies "expensive". Although this was almost entirely true in the recent past (standard x86 pricing appears to bear this out), recent processor lines are causing this to change.
Consider the time taken to compile one of our internal projects, Source Tree Visualizer:
Clearly, the quad-core desktop trumps the single ARM board in direct compilation time. Now consider a different problem. How many networked, single-core ARM boards does it take to compile in the same amount of time as the quad-core? While compilation is not a linear process, it is inherently parallelizable and modern compilation systems are quite adept at subdividing the task. If we assume that, on average, each compilation unit requires the same computation time, then we just need to estimate the overhead to get an approximate speed increase. In a one-to-many master-slave configuration, overhead for parallelization and interconnect latency with a cluster of this size is at worst 20% (admittedly, this is based solely on prior experience). Therefore, we should generously be able to match the compilation speed of the quad-core with one master node and 6 slave nodes; this assumes that only the slave nodes perform compilation tasks and the master is only responsible for task coordination.
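The estimate above can be sketched in a few lines of Python. This is only one reading of the reasoning: it assumes that n worker nodes ideally take 1/n of the single-board time, and that the 20% overhead inflates that ideal time by a factor of 1.2. The function names are hypothetical, and the compile times in the usage note are placeholders rather than measurements.

```python
import math


def workers_needed(single_board_seconds, target_seconds, overhead=0.20):
    """Estimate how many equal worker nodes match a target compile time.

    Assumes every compilation unit costs the same, so n workers ideally take
    single_board_seconds / n, and that coordination overhead inflates that
    ideal time by the factor (1 + overhead).
    """
    if single_board_seconds <= 0 or target_seconds <= 0:
        raise ValueError("compile times must be positive")
    return math.ceil(single_board_seconds * (1.0 + overhead) / target_seconds)


def cluster_size(single_board_seconds, target_seconds, overhead=0.20):
    """Workers plus one coordinating master that does no compilation."""
    return workers_needed(single_board_seconds, target_seconds, overhead) + 1
```

For example, with a placeholder single-board time of 480 seconds and a target of 100 seconds, workers_needed(480, 100) asks for 6 workers and cluster_size(480, 100) counts 7 machines once the master is included. Substitute your own measured times to see how sensitive the node count is to the overhead guess.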
That's seven computers. Seems like a fairly large number of computers to be compiling a single program, doesn't it? Especially when considering the electricity usage and initial hardware costs, it might be reasonable to assume that the quad-core desktop would be the more economical purchase. The issue is not that clear-cut, however. Let's go over the numbers, starting with initial cost: