Java Performance: The Definitive Guide
Scott Oaks
Preface
When O'Reilly first approached me about writing a book on Java performance tuning, I was unsure. Java performance, I thought: aren't we done with that? Yes, I still work on performance of Java (and other) applications on a daily basis, but I like to think that I spend most of my time dealing with algorithmic inefficiencies and external system bottlenecks rather than on anything directly related to Java tuning.
A moment's reflection convinced me that I was (as usual) kidding myself. It is certainly true that end-to-end system performance takes up a lot of my time, and that I sometimes come across code that uses an inefficient algorithm when it could use one with O(log N) performance. Still, it turns out that every day, I think about GC performance, or the performance of the JVM compiler, or how to get the best performance from Java Enterprise Edition APIs.
That is not to minimize the enormous progress that has been made in the performance of Java and JVMs over the past 15-plus years. When I was a Java evangelist at Sun during the late 1990s, the only real benchmark available was CaffeineMark 2.0 from Pendragon Software. For a variety of reasons, the design of that benchmark quickly limited its value; yet in its day, we were fond of telling everyone that Java 1.1.8 performance was eight times faster than Java 1.0 performance based on that benchmark. And that was true: Java 1.1.8 had an actual just-in-time compiler, where Java 1.0 was pretty much completely interpreted.
Then standards committees began to develop more rigorous benchmarks, and Java performance began to be centered around them. The result was a continuous improvement in all areas of the JVM: in garbage collection, in compilation, and within the APIs. That process continues today, of course, but one of the interesting facts about performance work is that it gets successively harder. Achieving an eightfold increase in performance by introducing a just-in-time compiler was a straightforward matter of engineering, and even though the compiler continues to improve, we're not going to see an improvement like that again. Parallelizing the garbage collector was a huge performance improvement, but more recent changes have been more incremental.
This is a typical process for applications (and the JVM itself is just another application): in the beginning of a project, it's easy enough to find architectural changes (or code bugs) which, when addressed, yield huge performance improvements. In a mature application, finding such performance improvements is quite rare.
That precept was behind my original concern that, to a large extent, the engineering world might be done with Java performance. A few things convinced me I was wrong. First is the number of questions I see daily about how this or that aspect of the JVM performs under certain circumstances. New engineers come to Java all the time, and JVM behavior remains complex enough in certain areas that a guide to how it operates is still beneficial. Second is that environmental changes in computing seem to have altered the performance concerns that engineers face today.
What's changed in the past few years is that performance concerns have become bifurcated. On the one hand, very large machines capable of running JVMs with very large heaps are now commonplace. The JVM has moved to address those concerns with a new garbage collector (G1), which, as a new technology, requires a little more hand-tuning than traditional collectors. At the same time, cloud computing has renewed the importance of small, single-CPU machines: you can go to Oracle or Amazon or a host of other companies and very cheaply rent a single-CPU machine to run a small application server. (You're not actually getting a single-CPU machine: you're getting a virtual OS image on a very large machine, but the virtual OS is limited to using a single CPU. From the perspective of Java, that turns out to be the same as a single-CPU machine.) In those environments, correctly managing small amounts of memory turns out to be quite important.
The Java platform also continues to evolve. Each new edition of Java provides new language features and new APIs that improve the productivity of developers, if not always the performance of their applications. Best-practice use of these language features can help to differentiate between an application that sizzles and one that plods along. And the evolution of the platform brings up interesting performance questions: there is no question that using JSON to exchange information between two programs is much simpler than coming up with a highly optimized proprietary protocol. Saving time for developers is a big win, but making sure that productivity win comes with a performance win (or at least breaks even) is the real goal.
Who Should (and Shouldn't) Read This Book
This book is designed for performance engineers and developers who are looking to understand how various aspects of the JVM and the Java APIs impact performance.
If it is late Sunday night, your site is going live Monday morning, and you're looking for a quick fix for performance issues, this is not the book for you.
If you are new to performance analysis and are starting that analysis in Java, then this book can help you. Certainly my goal is to provide enough information and context that novice engineers can understand how to apply basic tuning and performance principles to a Java application. However, system analysis is a very broad field. There are a number of excellent resources for system analysis in general (and those principles of course apply to Java), and in that sense, this book will hopefully be a useful companion to those texts.
At a fundamental level, though, making Java go really fast requires a deep understanding about how the JVM (and Java APIs) actually work. There are literally hundreds of Java tuning flags, and tuning the JVM has to be more than an approach of blindly trying them and seeing what works. Instead, my goal is to provide some very detailed knowledge about what the JVM and APIs are doing, with the hope that if you understand how those things work, you'll be able to look at the specific behavior of an application and understand why it is performing badly. Understanding that, it becomes a simple (or at least simpler) task to get rid of undesirable (badly performing) behavior.
One interesting aspect to Java performance work is that developers often have a very different background than engineers in a performance or QA group. I know developers who can remember thousands of obscure method signatures on little-used Java APIs but who have no idea what the flag -Xmn means. And I know testing engineers who can get every last ounce of performance from setting various flags for the garbage collector but who could barely write a suitable "Hello, World" program in Java.
Java performance covers both of these areas: tuning flags for the compiler and garbage collector and so on, and best-practice uses of the APIs. So I assume that you have a good understanding of how to write programs in Java. Even if your primary interest is not in the programming aspects of Java, I do spend a fair amount of time discussing programs, including the sample programs used to provide a lot of the data points in the examples.
Still, if your primary interest is in the performance of the JVM itself, meaning how to alter the behavior of the JVM without any coding, then large sections of this book should still be beneficial to you. Feel free to skip over the coding parts and focus in on the areas that interest you. And maybe along the way, you'll pick up some insight into how Java applications can affect JVM performance and start to suggest changes to developers so they can make your performance-testing life easier.