All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Published by Packt Publishing Ltd.
Birmingham, B27 6PA, UK.
Credits
Author
Gregory Smith
Reviewers
Kevin Grittner
Jim Mlodgenski
Scott Marlowe
Acquisition Editor
Sarah Cullington
Development Editors
Hyacintha D'Souza
Mayuri Kokate
Technical Editors
Sakina Kaydawala
Alfred John
Indexer
Hemangini Bari
Tejal Daruwale
Editorial Team Leader
Mithun Sehgal
Project Team Leader
Lata Basantani
Project Coordinator
Srimoyee Ghoshal
Proofreader
Aaron Nash
Production Coordinator
Aparna Bhagat
Cover Work
Aparna Bhagat
About the Author
Gregory Smith is a Principal Consultant for international database professional services firm 2ndQuadrant, and founder of the company's first United States office.
Writing about PostgreSQL represents his second foray into teaching database performance tuning. Greg wrote a small, free e-book titled Progress Performance FAQ in 1995, covering the basics of how to make the Progress 4GL and its associated database run faster. In 2001, he converted exclusively to using PostgreSQL 7.0 for projects, and has been watching the complexity of problems the database is capable of solving increase with every release since.
Greg has contributed feature additions to every PostgreSQL version since 8.3. He's also the creator of a growing set of add-on tools for the database, currently including pgtune, pgbench-tools, peg, and 2warm.
I was able to focus on the material in this book well enough to do it justice only through the support provided by Simon Riggs and the rest of the 2ndQuadrant staff around the world. The exposure to interesting problems to solve, and resources to solve them, has made working with 2ndQuadrant staff and clients a fertile source for PostgreSQL performance ideas over the last year.
The writing schedule pace needed to deliver a current book covering a major new database release just after it ships is grueling. I'd never have made it through so many weeks of working all seven days without the support of my family: Judy, Jerry, and Amanda.
Finally, the material in this book only exists because of the hundreds of contributors to the PostgreSQL project. And without the free sharing of ideas on mailing lists like pgsql-performance and pgsql-hackers the last few years, I'd never have been able to collect up such a wide survey of common performance issues. Whether it was having my own questions answered, or carefully considering how to answer someone else's, the interaction on those mailing lists has been vital to forming the ideas of this book.
About the Reviewers
Kevin Grittner has worked in the computer industry since 1972. While he has filled many roles during decades of consulting, working with databases has been a major focusparticularly in terms of optimization and providing frameworks for efficient application development against a database. In the mid 80s, he was the architect and primary author of the PROBER Database and Development Environment , which was never released commercially but enjoyed widespread use in certain vertical markets, such as fire departments, hospitals, and probation and parole agencies.
Jim Mlodgenski is Chief Architect at EnterpriseDB. He is one of EnterpriseDB's first employees, having joined the company in May, 2005. Over several years, Jim has been responsible for key activities such as sales engineering, professional services, strategic technology solutions delivery, and customer education.
Prior to joining EnterpriseDB, Jim was a partner and architect at Fusion Technologies, a technology services company. For nearly a decade, Jim developed early designs and concepts for Fusion's consulting projects and specialized in Oracle application development, web development, and open source information architectures.
I want to thank my wonderful wife Stacie and awesome son Paul for supporting me.
Scott Marlowe has over 25 years of experience in software development, system administration, and database development. His first program was a gradebook program for the Air Force and he's been hooked ever since.Scott works for Edline/Schoolfusion as a systems administrator and DBA.
I'd like to thank my two sons for being the greatest kids in the world, and my best friend Darren for all the expertise and knowledge we've shared in the last decade or so.
Preface
PostgreSQL has become an increasingly viable database platform to serve as storage for applications, from classic corporate database use to the latest web apps. But getting the best performance from it has not been an easy subject to learn. You need just the right combination of rules of thumb to get started, solid monitoring, and maintenance to keep your system running well, suggestions for troubleshooting, and hints for add-on tools to add the features the core database doesn't try to handle on its own.
What this book covers
, PostgreSQL Versions introduces how PostgreSQL performance has improved in the most recent versions of the databases. It makes a case for using the most recent version feasible, in contrast to the common presumption that newer versions of any software are buggier and slower than their predecessors.
, Database Hardware discusses how the main components in server hardware, including processors, memory, and disks, need to be carefully selected for reliable database storage and a balanced budget. In particular, accidentally using volatile write-back caching in disk controllers and drives can easily introduce database corruption.
, Database Hardware Benchmarking moves on to quantifying the different performance aspects of database hardware. Just how fast is the memory and raw drives in your system? Does performance scale properly as more drives are added?
, Disk Setup looks at popular filesystem choices and suggests the trade-offs of various ways to layout your database on disk. Some common, effective filesystem tuning tweaks are also discussed.
, Memory for Database Caching digs into how the database is stored on disk, in memory, and how the checkpoint process serves to reconcile the two safely. It also suggests how you can actually look at the data being cached by the database, to confirm whether what's being stored in memory matches what you'd expect to be there.