Praise for Building Secure and Reliable Systems
It is very hard to get practical advice on how to build and operate trustworthy infrastructure at the scale of billions of users. This book is the first to really capture the knowledge of some of the best security and reliability teams in the world, and while very few companies will need to operate at Google's scale, many engineers and operators can benefit from some of the hard-earned lessons on securing wide-flung distributed systems. This book is full of useful insights from cover to cover, and each example and anecdote is heavy with authenticity and the wisdom that comes from experimenting, failing, and measuring real outcomes at scale. It is a must for anybody looking to build their systems the correct way from day one.
Alex Stamos, Director of the Stanford Internet Observatory and former CISO of Facebook and Yahoo
This book is a rare treat for industry veterans and novices alike: instead of teaching information security as a discipline of its own, the authors offer hard-wrought and richly illustrated advice for building software and operations that actually stood the test of time. In doing so, they make a compelling case for reliability, usability, and security going hand-in-hand as the entirely inseparable underpinnings of good system design.
Michał Zalewski, VP of Security Engineering at Snap, Inc. and author of The Tangled Web and Silence on the Wire
This is the real world that researchers talk about in their papers.
JP Aumasson, CEO at Teserakt and author of Serious Cryptography
Google faces some of the toughest security challenges of any company, and they're revealing their guiding security principles in this book. If you're in SRE or security and curious as to how a hyperscaler builds security into their systems from design through operation, this book is worth studying.
Kelly Shortridge, VP of Product Strategy at Capsule8
If you're responsible for operating or securing an internet service: caution! Google and others have made it look too easy. It's not. I had the privilege of working with these book authors for many years and was constantly amazed at what they uncovered and their extreme measures to protect our users' data. If you have such responsibilities yourself, or if you're just trying to understand what it takes to protect services at scale in the modern world, study this book. Nothing is covered in detail (there are other references for that), but I don't know anywhere else that you'll find the breadth of pragmatic tips and frank discussion of tradeoffs.
Eric Grosse, former VP of Security Engineering at Google
Building Secure and Reliable Systems
by Heather Adkins, Betsy Beyer, Paul Blankinship, Piotr Lewandowski, Ana Oprea, and Adam Stubblefield
Copyright 2020 Google LLC. All rights reserved.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (https://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.
Acquisitions Editor: John Devins | Indexer: WordCo, Inc. |
Development Editor: Virginia Wilson | Interior Designer: David Futato |
Production Editor: Kristen Brown | Cover Designer: Karen Montgomery |
Copyeditor: Rachel Head | Illustrators: Jenny Bergman and Rebecca Demarest |
Proofreader: Sharon Wilkey |
- March 2020: First Edition
Revision History for the First Edition
- 2020-03-11: First Release
See https://oreilly.com/catalog/errata.csp?isbn=9781492083122 for release details.
The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. Building Secure and Reliable Systems, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.
The views expressed in this work are those of the authors, and do not represent the publisher's views or the views of the authors' employer (Google). While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher, the authors, and Google disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
This work is part of a collaboration between O'Reilly and Google. See our statement of editorial independence.
978-1-492-08313-9
[LSI]
Dedication
To Susanne, whose strategic project management and passion for reliability and security kept this book on track!
Foreword by Royal Hansen
For years, I've wished that someone would write a book like this. Since their publication, I've often admired and recommended the Google Site Reliability Engineering (SRE) books, so I was thrilled to find that a book focused on security and reliability was already underway when I arrived at Google, and am only too happy to contribute in a small way to the process. Ever since I began working in the tech industry, across organizations of varying sizes, I've seen people struggling with the question of how security should be organized: Should it be centralized or federated? Independent or embedded? Operational or consultative? Technical or governing? The list goes on.
When the SRE model, and SRE-like versions of DevOps, became popular, I noticed that the problem space SRE tackles exhibits similar dynamics to security problems. Some organizations have combined these two disciplines into an approach called DevSecOps.
Both SRE and security have strong dependencies on classic software engineering teams. Yet both differ from classic software engineering teams in fundamental ways:
- Site Reliability Engineers (SREs) and security engineers tend to break and fix, as well as build.
- Their work encompasses operations, in addition to development.
- SREs and security engineers are specialists, rather than classic software engineers.
- They are often viewed as roadblocks, rather than enablers.
- They are frequently siloed, rather than integrated in product teams.
SRE created a role and responsibilities specific to a set of skills, which we can see as analogous to the role of security engineer. SRE also created an implementation model that connects teams, and this seems to be the next step that the security community needs to take. For many years, my colleagues and I have argued that security should be a first-class and embedded quality of software. I believe that embracing an SRE-inspired approach is a logical step in that direction.
Since arriving at Google, I've learned more about how the SRE model was established here, how SRE implements DevOps philosophies, and how SRE and DevOps have evolved. Meanwhile, I've been translating my IT security experience in the financial services industry to the technical and programmatic security capabilities at Google. These two sectors are not unrelated, but each has its own history worth understanding. At the same time, enterprises are at a critical point where cloud computing, various forms of machine learning, and a complicated cybersecurity landscape are together determining where an increasingly digital world is going, how quickly it will get there, and what risks are involved.