Practical Binary Analysis
Build Your Own Linux Tools for Binary Instrumentation, Analysis, and Disassembly
by Dennis Andriesse
San Francisco
PRACTICAL BINARY ANALYSIS. Copyright © 2019 by Dennis Andriesse.
All rights reserved. No part of this work may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval system, without the prior written permission of the copyright owner and the publisher.
ISBN-10: 1-59327-912-4
ISBN-13: 978-1-59327-912-7
Publisher: William Pollock
Production Editor: Riley Hoffman
Cover Illustration: Rick Reese
Interior Design: Octopod Studios
Developmental Editor: Annie Choi
Technical Reviewers: Thorsten Holz and Tim Vidas
Copyeditor: Kim Wimpsett
Compositor: Riley Hoffman
Proofreader: Paula L. Fleming
For information on distribution, translations, or bulk sales, please contact No Starch Press, Inc. directly:
No Starch Press, Inc.
245 8th Street, San Francisco, CA 94103
phone: 1.415.863.9900
www.nostarch.com
Library of Congress Cataloging-in-Publication Data
Names: Andriesse, Dennis, author.
Title: Practical binary analysis : build your own Linux tools for binary
instrumentation, analysis, and disassembly / Dennis Andriesse.
Description: San Francisco : No Starch Press, Inc., [2019] | Includes index.
Identifiers: LCCN 2018040696 (print) | LCCN 2018041700 (ebook) | ISBN 9781593279134 (epub) | ISBN 1593279132 (epub) | ISBN 9781593279127 (print) | ISBN 1593279124 (print)
Subjects: LCSH: Disassemblers (Computer programs) | Binary system
(Mathematics) | Assembly languages (Electronic computers) | Linux.
Classification: LCC QA76.76.D57 (ebook) | LCC QA76.76.D57 A53 2019 (print) |
DDC 005.4/5--dc23
LC record available at https://lccn.loc.gov/2018040696
No Starch Press and the No Starch Press logo are registered trademarks of No Starch Press, Inc. Other product and company names mentioned herein may be the trademarks of their respective owners. Rather than use a trademark symbol with every occurrence of a trademarked name, we are using the names only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark.
The information in this book is distributed on an "As Is" basis, without warranty. While every precaution has been taken in the preparation of this work, neither the author nor No Starch Press, Inc. shall have any liability to any person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly by the information contained in it.
For Noortje and Sietse
About the Author
Dennis Andriesse has a PhD in system and network security and uses binary analysis daily in his research. He is one of the main contributors to PathArmor, a control-flow integrity system that defends against control-flow hijacking attacks such as ROP. Andriesse was also one of the attack developers involved in the takedown of the GameOver Zeus P2P botnet.
About the Technical Reviewers
Thorsten Holz is a professor in the Faculty of Electrical Engineering and Information Technology at Ruhr-University Bochum, Germany. His research interests include technical aspects of secure systems with a focus on systems security. Currently, his work concentrates on reverse engineering, automated vulnerability detection, and studying the latest attack vectors.
Tim Vidas is a student of hacking. Over the years, Tim has led the DARPA CGC infrastructure team, championed innovation at Dell Secureworks, and overseen CERT's research group for digital forensics. He has a PhD from Carnegie Mellon, many conference badges (some are black), and an Erdős-Bacon number of 4-3. Mostly, Tim just enjoys being a father and husband.
CONTENTS IN DETAIL
ANATOMY OF A BINARY
THE ELF FORMAT
THE PE FORMAT: A BRIEF INTRODUCTION
BUILDING A BINARY LOADER USING LIBBFD
BASIC BINARY ANALYSIS IN LINUX
DISASSEMBLY AND BINARY ANALYSIS FUNDAMENTALS
SIMPLE CODE INJECTION TECHNIQUES FOR ELF
CUSTOMIZING DISASSEMBLY
BINARY INSTRUMENTATION
PRINCIPLES OF DYNAMIC TAINT ANALYSIS
PRACTICAL DYNAMIC TAINT ANALYSIS WITH LIBDFT
PRINCIPLES OF SYMBOLIC EXECUTION
PRACTICAL SYMBOLIC EXECUTION WITH TRITON
APPENDIX A: A CRASH COURSE ON X86 ASSEMBLY
APPENDIX B: IMPLEMENTING PT_NOTE OVERWRITING USING LIBELF
APPENDIX C: LIST OF BINARY ANALYSIS TOOLS
APPENDIX D: FURTHER READING
FOREWORD
These days, you can find many books on assembly and even more descriptions of the ELF and PE binary formats. Articles about information flow tracking and symbolic execution abound. Yet there's not a single book to take the reader from, say, understanding basic assembly to performing advanced binary analysis. Not a single book exists that shows the reader how to instrument binary programs, apply dynamic taint analysis to track interesting data through a program execution, or use symbolic execution for automated exploit generation. In other words, there's no book out there that teaches you the techniques, the tools, and the mind-set you need for binary analysis. Until now.
What makes binary analysis challenging is that it requires an understanding of many different things. Yes, you need to know about assembly, but you also need to know about binary formats, linking and loading, static and dynamic analysis, memory layouts, and compiler conventions, and these are just the basics. Your specific analysis or instrumentation tasks may require even more specialized knowledge. Of course, all of these aspects require their own tools. To many, this area looks so intimidating that they give up before they even get started. There is so much to learn. Where to start?
The answer is: here. This book brings together everything you need to know to get started in a well-structured and accessible manner. It's fun, too! Even if you don't know anything about what binary programs look like, how they're loaded, or what happens when they execute, the book carefully introduces all these concepts with the corresponding tools so that you quickly learn not just how they work in theory but also how to play with them in realistic scenarios. In my opinion, this is the only way to gain a deep and lasting understanding.
Even if you already have significant experience in analyzing binary code and are perhaps a wizard in Capstone, Radare, IDA Pro, or OllyDbg (or whatever your favorite tools may be), there is plenty here to like. The advanced techniques in the later chapters will show you how to build some of the most sophisticated analysis and instrumentation tools you can imagine.
Binary analysis and binary instrumentation are fascinating but challenging topics, typically mastered only by a small group of expert hackers. With growing concerns about security, they're also becoming increasingly important. We need to be able to analyze malware to see what it may do and how we may stop it. But as more and more malware obfuscates itself and applies anti-analysis techniques to thwart our analysis, we need more sophisticated methods.