Bayesian Methods for Hackers
Probabilistic Programming and Bayesian Inference
Cameron Davidson-Pilon
New York Boston Indianapolis San Francisco
Toronto Montreal London Munich Paris Madrid
Capetown Sydney Tokyo Singapore Mexico City
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.
The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.
For information about buying this title in bulk quantities, or for special sales opportunities (which may include electronic versions; custom cover designs; and content particular to your business, training goals, marketing focus, or branding interests), please contact our corporate sales department at or (800) 382-3419.
For government sales inquiries, please contact .
For questions about sales outside the United States, please contact .
Visit us on the Web: informit.com/aw
Library of Congress Cataloging-in-Publication Data
Davidson-Pilon, Cameron.
Bayesian methods for hackers : probabilistic programming and bayesian inference / Cameron Davidson-Pilon.
pages cm
Includes bibliographical references and index.
ISBN 978-0-13-390283-9 (pbk.: alk. paper)
1. Penetration testing (Computer security)--Mathematics. 2. Bayesian statistical decision theory.
3. Soft computing. I. Title.
QA76.9.A25D376 2015
006.3--dc23
2015017249
Copyright © 2016 Cameron Davidson-Pilon
All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. To obtain permission to use material from this work, please submit a written request to Pearson Education, Inc., Permissions Department, 200 Old Tappan Road, Old Tappan, New Jersey 07675, or you may fax your request to (201) 236-3290.
The code throughout this book is released under the MIT License.
ISBN-13: 978-0-13-390283-9
ISBN-10: 0-13-390283-8
Text printed in the United States on recycled paper at RR Donnelley in Crawfordsville, Indiana.
First printing, October 2015
This book is dedicated to many important relationships: my parents,
my brothers, and my closest friends. Second to them, it is devoted
to the open-source community, whose work we consume every
day without knowing.
Contents
Foreword
Bayesian methods are one of many in a modern data scientist's toolkit. They can be used to solve problems in prediction, classification, spam detection, ranking, inference, and many other tasks. However, most of the material out there on Bayesian statistics and inference focuses on the mathematical details while giving little attention to the more pragmatic engineering considerations. That's why I'm very pleased to have this book joining the series, bringing a much-needed introduction to Bayesian methods targeted at practitioners.
Cameron's knowledge of the topic and his focus on tying things back to tangible examples make this book a great introduction for data scientists or regular programmers looking to learn about Bayesian methods. This book is filled with examples, figures, and working Python code that make it easy to get started solving actual problems. If you're new to data science, to Bayesian methods, or to data science with Python, this book will be an invaluable resource to get you started.
Paul Dix
Series Editor
Preface
The Bayesian method is the natural approach to inference, yet it is hidden from readers behind chapters of slow, mathematical analysis. The typical text on Bayesian inference spends two to three chapters on probability theory and only then turns to what Bayesian inference is. Unfortunately, due to the mathematical intractability of most Bayesian models, the reader is only shown simple, artificial examples. This can leave the reader with a "So what?" feeling about Bayesian inference. In fact, this was my own prior opinion.
After some recent success of Bayesian methods in machine-learning competitions, I decided to investigate the subject again. Even with my mathematical background, it took me three straight days of reading examples and trying to put the pieces together to understand the methods. There was simply not enough literature bridging theory to practice. The problem with my misunderstanding was the disconnect between Bayesian mathematics and probabilistic programming. That being said, I suffered then so the reader would not have to now. This book attempts to bridge the gap.
If Bayesian inference is the destination, then mathematical analysis is a particular path toward it. On the other hand, computing power is cheap enough that we can afford to take an alternate route via probabilistic programming. The latter path is much more useful, as it denies the necessity of mathematical intervention at each step; that is, we remove often intractable mathematical analysis as a prerequisite to Bayesian inference. Simply put, this latter computational path proceeds via small, intermediate jumps from beginning to end, whereas the first path proceeds by enormous leaps, often landing far away from our target. Furthermore, without a strong mathematical background, the analysis required by the first path cannot even take place.
Bayesian Methods for Hackers is designed as an introduction to Bayesian inference from a computational/understanding-first, and mathematics-second, point of view. Of course, as an introductory book, we can only leave it at that: an introductory book. For the mathematically trained, the curiosity this text generates may be cured by other texts designed with mathematical analysis in mind. For the enthusiast with a less mathematical background, or one who is not interested in the mathematics but simply the practice of Bayesian methods, this text should be sufficient and entertaining.