Contents
Page List
Statistical Rethinking
CHAPMAN & HALL/CRC
Texts in Statistical Science Series
Joseph K. Blitzstein, Harvard University, USA
Julian J. Faraway, University of Bath, UK
Martin Tanner, Northwestern University, USA
Jim Zidek, University of British Columbia, Canada
Recently Published Titles
Theory of Spatial Statistics
A Concise Introduction
M.N.M van Lieshout
Bayesian Statistical Methods
Brian J. Reich and Sujit K. Ghosh
Sampling
Design and Analysis, Second Edition
Sharon L. Lohr
The Analysis of Time Series
An Introduction with R, Seventh Edition
Chris Chatfield and Haipeng Xing
Time Series
A Data Analysis Approach Using R
Robert H. Shumway and David S. Stoffer
Practical Multivariate Analysis, Sixth Edition
Abdelmonem Afifi, Susanne May, Robin A. Donatello, and Virginia A. Clark
Time Series: A First Course with Bootstrap Starter
Tucker S. McElroy and Dimitris N. Politis
Probability and Bayesian Modeling
Jim Albert and Jingchen Hu
Surrogates
Gaussian Process Modeling, Design, and Optimization for the Applied Sciences
Robert B. Gramacy
Statistical Analysis of Financial Data
With Examples in R
James Gentle
Statistical Rethinking
A Bayesian Course with Examples in R and Stan, Second Edition
Richard McElreath
For more information about this series, please visit: https://www.crcpress.com/ChapmanHallCRC-Texts-in-Statistical-Science/book-series/CHTEXSTASCI
Statistical Rethinking
A Bayesian Course with Examples in R and Stan
SECOND EDITION
Richard McElreath
Second edition published 2020
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
and by CRC Press
2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN
2020 Taylor & Francis Group, LLC
First edition published by CRC Press 2015
CRC Press is an imprint of Taylor & Francis Group, LLC
Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact mpkbookspermissions@tandf.co.uk
Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.
Library of Congress Cataloging-in-Publication Data
Library of Congress Control Number:2019957006
ISBN: 978-0-367-13991-9 (hbk)
ISBN: 978-0-429-02960-8 (ebk)
Contents
It came as a complete surprise to me that I wrote a statistics book. It is even more surprising how popular the book has become. But I had set out to write the statistics book that I wish I could have had in graduate school. No one should have to learn this stuff the way I did. I am glad there is an audience to benefit from the book.
It consumed five years to write it. There was an initial set of course notes, melted down and hammered into a first 200-page manuscript. I discarded that first manuscript. But it taught me the outline of the book I really wanted to write. Then, several years of teaching with the manuscript further refined it.
Really, I could have continued refining it every year. Going to press carries the penalty of freezing a dynamic process of both learning how to teach the material and keeping up with changes in the material. As time goes on, I see more elements of the book that I wish I had done differently. Ive also received a lot of feedback on the book, and that feedback has given me ideas for improving it.
So in the second edition, I put those ideas into action. The major changes are:
The R package has some new tools. The map
tool from the first edition is still here, but now it is named quap
. This renaming is to avoid misunderstanding. We just used it to get a quadratic approximation to the posterior. So now it is named as such. A bigger change is that map2stan
has been replaced by ulam
. The new ulam
is very similar to map2stan
, and in many cases can be used identically. But it is also much more flexible, mainly because it does not make any assumptions about GLM structure and allows explicit variable types. All the map2stan
code is still in the package and will continue to work. But now ulam
allows for much more, especially in later chapters. Both of these tools allow sampling from the prior distribution, using extract.prior
, as well as the posterior. This helps with the next change.
Much more prior predictive simulation. A prior predictive simulation means simulating predictions from a model, using only the prior distribution instead of the posterior distribution. This is very useful for understanding the implications of a prior. There was only a vestigial amount of this in the first edition. Now many modeling examples have some prior predictive simulation. I think this is one of the most useful additions to the second edition, since it helps so much with understanding not only priors but also the model itself.
More emphasis on the distinction between prediction and inference. now, is also more direct in cautioning about the predictive nature of information criteria and cross-validation. Cross-validation and importance sampling approximations of it are now discussed explicitly.
New model types. , that focuses on models that are not easily conceived of as GLMMs, including ordinary differential equation models.
Some new data examples. There are some new data examples, including the Japanese cherry blossoms time series on the cover and a larger primate evolution data set with 300 species and a matching phylogeny.
More presentation of raw Stan models. There are many more places now where raw Stan model code is explained. I hope this makes a transition to working directly in Stan easier. But most of the time, working directly in Stan is still optional.
Kindness and persistence. As in the first edition, I have tried to make the material as kind as possible. None of this stuff is easy, and the journey into understanding is long and haunted. It is important that readers expect that confusion is normal. This is also the reason that I have not changed the basic modeling strategy in the book.