Statistics for Making Decisions
Nicholas T. Longford is a senior statistician in the Neonatal Data Analysis Unit at Imperial College London, England. His career includes appointments at the Educational Testing Service, Princeton, NJ, USA, and De Montfort University, Leicester, England, and a spell of self-employment as the Director of SNTL Statistics Research and Consulting (UK and Spain). He is the author of six other monographs in statistics and the sole author or coauthor of over one hundred articles in peer-reviewed statistics journals. He is an editor of StatsRef (Wiley), a reviewer for Mathematical Reviews and an associate editor of Statistical Methods in Medical Research and Journal of Educational and Behavioral Statistics.
First edition published 2021
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
and by CRC Press
2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN
© 2021 Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, LLC
The right of Nicholas Tibor Longford to be identified as author of this work has been asserted by him/her/them in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.
Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access
Library of Congress Control Number: 2020950454
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used only for identification and explanation without intent to infringe.
ISBN: 9780367342678 (hbk)
ISBN: 9780429324765 (ebk)
Typeset in Computer Modern font
by KnowledgeWorks Global Ltd.
We just ask them for the best estimates,
and then we make our decisions.
(Anon.)
As an aspiring academic statistician, I spent all my early efforts on modelling, firmly believing that discovering all the truths was only a matter of identifying suitable models and implementing methods for fitting them. The latter task was particularly challenging in the pre-Splus era of dinosaur-like mainframes and software with limited scope and capacity. My senior colleagues, at the Educational Testing Service in particular, planted in me the seed of the conviction that data cannot be taken for granted, and obtaining high-quality data tailored to what we want to know, in a timely and affordable manner, is the ultimate goal of statistics. In brief, design has a supremacy over, and far greater potential than, modelling and data analysis. This led me to a renewed appreciation of survey design. And that is just an analytical version of a bus ride from methods for dealing with missing data, multiple imputation as a near-universal remedy and the EM algorithm as a powerful framework for thinking about problems for which we do not have an off-the-shelf solution. Soon after establishing myself as a statistical consultant I discovered the clients as a factor in how I should think, argue and operate, and how to respond to the profound mistrust and dissatisfaction with statistics interpreted narrowly and practiced as application of the correct hypothesis test. That is where I am now, as narrow-minded as ever, thinking that every problem in statistics is about making a decision, following decades of conviction that every problem was about finding the right (multilevel) model, that every problem was a missing-data problem, that everything is possible with a powerful computer (and R), and so on.
Just as countries have constitutions, as do some professions, the medical and legal ones in particular, I put forward one for statistics: that it is
the science and profession of making purposeful decisions in the presence of uncertainty and with limited resources.
In this proposal, I want to emphasise that decision is the central intellectual activity in our everyday lives, both private and public, in business, public service and research. Assisting in this activity is at the core of the statistics profession when some uncertainty prevails and the information in our possession is incomplete. The phrase "limited resources" in the statement refers to the importance of design, reflecting the expense and difficulty of collecting (primary or supplementary) information. Here, the resources should be interpreted broadly; they comprise not only funding, but also time, expertise, physical and manpower limitations and, not least, the goodwill of the recruiting agent and the respondent.
Choice, in the context of a decision, is studied in several subject areas, foremost in economics. I want to distance myself from one general concern in these studies, namely the study of how human subjects choose from a set of available courses of action, and whether these choices are rational. My concern is solely with proposing or prescribing to subjects, called clients, how to choose, based on rules derived from and for the specific client's perspective, value judgements, priorities, or remit. That is, in its ideal form, the analysis starts by encoding the perspective, in the form of the harm, loss or damage done by selecting one course of action when another should have been selected. It then follows by evaluating the expected magnitude of this loss and concludes by electing the course of action that is least harmful, or most advantageous, in expectation.
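The three steps just described (encode the loss, evaluate its expectation, choose the option with the smallest expected loss) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration only: the two options "A" and "B", the two possible states, the loss values and the probabilities are hypothetical, and the function names are not taken from any software associated with this book.

```python
from typing import Dict, Sequence, Tuple


def expected_loss(losses: Sequence[float], probs: Sequence[float]) -> float:
    """Expected loss of one option: sum over states of loss times probability."""
    return sum(loss * p for loss, p in zip(losses, probs))


def least_harmful_option(
    loss_table: Dict[str, Sequence[float]], probs: Sequence[float]
) -> Tuple[str, Dict[str, float]]:
    """Return the option with the smallest expected loss, with all expectations."""
    expected = {opt: expected_loss(row, probs) for opt, row in loss_table.items()}
    best = min(expected, key=expected.get)
    return best, expected


# Hypothetical client perspective: choosing A when B was appropriate (state 2)
# is four times as harmful as choosing B when A was appropriate (state 1).
loss_table = {
    "A": [0.0, 4.0],  # losses of option A in state 1 and state 2
    "B": [1.0, 0.0],  # losses of option B in state 1 and state 2
}
probs = [0.7, 0.3]  # assumed probabilities of state 1 and state 2

best, expected = least_harmful_option(loss_table, probs)
print(best, expected)
```

With these assumed values, option B has the smaller expected loss even though state 1, in which A would be the appropriate choice, is the more probable one. The asymmetry of the losses, which encodes the client's perspective, drives the choice.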
Decision theory is the subject of several monographs and textbook chapters. DeGroot (1970), Berger (1985) and Lindley (1985) appeal to its general application, the latter with simple examples and with minimal mathematical equipment. More recent treatments of the subject are by Liese and Miescke (2008), Parmigiani and Inouye (2009), Longford (2013) and Peterson (2017). This book focuses on distinctly ordinary problems that have the following formulation. The client, who sponsors the analysis, contemplates a shortlist of courses of action (options), and the purpose of the analysis is to propose one of them that best fits the client's criterion of greatest benefit or least loss.
When there are only two options, hypothesis testing would seem to be appropriate. This we dismiss outright because the test has no means of incorporating the perspectives, value judgements and remits of the client. (The client pays; we work for the client.) The methods dealt with in this book drop a lot, but not all, of the theoretical background and reduce the attention to settings encountered in statistical practice most frequently. Some assumptions may appear too restrictive for the theoretician, but are constructive for the practically oriented analyst. We highlight the role of the client in the analysis and emphasise the subjective nature of the analysis.
Historically, decision theory and its applications have been firmly within the domain of the Bayesian paradigm. The development in this book is equally well suited for the frequentist and the Bayesian. It is intended not for