STATISTICS
David J. Hand
New York / London
www.sterlingpublishing.com
STERLING and the distinctive Sterling logo are registered trademarks of Sterling Publishing Co., Inc.
Library of Congress Cataloging-in-Publication Data Available
Published by Sterling Publishing Co., Inc.
387 Park Avenue South, New York, NY 10016
Published by arrangement with Oxford University Press, Inc.
© 2008 by David J. Hand
Illustrated edition published in 2010 by Sterling Publishing Co., Inc.
Additional text © 2010 Sterling Publishing Co., Inc.
Distributed in Canada by Sterling Publishing
c/o Canadian Manda Group, 165 Dufferin Street
Toronto, Ontario, Canada M6K 3H6
Book design: The Design Works Group
Please see picture credits on page 183 for image copyright information.
Printed in China
All rights reserved
Sterling ISBN 978-1-4027-7053-1
For information about custom editions, special sales, premium and corporate purchases, please contact Sterling Special Sales Department at 800-805-5489.
Frontispiece: A US Census Bureau employee operates a tabulation machine c. 1908. First used for the 1890 census, electric tabulation machines read punched cards to compile statistics.
STATISTICAL IDEAS AND METHODS underlie just about every aspect of modern life. Sometimes the role of statistics is obvious, but often the statistical ideas and tools are hidden in the background. In either case, because of the ubiquity of statistical ideas, it is clearly extremely useful to have some understanding of them. The aim of this book is to provide such understanding.
Statistics suffers from an unfortunate but fundamental misconception which misleads people about its essential nature. This mistaken belief is that it requires extensive tedious arithmetic manipulation, and that, as a consequence, it is a dry and dusty discipline, devoid of imagination, creativity, or excitement. But this is a completely false image of the modern discipline of statistics. It is an image based on a perception dating from more than half a century ago. In particular, it entirely ignores the fact that the computer has transformed the discipline, changing it from one hinging around arithmetic to one based on the use of advanced software tools to probe data in a search for understanding and enlightenment. That is what the modern discipline is all about: the use of tools to aid perception and provide ways to shed light, routes to understanding, instruments for monitoring and guiding, and systems to assist decision-making. All of these, and more, are aspects of the modern discipline.
The aim of this book is to give the reader some understanding of this modern discipline. Now, clearly, in a book as short as this one, I cannot go into detail. Instead of detail, I have taken a high-level view, a bird's-eye view, of the entire discipline, trying to convey the nature of statistical philosophy, ideas, tools, and methods. I hope the book will give the reader some understanding of how the modern discipline works, how important it is, and, indeed, why it is so important.
The first chapter presents some basic definitions, along with illustrations to convey some of the power, importance, and, indeed, excitement of statistics. The second chapter introduces some of the most elementary of statistical ideas, ideas which the reader may well have already encountered, concerned with basic summaries of data. A later chapter looks at just some of the ways the computer has impacted the discipline.
I would like to thank Emily Kenway, Shelley Channon, Martin Crowder, and an anonymous reader for commenting on drafts of this book. Their comments have materially improved it, and helped to iron out obscurities in the explanations. Of course, any such which remain are entirely my own fault.
David J. Hand
IMPERIAL COLLEGE, LONDON
A health worker plots disease and mortality statistics (community-based morbidity and mortality information) in a Sierra Leone village in 2007. Aid agencies rely on such information to treat third-world populations threatened by lethal diseases.
ONE
Surrounded by Statistics
To those who say "there are lies, damned lies, and statistics," I often quote Frederick Mosteller, who said that "it is easy to lie with statistics, but easier to lie without them."
Modern Statistics
I WANT TO BEGIN WITH AN ASSERTION that many readers might find surprising: statistics is the most exciting of disciplines. My aim in this book is to show you that this assertion is true and to show you why it is true. I hope to dispel some of the old misconceptions of the nature of statistics, to show what the modern discipline looks like, and to illustrate some of its awesome power and its ubiquity.
In particular, in this introductory chapter I want to convey two things. The first is a flavor of the revolution that has taken place in the past few decades. I want to explain how statistics has been transformed from a dry Victorian discipline concerned with the manual manipulation of columns of numbers, to a highly sophisticated modern technology involving the use of the most advanced of software tools. I want to illustrate how today's statisticians use these tools to probe data in the search for structures and patterns, and how they use this technology to peel back the layers of mystification and obscurity, revealing the truths beneath. Modern statistics, like telescopes, microscopes, X-rays, radar, and medical scans, enables us to see things invisible to the naked eye. Modern statistics enables us to see through the mists and confusion of the world about us, to grasp the underlying reality.
So that is the first thing I want to convey in this chapter: the sheer power and excitement of the modern discipline, where it has come from, and what it can do. The second thing I hope to convey is the ubiquity of statistics. No aspect of modern life is untouched by it. Modern medicine is built on statistics: for example, the randomized controlled trial has been described as "one of the simplest, most powerful, and revolutionary tools of research." Understanding the processes by which plagues spread prevents them from decimating humanity. Effective government hinges on careful statistical analysis of data describing the economy and society: perhaps that is an argument for insisting that all those in government should take mandatory statistics courses. Farmers, food technologists, and supermarkets all implicitly use statistics to decide what to grow, how to process it, and how to package and distribute it. Hydrologists decide how high to build flood defenses by analyzing meteorological statistics. Engineers building computer systems use the statistics of reliability to ensure that they do not crash too often. Air traffic control systems are built on complex statistical models, working in real time. Although you may not recognize it, statistical ideas and tools are hidden in just about every aspect of modern life.