Editor
Zhengming Chen
Nuffield Department of Population Health, University of Oxford, Oxford, Oxfordshire, UK
ISBN 978-981-15-7665-2 e-ISBN 978-981-15-7666-9
https://doi.org/10.1007/978-981-15-7666-9
© Springer Nature Singapore Pte Ltd. 2020
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Foreword
A large study is not just a small study made larger
There can be few books into which the scientific editor has put more decades of epidemiological preparation than Zhengming Chen has into this book or at least into the study that it describes. Now Professor of Epidemiology at Oxford, he began doing epidemiological studies in China more than 30 years ago. He has continued ever since, working on studies in China from a UK base and building in the process the largest collaboration in the world between Chinese and Western randomised and observational studies of population health. Each new study has been considerably larger than its predecessors.
Eventually, he and the team he had built up at Oxford were conducting, jointly with co-principal investigators and colleagues in China, rigorously randomised trials with several tens of thousands of participants of widely practicable treatments for common diseases. They were also conducting larger and larger observational epidemiological studies, culminating 15 years ago, as described in this book, in what was the world's largest blood-based biobank study, with samples stored from half a million apparently healthy adults all over China, with electronic linkage to all deaths and to virtually all hospital treatment via the newly introduced nationwide health insurance scheme.
Fortuitously, this was just the moment when information technology, sample storage and retrieval, health record linkage, assay technology (genetic and non-genetic), and statistical methods had improved so much that, with detailed attention both to the organisation and to the science, a really large biobank study could succeed.
Equally fortuitously, a substantial one-off grant from the Kadoorie Charitable Foundation in Hong Kong, structural support from the Disease Surveillance Points system, and long-term support from Oxford University Departmental infrastructure gave Professor Chen and his colleagues the initial freedom to concentrate on optimising the planning, conduct, and maintenance of this, the first major biobank study of the new century.
Because it had been planned and executed so carefully and successfully, the China Kadoorie Biobank Study (CKB, which recruited 500,000 Chinese adults in 2004–2008) provided an influential model when, in the mid-2000s, a complete redesign was undertaken of the UK Biobank Study (UKB, which then successfully recruited 500,000 UK adults in 2008–2010).
CKB is now maintained by long-term support from major funding agencies in Beijing and London (with continued support from Oxford's Nuffield Department of Population Health, which hosts both CKB and UKB), but the initial freedom offered by the early financial supporters of both studies was crucial to the careful planning and piloting that underlay their eventual efficiency and success.
Although these two biobank studies grew out of the twentieth-century tradition of prospective studies, in their methods and size they went far beyond it. In turn, they have provided methodological examples of successful use of twenty-first-century techniques that have inspired and influenced biobank studies elsewhere.
Currently, in a welcome development, all the major biobank studies in the world are communicating with each other, sharing methods, ideas, data, and results. This book can become part of the process of sharing methods, both with other studies and with the future. This matters, for as Rory Collins, chief executive officer and onlie begetter of UK Biobank, has observed, "a large study is not just a small study made larger".
In recent years vast numbers of scientific (and unscientific) articles have been written about the promise and problems of big biobank studies. Still, however, too little has been written by the few who have actually made such studies work reliably and productively. The interconnected problems, which need to be planned against, are partly organisational, partly technical (assay methods are improving so rapidly that it is often better to procrastinate, waiting for big decreases in price and increases in sensitivity), and partly statistical.
Statistical traps are laid by regression dilution (which can be avoided by appropriate use of periodic resurveys of a subsample of the study population), by unduly fine subgroup analyses, by random variation in measurements (as adjustment for imperfectly measured confounding factors can leave highly significant residual confounding), and by misleading relationships between different imperfectly measured factors that cannot be adequately resolved by multiple regression or by the recently fashionable directed acyclic graphs (DAGs). Another major problem can be reverse causality, but this can often be adequately dealt with by exclusion of those who already had disease at study entry (and, for some associations, exclusion of the first few years of follow-up).
Finally, as the randomised and the observational methods used so successfully in studies of physical disease get used increasingly widely in studies of mental disease, education, criminology, social policy, international development and many other issues, understanding the real problems that have been encountered and overcome in large prospective studies of physical disease may be of increasingly wide interest.
Although a few highly entertaining moments have had to remain censored (and the Editor has East Asian flushing syndrome, so they cannot be elicited by alcohol), what remains is an account of a remarkable and influential study, relevant to the conduct and interpretation of all major prospective studies over the next decade or two.