An Introduction to Stata for Health Researchers
Fifth Edition
SVEND JUUL
Department of Public Health
Section for Epidemiology
Aarhus University
Aarhus, Denmark
MORTEN FRYDENBERG
Department of Public Health
Section for Biostatistics
Aarhus University
Aarhus, Denmark
A Stata Press Publication
StataCorp LP
College Station, Texas
Copyright 2006, 2008, 2010, 2014 by StataCorp LP
All rights reserved.
First edition 2006
Second edition 2008
Third edition 2010
Fourth edition 2014
Published by Stata Press, 4905 Lakeway Drive, College Station, Texas 77845
Typeset in LaTeX2e
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
Print ISBN-10: 1-59718-315-6
Print ISBN-13: 978-1-59718-315-4
ePub ISBN-10: 1-59718-316-4
ePub ISBN-13: 978-1-59718-316-1
Mobi ISBN-10: 1-59718-317-2
Mobi ISBN-13: 978-1-59718-317-8
Library of Congress Control Number: 2021933404
No part of this book may be reproduced, stored in a retrieval system, or transcribed, in any form or by any means, electronic, mechanical, photocopy, recording, or otherwise, without the prior written permission of StataCorp LLC.
Stata, Stata Press, Mata, and NetCourse are registered trademarks of StataCorp LLC.
Stata and Stata Press are registered trademarks with the World Intellectual Property Organization of the United Nations.
NetCourseNow is a trademark of StataCorp LLC.
LaTeX2e is a trademark of the American Mathematical Society.
Contents
Tables
Figures
Preface to the fifth edition
This fifth edition updates the fourth edition to reflect the changes in Stata 14, released in 2015; Stata 15, released in 2017; Stata 16, released in 2019; and Stata 17, released in 2021.
Since the fourth edition of the book, many nice things have happened with Stata, and many of these changes are also reflected in the book. In several ways, Stata has become more user-friendly, for example, with an improved Do-file Editor.
Stata now has commands for working with both the 9th and the 10th releases of the International Classification of Diseases, and we have included them in the book. We also included a chapter on the much-improved commands for power, precision, and sample-size analysis. With release 14, Stata introduced Unicode, giving the opportunity to use a wealth of characters beyond the Latin alphabet, and the consequences of that affect several sections in the book. With release 17, Stata introduced tools to tailor publication-ready tables, and we wrote an introduction to these tools. Finally, we made a complete revision of the important (we think) chapter 9, Taking good care of your data.
This is an introductory book aimed at people working in health research, and we had to make several decisions about what to include and what to omit. If you miss something that is not described in the book, it does not necessarily mean that Stata cannot do it. Use the help command and the PDF manuals to learn more.
During the process, Bill Rising and Kristin MacDonald at StataCorp gave several useful suggestions to improve the quality of the book, and Lisa Gilmore coordinated everything.
User reactions are welcome and can be good inspiration for further improvements, so please feel free to send comments to sj@ph.au.dk or mfstat@mollerfryd.dk.
Preface to the first edition
The main intent of this book is empowerment: I want to help you benefit from using Stata in your own research. Your research is probably demanding enough as it is, but to many researchers, the technicalities of data management and analysis can cause major problems, sometimes overwhelming problems. Stata has the tools you need. The purpose of this book is to help you use them.
Stata is a versatile program aimed at data management, statistical analysis, and graphics for research. It is dynamic, too, with new and improved tools being added by Stata monthly and with contributions from an enthusiastic user community daily. This rapid development pace may make the inexperienced user feel a bit lost in what may initially look like a huge jungle. I want to help you become familiar with the basics and benefit from some of the more advanced analytic tools. I will not be able to demonstrate everything Stata can do, but I hope to help you get started, and more.
This book is an introduction, written for the newcomer who has little or no experience with Stata. But it will also be a valuable companion for more advanced users. Although I wrote the book to meet the newcomer's needs, I chose to build it systematically, for example, by putting everything about calculations in one chapter, from the basics to the more complex stuff. This systematic structure makes it easy to locate the information you need. Some of the exercises are aimed at beginners.
The systematic approach also means that you should not try to understand or learn everything in the sequence in which it is presented, for example, in chapter 4 on command syntax. But now that you know it is there, when you have a general question on Stata's grammar, you can look in that chapter to find the answer.
The book's primary audience is people working with health research. When selecting which data management and analysis tools to demonstrate, I chose the tools that, in my experience, are most often used in health research. But there is much more to it than what is shown in this book. Stata has hundreds of commands. I discuss a few of them and point to some other commands that might be useful to you. In addition to the official Stata commands, there are a thousand user-generated contributions. I point to a few of them, too, and demonstrate how to find and use them.
Writing this book has been a joy (mostly). One of the best parts of the experience has been the enthusiastic discussions I have had with people at StataCorp. In particular, Alan Riley, Vince Wiggins, and Bennet Fauber have given a lot of useful input, and Terri Schroeder and Lisa Gilmore have skillfully prepared the manuscript for printing. The most important input, however, was from the students I taught and supervised.
If you believe that you have discovered an error, or if you have a suggestion for improving the book, please send an email to sj@soci.au.dk.