A Gentle Introduction to Stata
5th Edition
ALAN C. ACOCK Oregon State University
A Stata Press Publication StataCorp LP College Station, Texas
Copyright 2006, 2008, 2010, 2012, 2014, 2016 by StataCorp LP
All rights reserved.
First edition 2006
Second edition 2008
Third edition 2010
Revised third edition 2012
Fourth edition 2014
Fifth edition 2016
Published by Stata Press, 4905 Lakeway Drive, College Station, Texas 77845
Typeset in LaTeX
Printed in the United States of America
Print ISBN-10: 1-59718-185-4
Print ISBN-13: 978-1-59718-185-3
ePub ISBN-10: 1-59718-186-2
ePub ISBN-13: 978-1-59718-186-0
Mobi ISBN-10: 1-59718-187-0
Mobi ISBN-13: 978-1-59718-187-7
Library of Congress Control Number: 2016935690
No part of this book may be reproduced, stored in a retrieval system, or transcribed, in any form or by any means, electronic, mechanical, photocopy, recording, or otherwise, without the prior written permission of StataCorp LP.
Stata, Stata Press, Mata, and NetCourse are registered trademarks of StataCorp LP.
Stata and Stata Press are registered trademarks with the World Intellectual Property Organization of the United Nations.
LaTeX is a trademark of the American Mathematical Society.
Preface
This book was written with a particular reader in mind. This reader is learning social statistics and needs to learn Stata but has no prior experience with other statistical software packages. When I learned Stata, I found there were no books written explicitly for this type of reader. There are certainly excellent books on Stata, but they assume extensive prior experience with other packages, such as SAS or IBM SPSS Statistics; they also assume a fairly advanced working knowledge of statistics. These books moved quickly to advanced topics and left my intended reader in the dust. Readers who have more background in statistical software and statistics will be able to read chapters quickly and even skip sections. The goal is to move the true beginner to a level of competence using Stata.
With this target reader in mind, I make far more use of the menus and dialog boxes in Stata's interface than do any other books about Stata. Advanced users may not see the value in using the interface, and the more people learn about Stata, the less they will rely on the interface. Also, even when you are using the interface, it is still important to save a record of the sequence of commands you run. Although I rely on the commands much more than the dialog boxes in the interface in my own work, I still find value in the interface. The dialog boxes in the interface include many options that I might not have known or might have forgotten.
To illustrate the interface as well as graphics, I have included more than 100 figures, many of which show dialog boxes. I present many tables and extensive Stata results as they appear on the screen. I interpret these results substantively in the belief that beginning Stata users need to learn more than just how to produce the results; users also need to be able to interpret them.
I have tried to use real data. There are a few examples where it is much easier to illustrate a point with hypothetical data, but for the most part, I use data that are in the public domain. For example, I use the General Social Surveys for 2002 and 2006 in many chapters, as well as the National Survey of Youth, 1997. I have simplified the files by dropping many of the variables in the original datasets, but I have kept all the observations. I have tried to use examples from several social-science fields, and I have included a few extra variables in several datasets so that instructors, as well as readers, can make additional examples and exercises that are tailored to their disciplines. People who are used to working with statistics books that have contrived data with just a few observations, presumably so work can be done by hand, may be surprised to see more than 1,000 observations in this book's datasets. Working with these files provides better experience for other real-world data analysis. If you have your own data and the dataset has a variety of variables, you may want to use your data instead of the data provided with this book.