A Course in Item Response Theory
and Modeling with Stata
Tenko Raykov
Michigan State University
George A. Marcoulides
University of California, Santa Barbara
A Stata Press Publication
StataCorp LLC
College Station, Texas
Copyright 2018 StataCorp LLC
All rights reserved. First edition 2018
Published by Stata Press, 4905 Lakeway Drive, College Station, Texas 77845
Typeset in LaTeX
Printed in the United States of America
Print ISBN-10: 1-59718-266-4
Print ISBN-13: 978-1-59718-266-9
ePub ISBN-10: 1-59718-267-2
ePub ISBN-13: 978-1-59718-267-6
Mobi ISBN-10: 1-59718-268-0
Mobi ISBN-13: 978-1-59718-268-3
Library of Congress Control Number: 2017957532
No part of this book may be reproduced, stored in a retrieval system, or transcribed, in any form or by any means, electronic, mechanical, photocopy, recording, or otherwise, without the prior written permission of StataCorp LLC.
Stata, Stata Press, Mata, and NetCourse are registered trademarks of StataCorp LLC.
Stata and Stata Press are registered trademarks with the World Intellectual Property Organization of the United Nations.
NetCourseNow is a trademark of StataCorp LLC.
LaTeX is a trademark of the American Mathematical Society.
Preface
More than half a century ago, a far-reaching revolution started in behavioral, educational, and social measurement, which to date has also had an enormous impact on a host of other disciplines ranging from biomedicine to marketing. At that time, item response theory (IRT) began finding its way into these sciences. In many respects, IRT quickly showed important benefits relative to the then-conventional approach for developing measuring instruments, which was based on classical procedures.
More than 60 years have passed since the 1950s and the influential early work by F. Lord in IRT, and they have been filled with major methodological advances in this field and, more generally, in behavioral and social measurement. The intervening decades have also witnessed an explosion of interest in IRT and item response modeling (IRM) across those disciplines as well as the clinical, biomedical, marketing, business, communication, and cognate sciences. These developments are also a convincing testament to the rich opportunities that this measurement approach offers to empirical scholars interested in assessing various latent constructs, traits, abilities, dimensions, or variables, as well as their interrelationships. The latent variables are only indirectly measurable, however, through their presumed manifestations in observed behavior. This is possible in particular through the use of multiple indicators or multi-item measuring instruments, which have become highly popular in the behavioral and social sciences and well beyond them.
This book has been conceptualized mainly as an introductory to intermediate level discussion of IRT and IRM. To aid in the presentation, the book uses the software package Stata. In addition to its recently developed IRT command, this package offers the many decisive benefits of general-purpose statistical analysis and modeling software. After discussing fundamental concepts and relationships of special relevance to IRT, we illustrate its applications in practical settings with Stata, using examples from the educational, behavioral, and social sciences. These examples can be readily translated, however, to similar uses of IRM in the clinical, biomedical, business, marketing, and related disciplines.
We find that several features set our book apart from others currently available in the IRT field. One is that, unlike a substantial number of treatments of IRT (in particular older ones), we capitalize on the diverse connections of this field to the comprehensive methodology of latent variable modeling as well as related applied statistics frameworks. In many respects, it would be fair to view this book as predominantly handling IRT and IRM, somewhat informally stated, as part of the latent variable modeling methodology. In particular, the discussion throughout the book benefits as often as possible from the conceptual relationships between IRT and factor analysis, specifically nonlinear factor analysis. Relatedly, whenever applicable, the important links between IRM and other statistical modeling approaches, such as the generalized linear model and especially logistic regression, are also pointed out. Another distinguishing feature of the book is that it is free of misconceptions about and incorrect treatments of classical test theory.
Last but not least, our book aims to provide a coherent discussion of IRT and IRM independently of software. The goal is thereby to highlight, as often as possible and in as much detail as deemed necessary, important concepts and relationships in IRT before moving on to its applications. This was necessary because, in our experience, many individuals seem to find some features of this modeling approach more difficult to deal with and use to their advantage than what may be seen as conventional applied statistical concepts and relationships. These features include in particular the inherent nonlinearity in the studied item-trait relationships as well as the resulting estimates (predictions) of individual trait levels and the measures of uncertainty associated with them. That difficulty in appreciating characteristic properties of IRT may arguably have resulted from insufficient discussion and clarification of them in some alternative accounts or presentations.
This book is aimed mainly at students and researchers with limited or no prior exposure to IRT. However, we are confident that it will also be of interest to more advanced students and scientists who are already familiar with IRT, in particular owing to the above-mentioned features in which the book does not overlap with the majority of others available in this field. In addition, a main goal was to enable readers subsequently to pursue more advanced studies of this comprehensive and complex methodological field and its applications in empirical research, as well as to follow more technically oriented literature on IRT and IRM. Relatedly, even though the book uses primarily examples stemming from the educational and behavioral sciences, their treatment, and that of this measurement field more generally, allows essentially straightforward applications of the methods and procedures used in other social science settings as well. These include the clinical, nursing, psychiatry, biomedicine, criminology, organizational, marketing, and business disciplines.