INTRODUCING ANOVA
AND ANCOVA
ISM: Introducing Statistical Methods
Series editor: Daniel B. Wright, University of Sussex
This series provides accessible but in-depth introductions to statistical methods that are not covered in any detail in standard introductory courses. The books are aimed both at the beginning researcher who needs to know how to use a particular technique and the established researcher who wishes to keep up to date with recent developments in statistical data analysis.
Editorial board
Glynis Breakwell, University of Surrey
Jan de Leeuw, University of California, Los Angeles
Colm O'Muircheartaigh, University of Chicago
Willem Saris, Universiteit van Amsterdam
Howard Schuman, University of Michigan
Karl van Meter, Centre National de la Recherche Scientifique, Paris
Other titles in this series
Introducing Multilevel Modeling
Ita Kreft and Jan de Leeuw
Introducing Social Networks
Alain Degenne and Michel Forsé
Introducing LISREL: A Guide for the Uninitiated
Adamantios Diamantopoulos and Judy Siguaw
INTRODUCING ANOVA
AND ANCOVA
A GLM APPROACH
ANDREW RUTHERFORD
© Andrew Rutherford 2001
First published 2001
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act, 1988, this publication may be reproduced, stored or transmitted in any form, or by any means, only with the prior permission in writing of the publishers, or in the case of reprographic reproduction, in accordance with the terms of licences issued by the Copyright Licensing Agency. Inquiries concerning reproduction outside those terms should be sent to the publishers.
SAGE Publications Ltd, 6 Bonhill Street, London EC2A 4PU
SAGE Publications Inc., 2455 Teller Road, Thousand Oaks, California 91320
SAGE Publications India Pvt Ltd, 32, M-Block Market, Greater Kailash - I, New Delhi 110 048
British Library Cataloguing in Publication data
A catalogue record for this book is available from the British Library
ISBN 0 7619 5160 1
ISBN 0 7619 5161 X (pbk)
Library of Congress catalog record available
Typeset by Keytec Typesetting Ltd
Printed in Great Britain by Athenaeum Press, Gateshead
To Patricia
1 AN INTRODUCTION TO GENERAL LINEAR MODELS: REGRESSION, ANALYSIS OF VARIANCE AND ANALYSIS OF COVARIANCE
1.1 Regression, analysis of variance and analysis of covariance
Regression and analysis of variance are probably the most frequently applied of all statistical analyses. Both are used extensively in many areas of research, including psychology, biology, medicine, education, sociology, anthropology, economics and political science, as well as in industry and commerce.
One reason for the frequency of regression and analysis of variance (ANOVA) applications is their suitability for many different types of study design. Although the analysis of data obtained from experiments is the focus of this text, both regression and ANOVA procedures are applicable to experimental, quasi-experimental and non-experimental data. Regression allows examination of the relationships between an unlimited number of predictor variables and a response or dependent variable, and enables values on one variable to be predicted from the values recorded on one or more other variables. Similarly, ANOVA places no restriction on the number of groups or conditions that may be compared, while factorial ANOVA allows examination of the influence of two or more independent variables or factors on a dependent variable. Another reason for the popularity of ANOVA is that it suits most effect conceptions by testing for differences between means.
Although the label analysis of covariance (ANCOVA) has been applied to a number of different statistical operations (Cox & McCullagh, 1982), it is most frequently used to refer to the statistical technique that combines regression and ANOVA. As the combination of these two techniques, ANCOVA calculations are more involved and time-consuming than those of either technique alone. Therefore, it is unsurprising that greater availability of computers and statistical software is associated with an increase in ANCOVA applications. Although Fisher (1932; 1935) originally developed ANCOVA to increase the precision of experimental analysis, to date it is applied most frequently in quasi-experimental research. Unlike experimental research, the topics investigated with quasi-experimental methods are most likely to involve variables that, for practical or ethical reasons, cannot be controlled directly. In these situations, the statistical control provided by ANCOVA has particular value. Nevertheless, in line with Fisher's original conception, many experiments can benefit from the application of ANCOVA.
1.2 A pocket history of regression, ANOVA and ANCOVA
Historically, regression and ANOVA developed in different research areas and addressed different questions. Regression emerged in biology and psychology towards the end of the 19th century, as scientists studied the correlation between people's attributes and characteristics. While studying the height of parents and their adult children, Galton (1886; 1888) noticed that although the children of short parents usually were shorter than average, they nevertheless tended to be taller than their parents. Galton described this phenomenon as "regression to the mean". As well as identifying a basis for predicting the values on one variable from values recorded on another, Galton appreciated that some relationships between variables would be closer than others. However, it was three other scientists, Edgeworth (e.g. 1886), Pearson (e.g. 1896) and Yule (e.g. 1907), applying work carried out about a century earlier by Gauss (or Legendre, see Plackett, 1972), who provided the account of regression in precise mathematical terms. (Also see Stigler, 1986, for a detailed account.)
Publishing under the pseudonym "Student", W.S. Gosset (1908) described the t-test to compare the means of two experimental conditions. However, as soon as there are more than two conditions in an experiment, more than one t-test is needed to compare all of the conditions, and when more than one t-test is applied the probability of making at least one Type 1 error increases. (A Type 1 error occurs when a true null hypothesis is rejected.) In contrast, ANOVA, conceived and described by Ronald A. Fisher (1924, 1932, 1935) to assist in the analysis of data obtained from agricultural experiments, is able to compare the means of any number of experimental conditions without any increase in Type 1 error. Fisher (1932) also described a form of ANCOVA that provided an approximate adjusted treatment sum of squares, before he described the exact adjusted treatment sum of squares (Fisher, 1935, and see Cox & McCullagh, 1982, for a brief history). In early recognition of his work, the F-distribution was named after him by G.W. Snedecor (1934).
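The inflation of Type 1 error with multiple t-tests can be illustrated with a little arithmetic. The sketch below is not from the original text: it assumes a per-test significance level of 0.05 and, as a simplification, treats the pairwise tests as independent (pairwise t-tests on the same data are not strictly independent, so the figures are approximations). The function name `familywise_error_rate` is hypothetical.

```python
from math import comb


def familywise_error_rate(n_conditions, alpha=0.05):
    """Return (number of pairwise t-tests, approximate familywise error rate).

    Assumes every null hypothesis is true and, as a simplification,
    that the tests are independent of one another.
    """
    n_tests = comb(n_conditions, 2)       # every pair of conditions is compared once
    fwer = 1 - (1 - alpha) ** n_tests     # P(at least one Type 1 error)
    return n_tests, fwer


if __name__ == "__main__":
    for k in (2, 3, 4, 5):
        n_tests, fwer = familywise_error_rate(k)
        print(f"{k} conditions: {n_tests} t-tests, "
              f"approximate familywise error rate = {fwer:.3f}")
```

Under these assumptions, the chance of at least one false rejection grows quickly with the number of conditions, which is the problem a single ANOVA F-test avoids.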
In the subsequent years, the techniques of regression and ANOVA were developed and applied in parallel by different groups of researchers investigating different research topics, using different research methodologies. Regression was applied most often to data obtained from correlational or non-experimental research and only regression analysis was regarded as trying to describe and predict dependent variable scores on the basis of a model constructed from the relations between predictor and dependent variables. In contrast, ANOVA was applied to experimental data beyond that obtained from agricultural experiments (Lovie, 1991), but it was still considered as just a way of determining whether the average scores of groups differed significantly. For many areas of psychology, where the interest (and so tradition) is to assess the average effect of different experimental conditions on groups of subjects in terms of a particular dependent variable, ANOVA was the ideal statistical technique. Consequently, separate analysis traditions evolved and encouraged the mistaken belief that regression and ANOVA constituted fundamentally different types of statistical analysis. Although ANCOVA illustrates the compatibility of regression and ANOVA, as a combination of two apparently discrete techniques employed by different researchers working on different topics, unsurprisingly, it remains a much less popular method that is frequently misunderstood (Huitema, 1980).
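That regression and ANOVA are not fundamentally different can be seen in the simplest case of two conditions. The sketch below is an illustration added here, not the author's own example: the scores are made up, the function names are hypothetical, and the regression uses a single predictor coded 0 for one condition and 1 for the other. With that coding, the fitted intercept is the mean of the condition coded 0, the slope is the difference between the two condition means, and the regression F-ratio matches the ANOVA F-ratio.

```python
from statistics import mean

# Hypothetical scores for two experimental conditions (made up for illustration).
group_a = [4.0, 6.0, 5.0, 7.0]
group_b = [8.0, 9.0, 7.0, 10.0]


def anova_f(groups):
    """One-way ANOVA F-ratio from between- and within-group sums of squares."""
    scores = [y for g in groups for y in g]
    grand_mean = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def regression_f(x, y):
    """Least-squares regression of y on one predictor x.

    Returns (slope, intercept, F-ratio for the regression).
    """
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    predicted = [intercept + slope * xi for xi in x]
    ss_regression = sum((p - y_bar) ** 2 for p in predicted)
    ss_residual = sum((yi - p) ** 2 for yi, p in zip(y, predicted))
    f_ratio = (ss_regression / 1) / (ss_residual / (len(y) - 2))
    return slope, intercept, f_ratio


# Code condition membership as a 0/1 predictor and compare the two analyses.
x = [0.0] * len(group_a) + [1.0] * len(group_b)
y = group_a + group_b
slope, intercept, f_reg = regression_f(x, y)
f_anova = anova_f([group_a, group_b])

if __name__ == "__main__":
    print("ANOVA F:     ", f_anova)
    print("Regression F:", f_reg)
    print("Intercept (mean of condition coded 0):", intercept)
    print("Slope (difference between condition means):", slope)
```

The two F-ratios agree because both analyses partition the same total variation in the scores into a part explained by condition membership and a residual part.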