eBook - ePub

Introducing Anova and Ancova

ISBN: 9781446235959

A GLM Approach

Andrew Rutherford

192 pages
English

About this book

Traditional approaches to ANOVA and ANCOVA are now being replaced by a General Linear Modeling (GLM) approach. This book begins with a brief history of the separate development of ANOVA and regression analyses and demonstrates how both analysis forms are subsumed by the General Linear Model. A simple single independent factor ANOVA is analysed first in conventional terms and then again in GLM terms to illustrate the two approaches.

The text then goes on to cover the main designs, both independent and related ANOVA and ANCOVA, single and multi-factor designs. The conventional statistical assumptions underlying ANOVA and ANCOVA are detailed and given expression in GLM terms.

Alternatives to traditional ANCOVA are also presented for circumstances in which certain assumptions have not been met. The book also covers other important issues in the use of these approaches, such as power analysis, optimal experimental designs, normality violations and robust methods, error rate and multiple comparison procedures, and the role of omnibus F-tests.

Information

Publisher

SAGE Publications Ltd

Year

2000

Print ISBN

9780761951612, 9780761951605

eBook ISBN

9781446235959

Topic

Social Sciences

Subtopic

Social Science Research & Methodology

1 AN INTRODUCTION TO GENERAL LINEAR MODELS: REGRESSION, ANALYSIS OF VARIANCE AND ANALYSIS OF COVARIANCE

1.1 Regression, analysis of variance and analysis of covariance

Regression and analysis of variance are probably the most frequently applied of all statistical analyses. Both are used extensively in many areas of research, including psychology, biology, medicine, education, sociology, anthropology, economics and political science, as well as in industry and commerce.

One reason for the frequency of regression and analysis of variance (ANOVA) applications is their suitability for many different types of study design. Although the analysis of data obtained from experiments is the focus of this text, both regression and ANOVA procedures are applicable to experimental, quasi-experimental and non-experimental data. Regression allows examination of the relationships between an unlimited number of predictor variables and a response or dependent variable, and enables values on one variable to be predicted from the values recorded on one or more other variables. Similarly, ANOVA places no restriction on the number of groups or conditions that may be compared, while factorial ANOVA allows examination of the influence of two or more independent variables or factors on a dependent variable. Another reason for the popularity of ANOVA is that it suits most effect conceptions by testing for differences between means.

Although the label analysis of covariance (ANCOVA) has been applied to a number of different statistical operations (Cox & McCullagh, 1982), it is most frequently used to refer to the statistical technique that combines regression and ANOVA. As the combination of these two techniques, ANCOVA calculations are more involved and time consuming than either technique alone. Therefore, it is unsurprising that greater availability of computers and statistical software is associated with an increase in ANCOVA applications. Although Fisher (1932; 1935) originally developed ANCOVA to increase the precision of experimental analysis, to date it is applied most frequently in quasi-experimental research. Unlike experimental research, the topics investigated with quasi-experimental methods are most likely to involve variables that, for practical or ethical reasons, cannot be controlled directly. In these situations, the statistical control provided by ANCOVA has particular value. Nevertheless, in line with Fisher’s original conception, many experiments can benefit from the application of ANCOVA.

1.2 A pocket history of regression, ANOVA and ANCOVA

Historically, regression and ANOVA developed in different research areas and addressed different questions. Regression emerged in biology and psychology towards the end of the 19th century, as scientists studied the correlation between people’s attributes and characteristics. Studying the heights of parents and their adult children, Galton (1886; 1888) noticed that although the children of short parents usually were shorter than average, they nevertheless tended to be taller than their parents. Galton described this phenomenon as “regression to the mean”. As well as identifying a basis for predicting the values on one variable from values recorded on another, Galton appreciated that some relationships between variables would be closer than others. However, it was three other scientists, Edgeworth (e.g. 1886), Pearson (e.g. 1896) and Yule (e.g. 1907), applying work carried out about a century earlier by Gauss (or Legendre, see Plackett, 1972), who provided the account of regression in precise mathematical terms. (Also see Stigler, 1986, for a detailed account.)

Publishing under the pseudonym “Student”, W.S. Gosset (1908) described the t-test to compare the means of two experimental conditions. However, as soon as there are more than two conditions in an experiment, more than one t-test is needed to compare all of the conditions and when more than one t-test is applied there is an increase in Type 1 error. (A Type 1 error occurs when a true null hypothesis is rejected.) In contrast, ANOVA, conceived and described by Ronald A. Fisher (1924, 1932, 1935) to assist in the analysis of data obtained from agricultural experiments, is able to compare the means of any number of experimental conditions without any increase in Type 1 error. Fisher (1932) also described a form of ANCOVA that provided an approximate adjusted treatment sum of squares, before he described the exact adjusted treatment sum of squares (Fisher, 1935, and see Cox & McCullagh, 1982, for a brief history). In early recognition of his work, the F-distribution was named after him by G.W. Snedecor (1934).
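The growth in Type 1 error with multiple t-tests can be illustrated with a short calculation. The sketch below is not taken from the text: it assumes a per-test significance level of 0.05 and, as a simplification, treats the pairwise tests as independent (pairwise comparisons that share groups are not strictly independent, so the figures are approximations only). The function name is hypothetical.

```python
from math import comb


def familywise_error_rate(n_conditions, alpha=0.05):
    """Return (number of pairwise t-tests, approximate familywise Type 1 error rate).

    Assumption: the tests are treated as independent, which is a
    simplification for pairwise comparisons that share groups.
    """
    n_tests = comb(n_conditions, 2)          # every pair of conditions is compared once
    rate = 1 - (1 - alpha) ** n_tests        # chance of at least one false rejection
    return n_tests, rate


for k in (2, 3, 4, 5):
    n_tests, rate = familywise_error_rate(k)
    print(f"{k} conditions: {n_tests} t-tests, familywise rate approx. {rate:.3f}")
```

Under these assumptions, three conditions already require three t-tests and the chance of at least one Type 1 error rises well above the nominal 0.05, which is the problem a single omnibus ANOVA avoids.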

In the subsequent years, the techniques of regression and ANOVA were developed and applied in parallel by different groups of researchers investigating different research topics, using different research methodologies. Regression was applied most often to data obtained from correlational or non-experimental research and only regression analysis was regarded as trying to describe and predict dependent variable scores on the basis of a model constructed from the relations between predictor and dependent variables. In contrast, ANOVA was applied to experimental data beyond that obtained from agricultural experiments (Lovie, 1991), but still it was considered as just a way of determining whether the average scores of groups differed significantly. For many areas of psychology, where the interest (and so tradition) is to assess the average effect of different experimental conditions on groups of subjects in terms of a particular dependent variable, ANOVA was the ideal statistical technique. Consequently, separate analysis traditions evolved and encouraged the mistaken belief that regression and ANOVA constituted fundamentally different types of statistical analysis. Although ANCOVA illustrates the compatibility of regression and ANOVA, as a combination of two apparently discrete techniques employed by different researchers working on different topics, unsurprisingly, it remains a much less popular method that is frequently misunderstood (Huitema, 1980).

1.3 An outline of general linear models (GLMs)

Computers, initially mainframe but increasingly PCs, have had considerable consequence for statistical analysis, both in terms of conception and implementation. From the 1980s, some of these changes began to filter through to affect the way data is analysed in the behavioural sciences. Indeed currently, descriptions of regression, ANOVA and ANCOVA found in psychology texts are in a state of flux, as alternative characterizations based on the general linear model are presented by more and more authors (e.g. Cohen & Cohen, 1983; Hays, 1994; Judd & McClelland, 1989; Keppel & Zedeck, 1989; Kirk, 1982, 1995; Maxwell & Delaney, 1990; Pedhazur, 1997; Winer, Brown & Michels, 1991).

One advantage afforded by computer based analyses is the easy use of matrix algebra. Matrix algebra offers an elegant and succinct statistical notation. Unfortunately however, human matrix algebra calculations, particularly those involving larger matrices, are not only very hard work, but also tend to be error prone. In contrast, computer implementations of matrix algebra are not only error free, but also computationally efficient. Therefore, most computer based statistical analyses employ matrix algebra calculations, but the program output usually is designed to accord with the expectations set by traditional (scalar algebra-variance partitioning) calculations.

When regression, ANOVA and ANCOVA are expressed in matrix algebra terms, a commonality is evident. Indeed, the same matrix algebra equation is able to summarize all three of these analyses. As regression, ANOVA and ANCOVA can be described in an identical manner, clearly they follow a common pattern. This common pattern is the GLM conception. Unfortunately, the ability of the same matrix algebra equation to describe regression, ANOVA and ANCOVA has resulted in the inaccurate identification of the matrix algebra equation as the GLM. However, just as a particular language provides a means of expressing an idea, so matrix algebra provides only one notation for expressing the GLM.

The GLM conception is that data may be accommodated in terms of a model plus some error, as illustrated below:

data = model + error

The model in this equation is a representation of our understanding or hypotheses about the data. The error component is an explicit recognition that there are other influences on the data. These influences are presumed to be unique for each subject in each experimental condition and include anything and everything not controlled in the experiment, such as chance fluctuations in behaviour. Moreover, the relative size of the model and error components is used to judge how well the model accommodates the data.
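The idea that each score is a model component plus an error component can be sketched numerically. The example below is an illustration only: the scores are invented, and taking each condition mean as the model's prediction is an assumption made for the sketch (it anticipates the ANOVA case described later).

```python
# Hypothetical scores from two experimental conditions (invented for illustration).
scores = {"A": [4.0, 6.0, 8.0], "B": [9.0, 11.0, 13.0]}

all_scores = [y for ys in scores.values() for y in ys]
grand_mean = sum(all_scores) / len(all_scores)

ss_model = 0.0  # variation accommodated by the model (condition means)
ss_error = 0.0  # variation the model leaves unexplained
for condition, ys in scores.items():
    model = sum(ys) / len(ys)                  # assumed model: the condition mean
    for y in ys:
        error = y - model                      # so that data = model + error
        ss_model += (model - grand_mean) ** 2
        ss_error += error ** 2
        print(f"{condition}: data {y:5.1f} = model {model:5.1f} + error {error:+5.1f}")

ss_total = sum((y - grand_mean) ** 2 for y in all_scores)
print(f"SS model = {ss_model:.1f}, SS error = {ss_error:.1f}, SS total = {ss_total:.1f}")
```

The model and error sums of squares add up to the total, so their relative size gives one simple indication of how well the model accommodates the data.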

The model part of the GLM equation constitutes our understanding or hypotheses about the data and is expressed in terms of a set of variables recorded, like the data, as part of the study. As will be described, the tradition in data analysis is to use regression, ANOVA and ANCOVA GLMs to express different types of ideas about how data arises.

1.3.1 Regression analysis

Regression analysis attempts to explain data (the dependent variable scores) in terms of a set of independent variables or predictors (the model) and a residual component (error). Typically, a researcher who applies regression is interested in predicting a quantitative dependent variable from one or more quantitative independent variables, and in determining the relative contribution of each independent variable to the prediction: there is interest in what proportion of the variation in the dependent variable can be attributed to variation in the independent variable(s). Regression also may employ categorical (also known as nominal or qualitative) predictors: the use of independent variables such as sex, marital status and type of teaching method is common. Moreover, as regression is the elementary form of GLM, it is possible to construct regression GLMs equivalent to any ANOVA and ANCOVA GLMs by selecting and organizing quantitative variables to act as categorical variables (see Chapter 2). Nevertheless, the convention of referring to these particular quantitative variables as categorical variables will be maintained.
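As a minimal sketch of how a quantitative variable can act as a categorical predictor, the example below codes two teaching methods as 0 and 1 and fits an ordinary least-squares line. The data, the 0/1 coding and the function name are assumptions made for illustration; other coding schemes are possible.

```python
def simple_regression(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: teaching method coded 0 (method A) or 1 (method B).
method = [0, 0, 0, 1, 1, 1]
score = [4.0, 6.0, 8.0, 9.0, 11.0, 13.0]

intercept, slope = simple_regression(method, score)
mean_a = sum(score[:3]) / 3
mean_b = sum(score[3:]) / 3
print(f"intercept = {intercept:.2f} (mean of method A = {mean_a:.2f})")
print(f"slope = {slope:.2f} (mean B minus mean A = {mean_b - mean_a:.2f})")
```

With this coding the intercept reproduces the mean of the group coded 0 and the slope reproduces the difference between the two group means, so the regression carries the same information as a comparison of condition means.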

1.3.2 Analysis of variance

ANOVA also can be thought of in terms of a model plus error. Here, the dependent variable scores constitute the data, the experimental conditions constitute the model and the component of the data not accommodated by the model, again, is represented by the error term. Typically, the researcher applying ANOVA is interested in whether the mean dependent variable scores obtained in the experimental conditions differ significantly. This is achieved by determining how much variation in the dependent variable scores is attributable to differences between the scores obtained in the experimental conditions, and comparing this with the error term, which is attributable to variation in the dependent variable scores within each of the experimental conditions: there is interest in what proportion of variation in the dependent variable can be attributed to the manipulation of the experimental variable(s). Although the dependent variable in ANOVA is most likely to be measured on a quantitative scale, the statistical comparison is drawn between the groups of subjects receiving different experimental conditions and is categorical in nature, even when the experimental conditions differ along a quantitative scale. Therefore, ANOVA is a particular type of regression analysis that employs quantitative predictors to act as categorical predictors.
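The comparison of between-condition variation with within-condition (error) variation can be sketched as an F ratio for a single-factor independent measures design. The data and function name below are hypothetical, and the sketch stops at the F ratio and its degrees of freedom without looking up a p-value.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for independent groups of scores."""
    all_scores = [y for g in groups for y in g]
    grand_mean = sum(all_scores) / len(all_scores)

    # Variation attributable to differences between the condition means.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of scores around their own condition mean (the error term).
    ss_within = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores from three experimental conditions.
groups = [[3.0, 5.0, 7.0], [6.0, 8.0, 10.0], [9.0, 11.0, 13.0]]
f_ratio, df_between, df_within = one_way_anova(groups)
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

A large F indicates that the variation between condition means is large relative to the variation within conditions.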

1.3.3 Analysis of covariance

As ANCOVA is the statistical technique that combines regression and ANOVA, it too can be described in terms of a model plus error. As in regression and ANOVA, the dependent variable scores constitute the data, but the model includes not only experimental conditions, but also one or more quantitative predictor variables. These quantitative predictors, known as covariates (also concomitant or control variables), represent sources of variance that are thought to influence the dependent variable, but have not been controlled by the experimental procedures. ANCOVA determines the covariation (correlation) between the covariate(s) and the dependent variable and then removes that variance associated with the covariate(s) from the dependent variable scores, prior to determining whether the differences between the experimental condition (dependent variable score) means are significant. As mentioned, this technique, in which the influence of the experimental conditions remains the major concern, but one or more quantitative variables that predict the dependent variable also are included in the GLM, is labelled ANCOVA most frequently, and in psychology is labelled ANCOVA exclusively (e.g. Cohen & Cohen, 1983; Pedhazur, 1997, cf. Cox & McCullagh, 1982). A very important, but seldom emphasized, aspect of the ANCOVA method is that the relationship between the covariate(s) and the dependent variable, upon which the adjustments depend, is determined empirically from the data.
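The adjustment for a covariate can be sketched as follows. The example estimates a single pooled within-condition regression slope from the data and uses it to adjust each condition mean to the grand covariate mean. The data and function name are invented, and the sketch assumes one covariate and the same slope in every condition; it does not carry out the significance test on the adjusted means.

```python
def adjusted_means(groups):
    """Return (pooled within-condition slope, covariate-adjusted condition means).

    `groups` maps a condition label to (covariate values, dependent variable scores).
    Assumption: the covariate slope is the same in every condition.
    """
    def mean(values):
        return sum(values) / len(values)

    sxy = sxx = 0.0
    for x, y in groups.values():
        mx, my = mean(x), mean(y)
        sxy += sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        sxx += sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx                      # determined empirically from the data

    grand_x = mean([xi for x, _ in groups.values() for xi in x])
    adjusted = {
        label: mean(y) - slope * (mean(x) - grand_x)
        for label, (x, y) in groups.items()
    }
    return slope, adjusted


# Hypothetical data: (covariate, dependent variable) for two conditions.
data = {
    "A": ([1.0, 2.0, 3.0], [4.0, 6.0, 8.0]),
    "B": ([3.0, 4.0, 5.0], [9.0, 11.0, 13.0]),
}
slope, adjusted = adjusted_means(data)
print(f"pooled slope = {slope:.2f}")
for label, value in adjusted.items():
    print(f"condition {label}: adjusted mean = {value:.2f}")
```

In this invented example the unadjusted condition means differ by 5 points, but most of that difference is associated with the covariate, so the adjusted means differ by only 1 point.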

1.4 The “general” in GLM

The term “general” in GLM simply refers to the ability to accommodate both variables that represent quantitative distinctions on continuous measures, as in regression analysis, and variables that represent categorical distinctions between experimental conditions, as in ANOVA. This feature is emphasized in ANCOVA, where variables representing both quantitative and categorical distinctions are employed in the same GLM.

Traditionally, the label linear modelling was applied exclusively to regression analyses. However, as regression, ANOVA and ANCOVA are ...

Contents
1 AN INTRODUCTION TO GENERAL LINEAR MODELS: REGRESSION, ANALYSIS OF VARIANCE AND ANALYSIS OF COVARIANCE
2 TRADITIONAL AND GLM APPROACHES TO INDEPENDENT MEASURES SINGLE FACTOR ANOVA DESIGNS
3 GLM APPROACHES TO INDEPENDENT MEASURES FACTORIAL ANOVA DESIGNS
4 GLM APPROACHES TO REPEATED MEASURES DESIGNS
5 GLM APPROACHES TO FACTORIAL REPEATED MEASURES DESIGNS
6 THE GLM APPROACH TO ANCOVA
7 ASSUMPTIONS UNDERLYING ANOVA, TRADITIONAL ANCOVA AND GLMS
8 SOME ALTERNATIVES TO TRADITIONAL ANCOVA
9 FURTHER ISSUES IN ANOVA AND ANCOVA
REFERENCES
INDEX